DMD Source Archive - Why?

Wed Apr 10 03:47:30 UTC 2024

On 4/9/2024 4:42 PM, Steven Schveighoffer wrote:
> I will also bet that any difference in compile time will be extremely 
> insignificant. I don't bet against decades of filesystem read optimizations. 
> Saving e.g. microseconds on a 1.5 second build isn't going to move the needle.

In my timing of compiling hello world, a 1.412s build becomes 1.375s, about 37 
milliseconds faster. Most of the savings appear to come from the first access to 
the archive, when its table of contents is loaded into the path cache and file 
cache that you developed. After that, no stats are done on the filesystem.
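
To illustrate the idea (this is not dmd's code, only a minimal sketch with a made-up `PathCache` class and made-up file names): once the archive's table of contents has seeded the cache, an existence check for any file in the archive is answered from memory, and only paths outside the archive fall back to the filesystem.

```python
import os


class PathCache:
    """Hypothetical stand-in for a compiler's path cache."""

    def __init__(self):
        self.known = set()    # paths already known to exist
        self.stat_calls = 0   # how often we had to ask the filesystem

    def add_archive_toc(self, names):
        # Seed the cache from an archive's table of contents in one step.
        self.known.update(names)

    def exists(self, path):
        if path in self.known:
            return True       # answered from the cache, no stat needed
        self.stat_calls += 1
        return os.path.exists(path)  # fall back to a filesystem query


cache = PathCache()
cache.add_archive_toc(["std/stdio.d", "std/string.d", "core/memory.d"])
found = [cache.exists(p) for p in ("std/stdio.d", "core/memory.d")]
```

Here `cache.stat_calls` stays at zero for both lookups, because every name came from the table of contents.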

> I did reduce stats semi-recently for DMD and saved a significant percentage of 
> stats, I don't really think it saved insane amounts of time. It was more of a 
> "oh, I thought of a better way to do this". I think at the time, there was some 
> resistance to adding more stats to the compiler due to the same misguided 
> optimization beliefs, and so I started looking at it. If reducing stats by 90% 
> wasn't significant, reducing them again likely isn't going to be noticed.
> 
> See https://github.com/dlang/dmd/pull/14582

Nice. I extended it so files in an archive are tracked.

> The only benefit I might see in this is to *manage* the source as one item.

The convenience of being able to distribute a "header only" library as one file 
may be significant. I've always liked things that didn't need an installation 
program. An install should be "copy the file onto your system" and uninstall 
should be "delete the file"!

Back in the days of CD software, my compiler was set up so no install was 
necessary, just put the CD in the drive and run it. You didn't even have to set 
the environment variables, as the compiler would look for its files relative to 
where the executable file was (argv[0]). You can see vestiges of that still in 
today's dmd.

Of course, to get it to run faster you'd XCOPY it onto the hard drive. Though 
some users were flummoxed by the absence of INSTALL.EXE and I'd have to explain 
how to use XCOPY.

> But 
> I don't really know that we need a new custom format. `tar` is pretty simple. 
> ARSD has a tar implementation that I lifted for my raylib-d installer which 
> allows reading tar files with about [100 lines of 
> code](https://github.com/schveiguy/raylib-d/blob/9906279494f1f83b2c4c9550779d46962af7c342/install/source/app.d#L22-L132).

Thanks for the code.

A tar file is serial, meaning one has to read the entire file to see what is 
in it (because it was designed for tape systems where data is simply appended).

The tar file doesn't have a table of contents. In the ustar header, the filename 
field is limited to 100 characters and the path prefix field to 155 characters.
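
A minimal sketch of what "serial" means in practice, assuming a plain ustar-format file (the sample member names are made up). To list the names, the reader has to hop from header to header across the whole file, using each header's size field to find the next one. The fixed 100-byte name field and 155-byte prefix field are visible in the slicing.

```python
import io
import tarfile

BLOCK = 512  # tar stores headers and data in 512-byte blocks


def list_tar_names(data):
    """Walk a ustar image header by header and return the member names."""
    names = []
    offset = 0
    while offset + BLOCK <= len(data):
        header = data[offset:offset + BLOCK]
        if header == bytes(BLOCK):  # an all-zero block marks the end
            break
        name = header[0:100].rstrip(b"\0").decode()      # 100-byte name field
        size_text = header[124:136].rstrip(b"\0 ").decode()
        size = int(size_text or "0", 8)                  # size is octal text
        prefix = header[345:500].rstrip(b"\0").decode()  # 155-byte prefix field
        names.append(prefix + "/" + name if prefix else name)
        # Skip the member's data, rounded up to whole blocks.
        offset += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK
    return names


# Build a small ustar archive in memory to walk over.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    for member, body in (("a.d", b"module a;"), ("pkg/b.d", b"module pkg.b;" * 100)):
        info = tarfile.TarInfo(member)
        info.size = len(body)
        tar.addfile(info, io.BytesIO(body))

names = list_tar_names(buf.getvalue())
```

There is no way to learn the last member's name without first locating every header before it.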

Sar files have a table of contents at the beginning, and unlimited filespec sizes.

P.S. the code that actually reads the .sar file is about 20 lines! (Excluding 
checking for corrupt files, and the header structure definition.) The archive 
reader and writer can be encapsulated in a separate module, so anyone can 
replace it with a different format.
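
For comparison, here is a sketch of a table-of-contents-first archive. The layout below (the `TOC1` magic, the field widths, the function names) is invented for illustration and is not the actual .sar format. It only shows why the reader can be short: the names, offsets and sizes all sit at the front, and name lengths are stored explicitly, so they are not capped by a fixed-width field.

```python
import struct

MAGIC = b"TOC1"  # made-up magic number, not the real .sar one


def write_archive(files):
    """Pack {name: bytes} with the table of contents placed first."""
    entries = [(name.encode(), data) for name, data in files.items()]
    # 8-byte header, then per entry: u32 name length, name, u64 offset, u64 size
    toc_size = 8 + sum(4 + len(n) + 16 for n, _ in entries)
    parts = [MAGIC, struct.pack("<I", len(entries))]
    offset = toc_size
    for n, d in entries:
        parts.append(struct.pack("<I", len(n)) + n + struct.pack("<QQ", offset, len(d)))
        offset += len(d)
    return b"".join(parts) + b"".join(d for _, d in entries)


def read_toc(blob):
    """Return {name: (offset, size)} by reading only the front of the blob."""
    if blob[:4] != MAGIC:
        raise ValueError("not an archive")
    (count,) = struct.unpack_from("<I", blob, 4)
    pos, toc = 8, {}
    for _ in range(count):
        (nlen,) = struct.unpack_from("<I", blob, pos)
        name = blob[pos + 4:pos + 4 + nlen].decode()
        toc[name] = struct.unpack_from("<QQ", blob, pos + 4 + nlen)
        pos += 4 + nlen + 16
    return toc


def read_file(blob, toc, name):
    """Fetch one member directly through its table-of-contents entry."""
    offset, size = toc[name]
    return blob[offset:offset + size]


long_name = "very/long/" * 30 + "b.d"  # longer than tar's fixed name fields
blob = write_archive({"a.d": b"module a;", long_name: b"module b;"})
toc = read_toc(blob)
```

Keeping `write_archive`, `read_toc` and `read_file` behind one small module boundary is what would let a different format be swapped in.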