Adding ccache-like output caching to dmd

John Colvin john.loughran.colvin at gmail.com
Tue Dec 29 16:43:33 UTC 2020


On Tuesday, 29 December 2020 at 12:49:45 UTC, Stefan Koch wrote:
> On Monday, 28 December 2020 at 23:14:02 UTC, Per Nordlöw wrote:
>> Has anyone considered integrating into a `dmd` a ccache-like 
>> caching of output files indexed by digests based on
>>
>> - environment variables,
>> - process arguments which, in turn, decide
>> - input file contents (including import files detected upon 
>> first uncached compile)
>> - dmd compiler binary fingerprint
>> - ...probably something more I missed
>>
>> Initial call stores that list alongside content hash and 
>> resulting binary(s).
>>
>> If not, would anyone have any strong objections against adding 
>> this?
>
> The issue is that because of string imports you don't know the 
> full set of files you are depending on.
> which means any change can cause any file to be required.

In general it's unknown what files a given D build depends on 
until after the build has (mostly) happened. This is true for 
string imports, but also for regular imports.

Conceptually we split the inputs into:

Y: inputs knowable only after compilation is done (the contents 
of all imported files, whether string imports or code imports)
X: inputs known ahead of time (e.g. the command line flags to 
DMD).

Object files are O.

The set of names of the files whose contents make up Y is 
referred to as S.

The compiler is then a pure function F(X, Y) -> O.

A real compiler invocation is C(X, [Y]) -> O, where [Y] means Y 
is implicit: the compiler discovers it as it goes.

But the compiler can give us S, so we can instead say the 
compiler is C(X, [Y]) -> (O, S).

The only way S will change is if X or Y changes.


It (roughly :-p ) follows that we can build a persistent nested 
map Hash(X) -> ((S, Hash(Y)) -> O).

We calculate Hash(X) before compiling and look it up in the outer 
map to get the stored (S, Hash(Y)) entries. If there are none, 
recompile and store a new entry in the outer map. If there are, 
then for each entry read all the files in S and use them to 
calculate Hash(Y)'. If Hash(Y)' == Hash(Y), proceed to get O; 
otherwise recompile and store a new entry in the inner map.
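
For what it's worth, here is a rough Python sketch of that lookup. 
Everything in it is an assumption made for illustration, not how dmd 
or ccache actually works: the names hash_x, hash_y and cached_compile 
are made up, X is reduced to just the command line flags (environment 
variables and the compiler fingerprint would go in there too), the 
"persistent" map is an in-memory dict, and compile_fn is a stand-in 
for a compiler invocation that returns (O, S).

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical stand-in for C(X, [Y]) -> (O, S): takes the flags,
# returns the object bytes and the list of files that were read.
Compile = Callable[[List[str]], Tuple[bytes, List[str]]]

# Outer key: Hash(X). Inner entries: (S, Hash(Y), O).
Cache = Dict[str, List[Tuple[Tuple[str, ...], str, bytes]]]


def hash_x(flags: Iterable[str]) -> str:
    """Hash(X): digest of the inputs known ahead of time (flags only here)."""
    h = hashlib.sha256()
    for flag in flags:
        h.update(flag.encode() + b"\0")
    return h.hexdigest()


def hash_y(files: Iterable[str]) -> str:
    """Hash(Y): digest of the current contents of every file named in S."""
    h = hashlib.sha256()
    for name in sorted(files):
        path = Path(name)
        h.update(name.encode() + b"\0")
        # A file that has disappeared still changes the digest.
        h.update(path.read_bytes() if path.exists() else b"<missing>")
        h.update(b"\0")
    return h.hexdigest()


def cached_compile(cache: Cache, flags: List[str], compile_fn: Compile) -> bytes:
    entries = cache.setdefault(hash_x(flags), [])
    for s, stored_hash_y, obj in entries:
        # Re-read the files in S and compare Hash(Y)' against Hash(Y).
        if hash_y(s) == stored_hash_y:
            return obj  # cache hit: reuse O
    # No entry for Hash(X), or no inner entry matched: recompile and store.
    obj, s = compile_fn(flags)
    entries.append((tuple(sorted(s)), hash_y(s), obj))
    return obj
```

Note the sketch hashes the files in S after the compile has finished, 
so a file edited mid-build would be recorded with the wrong digest. 
That's part of the fiddly bit.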

Or something like that, you get the idea... It's not intractable, 
it's just a bit fiddly.

