What is the compilation model of D?

Jacob Carlborg doob at me.com
Thu Jul 26 00:28:21 PDT 2012


On 2012-07-25 21:54, David Piepgrass wrote:
> Thanks for the very good description, Nick! So if I understand
> correctly, if
>
> 1. I use an "auto" return value or suchlike in a module Y.d
> 2. module X.d calls this function
> 3. I call "dmd -c X.d" and "dmd -c Y.d" as separate steps
>
> Then the compiler will have to fully parse Y twice and fully analyze the
> Y function twice, although it generates object code for the function
> only once. Right? I wonder how smart it is about not analyzing things it
> does not need to analyze (e.g. when Y is a big module but X only calls
> one function from it - the compiler has to parse Y fully but it should
> avoid most of the semantic analysis.)

Yes, I think that's correct. But if you give the compiler all the source
code at once, it should need to parse a given module only once. D
doesn't use textual includes like C/C++ does; an import just makes the
symbols of the other module available by name (or something like that).

> What about templates? In C++ it is a problem that the compiler will
> instantiate templates repeatedly, say if I use vector<string> in 20
> source files, the compiler will generate and store 20 copies of
> vector<string> (plus 20 copies of basic_string<char>, too) in object files.
>
> 1. So in D, if I compile the 20 sources separately, does the same thing
> happen (same collection template instantiated 20 times with all 20
> copies stored)?

If you compile them separately I think so, yes. How would it otherwise 
work, store some info between compile runs?

> 2. If I compile the 20 sources all together, I guess the template would
> be instantiated just once, but then which .obj file does the
> instantiated template go in?

I think it only needs to instantiate it once. Whether it actually does
that, I don't know. Which object file it ends up in is probably
unspecified. Although if you compile with the -lib flag it will output
the templates to all object files. This is one of the problems making it
hard to create an incremental build system for D.


> I figure as CTFE is used more, especially when it is used to decide
> which template overloads are valid or how a mixin will behave, this will
> slow down the compiler more and more, thus making incremental builds
> more important. A typical example would be a compile-time
> parser-generator, or compiled regexes.

I think that's correct. I did some simple benchmarking comparing
different uses of string mixins in Derelict. It turns out that it's a
lot better to have a few string mixins containing a lot of code than
many string mixins containing very little code. I suspect other
metaprogramming features (CTFE, templates, static if, mixins) could
behave in a similar way.
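
To make the two styles concrete, here is a minimal sketch. It is not the Derelict code or the benchmark itself; `declare` and `declareAll` are hypothetical helpers made up for illustration, and the example only shows the shape of the code, not any timings.

```d
// Hypothetical helper: builds the source text for one function declaration.
string declare(string name)
{
    return "int " ~ name ~ "() { return 1; }\n";
}

// Style 1: many small string mixins, one per declaration.
mixin(declare("a1"));
mixin(declare("a2"));
mixin(declare("a3"));

// Hypothetical helper: concatenates several declarations via CTFE
// so they can be handed to the compiler in one go.
string declareAll(string[] names)
{
    string code;
    foreach (name; names)
        code ~= declare(name);
    return code;
}

// Style 2: a single string mixin containing all the generated code.
mixin(declareAll(["b1", "b2", "b3"]));

void main()
{
    // Both styles end up declaring ordinary functions.
    assert(a1() + a2() + a3() == 3);
    assert(b1() + b2() + b3() == 3);
}
```

Both styles produce equivalent declarations; the difference is only in how many separate mixins the compiler has to process.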

> Plus, I've heard some people complaining that the compiler uses over 1
> GB RAM, and splitting up compilation into parts might help with that.

Yeah, I just ran into a compiler bug (I haven't been able to create a
simple test case) where it consumed around 3.5 GB of memory and then
just crashed after a while.

> BTW, I think I heard the compiler uses multithreading to speed up the
> build, is that right?

Yes, I'm pretty sure it reads all (or many) of the files concurrently or
in parallel. It could probably lex and parse in parallel as well, but I
don't know if it does that.


> Anyway, I can't even figure out how to enumerate the members of a module
> A; __traits(allMembers, A) causes "Error: import Y has no members".

Currently there's a bug which forces you to put the module in a package, 
try:

module foo.A;

// Prints the member names of foo.A at compile time
pragma(msg, __traits(allMembers, foo.A));

-- 
/Jacob Carlborg

