What is the compilation model of D?

David Piepgrass qwertie256 at gmail.com
Wed Jul 25 12:54:29 PDT 2012


Thanks for the very good description, Nick! So if I understand 
correctly, if

1. I use an "auto" return value or suchlike in a module Y.d
2. module X.d calls this function
3. I call "dmd -c X.d" and "dmd -c Y.d" as separate steps

Then the compiler will have to fully parse Y twice and fully 
analyze the Y function twice, although it generates object code 
for the function only once. Right? I wonder how smart it is about 
not analyzing things it does not need to analyze (e.g. when Y is 
a big module but X only calls one function from it - the compiler 
has to parse Y fully but it should avoid most of the semantic 
analysis.)
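
To make the scenario concrete, here is a minimal sketch of why a caller cannot know an "auto" return type without the callee's body being analyzed. The function name makePair is hypothetical; in the scenario above it would live in Y.d and the caller in X.d, but both are folded into one file here so the example stands alone.

```d
import std.typecons : tuple;

// Hypothetical stand-in for a function defined in Y.d.
// Nothing in the signature states the return type; it is only
// known once the body has gone through semantic analysis.
auto makePair(int a, int b)
{
    return tuple(a, b * 2.5);
}

// Hypothetical stand-in for the caller in X.d.
void main()
{
    auto p = makePair(1, 2);

    // These checks can only pass if the compiler has analyzed the
    // body of makePair while compiling the caller.
    static assert(is(typeof(p[0]) == int));
    static assert(is(typeof(p[1]) == double));

    assert(p[0] == 1);
    assert(p[1] == 5.0);
}
```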

What about templates? In C++ it is a problem that the compiler 
instantiates templates repeatedly: if I use vector<string> in 20 
source files, the compiler will generate and store 20 copies of 
vector<string> (plus 20 copies of basic_string<char>, too) in 
object files.

1. So in D, if I compile the 20 sources separately, does the same 
thing happen (same collection template instantiated 20 times with 
all 20 copies stored)?
2. If I compile the 20 sources all together, I guess the template 
would be instantiated just once, but then which .obj file does 
the instantiated template go in?
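
As a small illustration of what "the same instantiation" means at the language level, the sketch below instantiates one template twice with the same argument and checks that both names refer to one type with one mangled name. Stack, FromFileA and FromFileB are hypothetical names. This does not show where the object code ends up or how many copies a given toolchain stores; it only shows that the two uses are indistinguishable to the compiler.

```d
// Hypothetical collection template standing in for vector<string>.
struct Stack(T)
{
    T[] items;

    void push(T x) { items ~= x; }

    T pop()
    {
        T top = items[$ - 1];
        items = items[0 .. $ - 1];
        return top;
    }
}

// Two uses of the same instantiation, as if written in two source files.
alias FromFileA = Stack!string;
alias FromFileB = Stack!string;

// Same type and same mangled name: there is only one Stack!string.
static assert(is(FromFileA == FromFileB));
static assert(FromFileA.mangleof == FromFileB.mangleof);

void main()
{
    FromFileA s;
    s.push("x");
    FromFileB t = s; // same type, so this is a plain copy
    assert(t.pop() == "x");
}
```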

> $rdmd --build-only (any other flags) main.d
>
> Then, RDMD will figure out *all* of the source files needed (using
> the full compiler's frontend, so it never gets fooled into missing
> anything), and if any of them have been changed, it will automatically
> pass them *all* into DMD for you. This way, you don't have to
> manually keep track of all your files and pass them all into
> DMD yourself. Just give RDMD your main file and that's it, you're
> golden.
>
> Side note: Another little trick with RDMD: Omit the 
> --build-only and it will compile AND then run your program:

> Yes. (Unless you never import anything from Phobos...I think.) But
> it's very, very fast to parse. Lightning-speed if you compare it to
> C++.

I don't even want to legitimize C++ compiler speed by comparing 
it to any other language ;)

>> - Is there any concept of an incremental build?
>
> Yes, but there's a few "gotcha"s:
>
> 1. D compiles so damn fast that it's not nearly as much of an issue as
> it is with C++ (which is notoriously ultra-slow compared
> to...everything, hence the monumental importance of C++'s incremental
> builds).

I figure as CTFE is used more, especially when it is used to 
decide which template overloads are valid or how a mixin will 
behave, this will slow down the compiler more and more, thus 
making incremental builds more important. A typical example would 
be a compile-time parser-generator, or compiled regexes.
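
A minimal sketch of the two CTFE uses mentioned above: a function evaluated at compile time to pick between template overloads, and another whose result is fed to a string mixin. All names here (isPowerOfTwo, describe, makeGetter, answer) are hypothetical, and the work done is trivial; a parser generator or regex compiler would run far more code through the same mechanism.

```d
// Ordinary function; the compiler runs it at compile time when a
// constant expression is required (CTFE).
bool isPowerOfTwo(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

// CTFE result decides which overload is valid for a given n.
string describe(size_t n)() if (isPowerOfTwo(n))
{
    return "power of two";
}

string describe(size_t n)() if (!isPowerOfTwo(n))
{
    return "something else";
}

// CTFE result decides what code a mixin produces.
string makeGetter(string name)
{
    return "int " ~ name ~ "() { return 42; }";
}

mixin(makeGetter("answer")); // declares: int answer() { return 42; }

void main()
{
    static assert(describe!8() == "power of two");
    static assert(describe!6() == "something else");
    assert(answer() == 42);
}
```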

Plus, I've heard some people complaining that the compiler uses 
over 1 GB RAM, and splitting up compilation into parts might help 
with that.

BTW, I think I heard the compiler uses multithreading to speed up 
the build, is that right?

> It keeps diving deeper and deeper to find anything it can "start" with.
> Once it finds that, it'll just build everything back up in whatever
> order is necessary.

I hope someone can give more details about this.

>> - In light of the above (that the meaning of D code can be 
>> interdependent with other D code, plus the presence of mixins 
>> and all that), what are the limitations of 
>> __traits(allMembers...) and other compile-time reflection 
>> operations, and what kind of problems might a user expect to 
>> encounter?
>
> Shouldn't really be an issue. Such things won't get evaluated until the
> types/identifiers involved are *fully* analyzed (or at least to the
> extent that they need to be analyzed). So the results of things like
> __traits(allMembers...) should *never* change during compilation, or
> when changing the order of files or imports (unless there's some
> compiler bug). Any situation that *would* result in any such ambiguity
> will get flagged as an error in your code.

Hmm. Well, I couldn't find an obvious example... for example, you 
are right, this doesn't work, although the compiler annoyingly 
doesn't give a reason:

struct OhCrap {
	void a() {}
	// main.d(72): Error: error evaluating static if expression
	//             (what error? syntax error? type error? c'mon...)
	static if ([ __traits(allMembers, OhCrap) ].length > 1) {
		auto b() { return 2; }
	}
	void c() {}
}

But won't this be a problem when it comes time to produce 
run-time reflection information? I mean, when module A asks to 
create run-time reflection information for all the functions and 
types in module A.... er, I naively thought the information would 
be created as a set of types and functions *in module A*, which 
would then change the set of allMembers of A. But, maybe it makes 
more sense to create that stuff in a different module (which A 
could then import??)

Anyway, I can't even figure out how to enumerate the members of a 
module A; __traits(allMembers, A) causes "Error: import Y has no 
members".
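
For reference, a sketch of one idiom for listing a module's own members, assuming a reasonably recent compiler (the error quoted above suggests the 2012 compiler behaved differently). The module name membersdemo and the declarations alpha and Beta are hypothetical. The exact contents of the list are not asserted, since implicit imports may also show up in it; the sketch only checks that the module's own declarations are present.

```d
module membersdemo;

import std.algorithm.searching : canFind;

int alpha() { return 1; }
struct Beta {}

// __MODULE__ is the current module's name as a string; mixing it in
// yields the module symbol, which allMembers can then enumerate.
string[] moduleMembers()
{
    return [__traits(allMembers, mixin(__MODULE__))];
}

void main()
{
    enum names = moduleMembers(); // evaluated at compile time
    static assert(names.canFind("alpha"));
    static assert(names.canFind("Beta"));
}
```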

Aside: I first wrote the above code as follows:

import std.algorithm, std.functional, std.range;

// Shouldn't this be in Phobos somewhere?
bool contains(alias pred = "a == b", R, E)(R haystack, E needle)
     if (isInputRange!R &&
         is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
{
	return !(find!(pred, R, E)(haystack, needle).empty);
}

struct OhCrap {
	void a() {}
	static if ([ __traits(allMembers, OhCrap) ].contains("a")) {
		auto b() { return 2; }
	}
	void c() {}
}

But it causes a series of 204 error messages that I don't 
understand.
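
For comparison, a sketch of the same kind of query made from outside the struct whose members are being listed, which avoids asking for a member list while that list is still being built. It assumes a current Phobos, where std.algorithm provides canFind for this sort of membership test; Plain and Extended are hypothetical names. It is not a diagnosis of the 204 errors above.

```d
import std.algorithm.searching : canFind;

// Fully defined before anyone asks for its members.
struct Plain
{
    void a() {}
    void c() {}
}

// A different struct whose contents depend on Plain's members.
struct Extended
{
    static if ([__traits(allMembers, Plain)].canFind("a"))
    {
        auto b() { return 2; }
    }
}

void main()
{
    enum names = [__traits(allMembers, Plain)];
    static assert(names.canFind("a"));
    static assert(names.canFind("c"));
    static assert(!names.canFind("b"));

    Extended e;
    assert(e.b() == 2);
}
```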

