Is there a list of things which are slow to compile?

Wed Jun 3 17:02:35 UTC 2020

On Wed, Jun 03, 2020 at 09:36:52AM +0000, drathier via Digitalmars-d-learn wrote:
> I'm wondering if there's a place that lists things which are
> slower/faster to compile? DMD is pretty famed for compiling quickly,
> but I'm not seeing particularly high speed at all, and I want to fix
> that.

The two usual culprits are:
- Recursive/chained templates
- Excessive CTFE

Note that while the current CTFE engine is slow, it's still reasonably
fast for short computations. Just don't write nested loops or loops with
a huge number of iterations inside your CTFE code, and you should be
fine. And on that note, even running std.format with all of its
complexity inside CTFE is reasonably fast, as long as you don't do it
too often; so generally you won't see a problem here unless you have a
loop with too many iterations or too deeply-nested loops running in
CTFE.
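
To make "short computations" concrete, here is a minimal sketch of the
kind of CTFE that should stay cheap: one small loop and a single
std.format call, both forced to run at compile time by an enum
initializer. The names sumTo, small and label are made up for
illustration.

```d
import std.format : format;

// Hypothetical helper: a short, single-level loop.
int sumTo(int n)
{
    int total = 0;
    foreach (i; 1 .. n + 1)
        total += i;
    return total;
}

// An enum initializer must be evaluated at compile time, so this runs in CTFE.
enum small = sumTo(100);

// One std.format call in CTFE; fine as long as it is not done too often.
enum label = format("sum=%d", small);

static assert(small == 5050);
static assert(label == "sum=5050");

void main()
{
    import std.stdio : writeln;
    writeln(label);
}
```

The thing to avoid is the same pattern with a huge iteration count or
several levels of nested loops inside the function called from the enum.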

Templates are generally reasonably OK, until you use too many recursive
templates. Or if you chain too many of them together, like if you have
excessively long UFCS chains with Phobos algorithms. Short chains are
generally OK, but once they start getting long they will generate large
symbols and large numbers of instantiations. Large symbols used to be a
big problem, but ever since Rainer's fix they have generally been a lot
tamer. But still, it's something to avoid unless you can't help it.

Recursive templates are generally bad because they tend to produce a
super-linear number of instantiations, which consume lots of compiler
memory and also slow things down. Use too many of them, and things will
quickly slow to a crawl.
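
As a rough illustration (a sketch, not a benchmark), here is the same
compile-time sum written recursively and with a loop. SumRec and SumLoop
are hypothetical names. The recursive form creates one instantiation per
remaining element, each with its own argument list, so the total size of
those lists grows quadratically with the input; the loop form needs only
one instantiation per call site.

```d
// Recursive form: SumRec!(1,2,3,4) instantiates SumRec!(2,3,4),
// SumRec!(3,4), SumRec!(4) and SumRec!() along the way.
template SumRec(values...)
{
    static if (values.length == 0)
        enum SumRec = 0;
    else
        enum SumRec = values[0] + SumRec!(values[1 .. $]);
}

// Loop form: a single instantiation; the foreach over the compile-time
// sequence is unrolled inside one function literal evaluated in CTFE.
template SumLoop(values...)
{
    enum SumLoop = () {
        int total = 0;
        foreach (v; values)
            total += v;
        return total;
    }();
}

static assert(SumRec!(1, 2, 3, 4) == 10);
static assert(SumLoop!(1, 2, 3, 4) == 10);

void main() {}
```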

Worst is if you combine both deeply-nested templates and CTFE, like
std.regex does. Similarly, std.format (which includes writefln & co)
tends to add 1-2 seconds to compile time.

Another is if you have an excessively long function body; IIRC there are
some O(n^2) algorithms in the compiler w.r.t. the length of the function
body. But I don't expect normal code to reach the point where this
begins to matter; generally you won't run into this unless your code is
*really* poorly written (like the entire application inside main()), or
you're using excessive code generation (like the mixin of a huge
procedurally generated string).
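
If you do generate code with string mixins, one way to keep function
bodies short is to emit many small functions instead of one giant body.
This is only a sketch under the assumption that the quadratic cost is
per function body, as described above; makeFuncs and the generated names
f0, f1, ... are hypothetical.

```d
// Hypothetical generator: builds source text for `count` tiny functions
// rather than one function with `count` statements in its body.
string makeFuncs(int count)
{
    import std.conv : to;

    string code;
    foreach (i; 0 .. count)
        code ~= "int f" ~ i.to!string ~ "() { return " ~ i.to!string ~ "; }\n";
    return code;
}

// The generator runs in CTFE; the resulting string is compiled as
// module-level declarations.
mixin(makeFuncs(3));

static assert(f0() == 0 && f1() == 1 && f2() == 2);

void main() {}
```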

Identifier lengths are generally no problem unless you're talking about
100KB-long identifiers, which used to be a problem until Rainer
implemented backreferences in the mangling. But I don't expect normal
code to generate symbols of this order of magnitude unless you're using
excessively-long UFCS chains with nested templates. Identifier length
generally doesn't even register on the radar unless it's ridiculously
long, like tens or hundreds of KB long -- not something a human would
type. What humans would consider a long identifier, like a Java-style
name that spans 50 characters, is mere round-off error and probably
doesn't even make a measurable difference. The problem really only begins
to surface when your identifier has 10,000 characters or more.
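
If you want to see how large the generated names in your own code
actually are, the .mangleof property can be inspected at compile time.
The sketch below compares a short and a longer UFCS chain; shortChain and
longChain are made-up names, and the numbers printed depend on your
compiler version, so none are quoted here.

```d
import std.algorithm : equal, filter, map;
import std.range : iota;

// Hypothetical example chains; each extra stage nests another template
// instantiation inside the resulting range type.
auto shortChain() { return iota(10).map!(x => x * 2); }
auto longChain()  { return shortChain().filter!(x => x % 3 == 0).map!(x => x + 1); }

// Length of the mangled type name, computed at compile time.
enum shortLen = typeof(shortChain()).mangleof.length;
enum longLen  = typeof(longChain()).mangleof.length;

// Printed by the compiler during the build, not at run time.
pragma(msg, "short chain mangled length: ", shortLen);
pragma(msg, "long chain mangled length:  ", longLen);

void main()
{
    // The chain still computes what you'd expect at run time.
    assert(longChain().equal([1, 7, 13, 19]));
}
```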

Comments are not even a blip on the radar: lexing is the fastest part of
the compilation process.  Similarly, aliases are extremely cheap; they're
not even on the radar. Delegates have only a runtime cost; they are
similarly unnoticeably cheap during compilation.  So are Variants,
unless you're running Variants inside CTFE (which I don't think even
works).

T

-- 
Why waste time reinventing the wheel, when you could be reinventing the engine? -- Damian Conway