Building C++ modules

H. S. Teoh hsteoh at quickfur.ath.cx
Mon Aug 12 19:58:10 UTC 2019


On Mon, Aug 12, 2019 at 12:33:07PM -0700, Walter Bright via Digitalmars-d wrote:
[...]
> 1. Parsing is completely separate from semantic analysis. I.e. all
> code can be lexed/parsed in parallel, or in any order, without concern
> for dependencies.

This is a big part of why C++'s
must-be-semantically-analyzed-before-it-can-be-parsed syntax is such a
hindrance to meaningful progress.  The only way to handle such a
needlessly over-complex syntax is an equally over-complex lexer/parser
combo, which in turn breeds over-complex corner cases and other
gotchas.  Part of this nastiness is the poor choice of template
syntax: '<' and '>' are overloaded as delimiters on top of their
original roles as comparison operators, so a line like "a < b > c;"
cannot even be parsed until the compiler knows whether 'a' names a
template.  And that's just one of several such problems.
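
To make that concrete, here's a minimal D sketch (the struct is
purely illustrative) of how D's '!' instantiation operator sidesteps
the ambiguity:

    // The classic C++ headache: "a < b > c;" cannot be parsed without
    // first knowing whether 'a' names a template.  It is either a
    // chained comparison or a declaration of 'c' of type a<b>.
    // D reserves '<' and '>' for comparisons and uses '!' for
    // instantiation, so the parser never needs semantic information.
    struct Pair(T) { T first, second; }

    void main()
    {
        Pair!int p;              // unambiguously an instantiation
        Pair!(Pair!int) nested;  // nesting requires no parser heroics
    }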


> 2. Semantic analysis is lazy, i.e. it is done symbol-by-symbol on
> demand. In the above example, when y is encountered, the compiler goes
> "y is an enum, I'd better suspend the semantic analysis of x and go do
> the semantic analysis for y now".

This is an extremely powerful approach, and many may not be aware that
it's a cornerstone on which D's metaprogramming capabilities are
built.  It's a beautiful application of the principle of laziness:
don't do the work until it's actually needed.  That's a principle too
many of today's applications fail to observe, to their own detriment.
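
Here's a minimal sketch of what that buys you in practice (the names
are mine): declaration order at module scope simply stops mattering,
and CTFE falls out of the same machinery:

    // The compiler starts analyzing 'x', hits the not-yet-analyzed
    // symbol 'y', suspends 'x', analyzes 'y' (which forces
    // compile-time evaluation of 'compute'), then resumes 'x'.
    enum x = y + 1;
    enum y = compute();

    // An ordinary function, evaluated at compile time here only
    // because a compile-time context (the initializer of 'y')
    // demands it.
    int compute() { return 41; }

    static assert(x == 42);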


> 3. D is able to partially semantically analyze things. This comes into
> play when two structs mutually refer to each other. It does this well
> enough that only rarely do "circular reference" errors come up that
> possibly could be resolved.

I wasn't aware of this before, but it makes sense, in retrospect.
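
A minimal example of the kind of mutual reference that partial
analysis handles:

    // Analyzing Node needs only Edge's identity (enough to type a
    // pointer), not Edge's full layout, so the cycle resolves
    // cleanly.
    struct Node
    {
        Edge* firstEdge;
    }

    struct Edge
    {
        Node* from;
        Node* to;
        int   weight;
    }

By-value mutual containment, on the other hand, is one of the
genuinely unresolvable circular cases, since neither struct's size
can be computed before the other's.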


> D processes imports by reading the file and doing a parse on them.
> Only "on demand" does it do semantic analysis on them. My original
> plan was to parse them and write a binary file, and then the import
> would simply and quickly load the binary file. But it turned out the
> code to load the binary file and turn it into an AST wasn't much
> better than simply parsing the source code, so I abandoned the binary
> file approach.
[...]

That's an interesting data point.  I've been toying with the same idea
over the years, but it seems to be a dead-end approach.  In any case,
from what I've gathered, lexing and parsing are nowhere near the
bottleneck as far as D compilation is concerned.  (It might be
different for a language like C++, but even there I doubt parsing
plays much of a role in overall compilation performance; semantic
analysis and codegen pose far harder problems, requiring algorithms
with non-trivial running times.)  There are bigger fish to fry
elsewhere in the compiler.
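
To illustrate the on-demand analysis of imports that Walter describes
above, a hypothetical two-module sketch (module and symbol names are
mine):

    // --- helper.d: parsed in full when imported, analyzed on demand
    module helper;

    int answer() { return 42; }

    // Never instantiated by main.d, so its body is never semantically
    // analyzed and the static assert never fires.
    void unused(T)() { static assert(0, "analyzed only on demand"); }

    // --- main.d
    import helper;

    void main()
    {
        auto x = answer();  // only the symbols actually used get
                            // semantic analysis
    }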

(Like *cough*memory usage*ahem*, which to this day makes D a laughing
stock on low-memory systems.  Even with -lowmem, the situation today
isn't much better than it was a year or two ago.  My hands are tied
w.r.t. D as far as low-memory systems are concerned, and that's a very
sad thing, since I'd have liked to replace many things with D.
Currently I can't: either dmd outright won't run, and I have to build
executables offline and upload them, or else I have to build the dmd
toolchain offline and upload it to the low-memory target system.  Both
choices suck.)


T

-- 
Why can't you just be a nonconformist like everyone else? -- YHL

