Compilation strategy

H. S. Teoh hsteoh at quickfur.ath.cx
Sun Dec 16 16:48:17 PST 2012


On Sun, Dec 16, 2012 at 03:06:16PM -0800, Jonathan M Davis wrote:
> On Sunday, December 16, 2012 23:32:38 Andrej Mitrovic wrote:
> > On 12/16/12, Paulo Pinto <pjmlp at progtools.org> wrote:
> > > If modules are used correctly, a .di should be created with the
> > > public interface and everything else is already in binary format,
> > > thus the compiler is not really parsing everything all the time.
> > 
> > A lot of D code tends to be templated code, .di files don't help you
> > in that case.
> 
> And .di files don't work with CTFE or inlining. In general, .di files
> are a horrible idea.

Yeah, that's another thing that C++ did wrong, that unfortunately we
sorta inherited in a partial way.


> I tend to be of the opinion that they shouldn't even exist, but some
> corporate types require that sort of thing when distributing libraries
> to 3rd parties, so we need some sort of header solution. A better one
> probably would have been a binary format where the code is partially
> compiled with documentation providing a human-readable API, but that's
> something that we'll have to look into in the future. For now, we're
> stuck with .di files.
[...]

I've proposed this before: when compiling a library, the compiler should
process everything into an intermediate form (perhaps even the IR that's
used internally to integrate with the codegen), then save that as a sort
of partially-compiled object format. This can either be in the form of
an object file, or perhaps a custom D-specific format. Everything,
including templates, function bodies, etc., is included.  Then when you
import something from the library, the compiler simply loads the IR from
the intermediate format and uses that info to compile the code.
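
As a rough analogy only (this is Python, not D, and not how any D compiler
actually works): Python's own bytecode cache follows a similar principle. The
source is lexed and parsed once, the result is serialized, and later imports
load the serialized form without touching the source text. A minimal sketch,
where compile_library, import_from_ir, and the "mylib" module are
hypothetical names made up for illustration:

```python
import marshal

def compile_library(source: str, module_name: str) -> bytes:
    """Lex, parse and compile the library once; return a serialized form.

    Function bodies are part of the result, so nothing has to be
    re-read from the source text later.
    """
    code = compile(source, module_name, "exec")
    return marshal.dumps(code)

def import_from_ir(blob: bytes) -> dict:
    """Load the serialized form and make its symbols available.

    No lexing or parsing of source text happens here.
    """
    code = marshal.loads(blob)
    namespace = {}
    exec(code, namespace)
    return namespace

# Hypothetical library source, standing in for a D module.
LIB_SOURCE = "def twice(x):\n    return 2 * x\n"

blob = compile_library(LIB_SOURCE, "mylib")   # done once, by the library author
lib = import_from_ir(blob)                    # done on every import
result = lib["twice"](21)
```

The blob is not meant to be human-readable, but it is not a security
boundary either, which matches the point about reverse engineering below.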

This eliminates the wasted time relexing and reparsing imported files
every single time they're used, and also suitably "hides" the
implementation details of the library in a format that isn't easily
read. (I mean, let's face it, if someone is desperate enough, he can
reverse engineer *anything*, so imagining that shipping a binary is
somehow "safer" than shipping a straightforward IR is nonsense. But in
the reasonable case, the library being in IR should be disincentive
enough for people to not try to break encapsulation.)

This intermediate format can be in the form of some kind of hash table
that the compiler can quickly look up referenced symbols from, so there
will be almost no overhead.
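
One way such a lookup could work, sketched in Python under made-up
assumptions: the intermediate file holds one serialized entry per symbol plus
an index mapping each symbol name to its offset and length, so the compiler
deserializes only the symbols that are actually referenced. The names
build_store and lookup, the JSON encoding, and the example symbols are all
hypothetical stand-ins for whatever a real format would use:

```python
import json

def build_store(symbols):
    """Pack per-symbol IR entries into one blob plus a hash-table index.

    Returns (index, blob), where index maps name -> (offset, length).
    """
    index = {}
    chunks = []
    offset = 0
    for name, ir in symbols.items():
        data = json.dumps(ir).encode("utf-8")
        index[name] = (offset, len(data))
        chunks.append(data)
        offset += len(data)
    return index, b"".join(chunks)

def lookup(index, blob, name):
    """Fetch and decode a single symbol; other entries are never touched."""
    offset, length = index[name]          # one hash-table probe
    return json.loads(blob[offset:offset + length].decode("utf-8"))

# Hypothetical library contents: a template and an ordinary function.
symbols = {
    "mylib.max": {"kind": "template", "params": ["T"], "body": "a > b ? a : b"},
    "mylib.helper": {"kind": "function", "params": [], "body": "return 1"},
}

index, blob = build_store(symbols)
entry = lookup(index, blob, "mylib.max")
```

The cost of an import then depends on how many symbols are referenced, not
on the size of the library.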


T


-- 
They say that "guns don't kill people, people kill people." Well I think the gun helps. If you just stood there and yelled BANG, I don't think you'd kill too many people. -- Eddie Izzard, Dressed to Kill

