AST files instead of DI interface files for faster compilation and easier distribution

timotheecour thelastmammoth at gmail.com
Tue Jun 12 02:07:10 PDT 2012


There's a current pull request to improve di file generation 
(https://github.com/D-Programming-Language/dmd/pull/945); I'd 
like to suggest further ideas.
As far as I understand, di interface files try to achieve these 
conflicting goals:

1) speed up compilation by avoiding having to reparse large files 
over and over.
2) hide implementation details for proprietary reasons
3) still maintain source code in some form to allow inlining and 
CTFE
4) be human readable

-Goals 2) and 3) are clearly contradictory, which calls for a 
command line switch (eg -hidesource), off by default. When set, 
it would remove implementation details where possible (ie for 
non-template and non-auto-return functions), at the cost of 
preventing inlining and CTFE for the corresponding exported API. 
That choice would be left to the user.
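
To illustrate the trade-off, here is a minimal sketch (all names 
are made up for the example, and -hidesource itself is only a 
proposed switch). It shows why CTFE needs the function body, and 
why templates and auto-return functions cannot have theirs 
stripped:

```d
// With the body visible, the compiler can run the call at compile time.
int square(int x) { return x * x; }
enum nine = square(3);          // evaluated through CTFE
static assert(nine == 9);

// A source-hiding interface file would only carry the declaration:
//     int square(int x);
// Ordinary calls would still resolve at link time, but the `enum`
// line above would no longer compile, because CTFE has no source
// to interpret, and the call could not be inlined either.

// A template needs its body to be instantiated by the importer.
T twice(T)(T x) { return x + x; }
static assert(twice(21) == 42);

// An auto-return function needs its body to infer the return type.
auto sumOfSquares(int a, int b) { return a * a + b * b; }
static assert(is(typeof(sumOfSquares(1, 2)) == int));
```

So only functions of the first kind are candidates for hiding, 
and hiding them costs CTFE and inlining for those functions.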

-Regarding point 1), it will not be unusual for a D interface 
file to be almost as large (and as slow to parse) as the original 
source file, even with the upcoming di file improvements 
(dmd/pull/945), since D encourages the use of templates and 
auto-return functions throughout (a large part of phobos would be 
left almost unchanged). In fact, the fast compile time of D 
_does_ suffer when templates are used heavily, or as a project 
scales up.

So to make interface files really useful in terms of speeding up 
compilation, why not directly store the AST (could be text-based 
like JSON but preferably a portable binary format for speed, call 
it a ".dib" file), possibly with some amount of analysis already 
done (eg: version(Windows) could be pre-handled). This would be 
analogous to precompiled header files 
(http://en.wikipedia.org/wiki/Precompiled_header), which don't 
exist in D AFAIK. This could be done by extending the currently 
incomplete json file generation by dmd to include the AST of the 
implementation of each function we want to export (such as 
templates or functions to inline). During compilation of a 
module, "import myfun;" would look for 1) myfun.dib (binary or 
json precompiled interface file), 2) myfun.di (if still needed), 
3) myfun.d.
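
The lookup order above could be sketched roughly as follows. This 
is not dmd code: the names resolveImport and candidateExtensions 
are hypothetical, the .dib extension is only the one proposed 
here, and I am assuming the three extensions are tried inside 
each import directory before moving on to the next directory.

```d
import std.array : replace;
import std.file : exists;
import std.path : buildPath, dirSeparator;

/// Proposed preference order: precompiled AST, then interface file,
/// then full source. ".dib" is hypothetical.
immutable string[] candidateExtensions = [".dib", ".di", ".d"];

/// Returns the first existing file for `moduleName` found in
/// `importPaths`, or null if no candidate exists.
string resolveImport(string moduleName, string[] importPaths)
{
    // A qualified name such as "pkg.myfun" maps to "pkg/myfun".
    auto relative = moduleName.replace(".", dirSeparator);

    foreach (dir; importPaths)
    {
        foreach (ext; candidateExtensions)
        {
            auto candidate = buildPath(dir, relative ~ ext);
            if (candidate.exists)
                return candidate;
        }
    }
    return null;
}
```

With this ordering a stale myfun.dib would shadow a newer 
myfun.d, so a real implementation would probably also need some 
timestamp or hash check, which the sketch leaves out.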



We could even go a step further, borrowing some ideas from the 
"framework" feature found in OSX to distribute components: a 
single D framework would combine the AST (~ precompiled .dib 
headers) of a set of D modules and a set of libraries.
The user would then use a framework as follows:

     dmd -L-framework mylib -L-Lpath/to/mylib main.d

or simply:

     dmd main.d

if main.d contains pragma(framework,"mylib") and framework mylib 
is in the search path.

As in OSX's frameworks, framework mylib is used both during 
compilation (resolving import statements in main.d) and linking. 
Upon encountering an "import myfun;" declaration, the compiler 
would search the linked-in frameworks for a symbol or file 
representing the AST of module myfun and, if none is found, fall 
back to the default import mechanism.
That would both speed up compilation and make distribution and 
versioning of libraries a breeze: a single framework to download 
and to link against (this is different from what rdmd does). On 
OSX, frameworks appear as a single file in Finder but are 
actually directories; here we could have either a single file or 
a directory.
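
As a rough sketch of the framework lookup with its fallback (the 
Framework struct, its fields and findInFrameworks are all 
hypothetical names for illustration, nothing like this exists in 
dmd):

```d
/// Hypothetical in-memory view of a framework: the precompiled ASTs
/// of its modules plus the libraries to hand to the linker.
struct Framework
{
    string name;
    string[string] astFiles;  // module name -> path of its .dib file
    string[] libraries;       // library files passed to the linker
}

/// Returns the AST file for `moduleName` from the first framework
/// that provides it, or null so that the caller can fall back to
/// the default import mechanism.
string findInFrameworks(string moduleName, Framework[] frameworks)
{
    foreach (fw; frameworks)
    {
        if (auto path = moduleName in fw.astFiles)
            return *path;
    }
    return null;
}
```

The same Framework value would serve both phases: astFiles during 
compilation, libraries at link time.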

Finally, regarding point 4), a simple command line switch (eg dmd 
--pretty-print myfun.di) would pretty-print the AST to stdout, 
omitting the implementation of templates and auto functions for 
brevity, so that the output reads like a simple di file (further 
options could filter out AST nodes for IDE use, etc).

Thanks for your comments!

