AST files instead of DI interface files for faster compilation and easier distribution
timotheecour
thelastmammoth at gmail.com
Tue Jun 12 02:07:10 PDT 2012
There's a current pull request to improve di file generation
(https://github.com/D-Programming-Language/dmd/pull/945); I'd
like to suggest further ideas.
As far as I understand, di interface files try to achieve these
conflicting goals:
1) speed up compilation by avoiding having to reparse large files
over and over.
2) hide implementation details for proprietary reasons
3) still maintain source code in some form to allow inlining and
CTFE
4) be human readable
Goals 2) and 3) are clearly contradictory, so that calls for a
command line switch (e.g. -hidesource), off by default. When set,
it would remove implementation details wherever possible (i.e.
for non-template, non-auto-return functions), at the cost of
preventing inlining and CTFE for the corresponding exported API.
That choice would be left to the user.
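To make the trade-off concrete, here is a small sketch of why
hiding a body rules out CTFE. The function names are made up for
illustration, and -hidesource is only the switch proposed above,
not an existing dmd flag.

```d
import std.stdio : writeln;

// An ordinary function. Its body could be stripped from an
// interface file, leaving only `int square(int x);`.
int square(int x)
{
    return x * x;
}

// A template with an inferred return type. Its body has to stay
// in any interface file, because it is instantiated in the
// importing module.
auto twice(T)(T x)
{
    return x + x;
}

// CTFE: this is evaluated during compilation, which is only
// possible because the body of square() is visible here.
enum nine = square(3);
static assert(nine == 9);

// If an importer saw only the declaration `int square(int x);`
// (as with the hypothetical -hidesource switch), the `enum nine`
// line above would be rejected, since CTFE needs the source.

void main()
{
    writeln(nine);
    writeln(twice(2.5));
}
```

A plain runtime call to square() would still link fine against the
library in the hidden-source case; only inlining and CTFE are lost.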
Regarding point 1), it would not be unusual for a D interface
file to be almost as large (and as slow to parse) as the original
source file, even with the upcoming di file improvements
(dmd/pull/945), since D encourages the use of templates and
auto-return functions throughout (a large part of phobos would be
left nearly unchanged). In fact, the fast compile time of D _does_
suffer when templates are used heavily, or as projects scale up.
So to make interface files really useful in terms of speeding up
compilation, why not directly store the AST (it could be
text-based like JSON, but preferably a portable binary format for
speed; call it a ".dib" file), possibly with some amount of
analysis already done (e.g. version(Windows) could be
pre-handled). This would be analogous to precompiled header files
(http://en.wikipedia.org/wiki/Precompiled_header), which don't
exist in D AFAIK. This could be done by extending dmd's currently
incomplete json file generation to include the AST of the
implementation of each function we want to export (such as
templates or functions to inline). During compilation of a module,
"import myfun;" would look for 1) myfun.dib (binary or json
precompiled interface file), 2) myfun.di (if still needed), 3)
myfun.d.
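The lookup order could be sketched roughly as follows. This is
only an illustration, not dmd's actual code: resolveImport is a
made-up name, ".dib" is the extension proposed above, and trying
all three extensions in one directory before moving on to the next
is an assumption.

```d
import std.array : replace;
import std.file : exists;
import std.path : buildPath;
import std.stdio : writeln;

/// Returns the first file found for `moduleName`, or null if
/// there is none. In each import directory the candidates are
/// tried in the proposed order: .dib, then .di, then .d.
string resolveImport(string moduleName, string[] importPaths)
{
    // Package separators map to directories: a.b.c -> a/b/c
    string relative = moduleName.replace(".", "/");
    static immutable string[] extensions = [".dib", ".di", ".d"];

    foreach (dir; importPaths)
    {
        foreach (ext; extensions)
        {
            string candidate = buildPath(dir, relative ~ ext);
            if (candidate.exists)
                return candidate;
        }
    }
    return null; // caller reports "module not found"
}

void main()
{
    // Prints the path of the first match, or an empty line.
    writeln(resolveImport("myfun", ["."]));
}
```

A real implementation would presumably also check that a .dib file
is not older than the matching .d file before trusting it.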
We could even go a step further, borrowing some ideas from the
"framework" feature found in OSX to distribute components: a
single D framework would combine the AST (~ precompiled .dib
headers) of a set of D modules and a set of libraries.
The user would then use a framework as follows:
dmd -L-framework mylib -L-Lpath/to/mylib main.d
or simply:
dmd main.d
if main.d contains pragma(framework,"mylib") and framework mylib
is in the search path
As with OSX frameworks, framework mylib would be used both during
compilation (to resolve import statements in main.d) and during
linking. Upon encountering an "import myfun;" declaration, the
compiler would search the linked-in frameworks for a symbol or
file representing the AST of module myfun and, if none is found,
fall back to the default import mechanism.
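As a rough sketch of that search, a framework can be modelled as a
bundle of serialized ASTs plus the libraries to link. The
Framework struct, its fields and findAst are all hypothetical
names, and the AST is shown as text (the JSON option) purely to
keep the example short.

```d
import std.stdio : writeln;

/// Hypothetical in-memory view of one framework bundle.
struct Framework
{
    string name;
    string[string] asts; // module name -> serialized AST
    string[] libraries;  // libraries handed to the linker
}

/// Returns the serialized AST of `moduleName` from the first
/// framework that provides it, or null if none does.
string findAst(const Framework[] frameworks, string moduleName)
{
    foreach (ref fw; frameworks)
    {
        if (auto ast = moduleName in fw.asts)
            return *ast;
    }
    return null;
}

void main()
{
    auto mylib = Framework("mylib",
        ["myfun": `{"kind":"module","name":"myfun"}`],
        ["libmylib.a"]);
    auto frameworks = [mylib];

    foreach (name; ["myfun", "other"])
    {
        if (findAst(frameworks, name) !is null)
            writeln(name, ": AST taken from a framework");
        else
            writeln(name, ": default import mechanism");
    }
}
```

The same Framework value would then supply its libraries list at
link time, which is what lets one bundle serve both phases.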
That would both speed up compilation and make distributing and
versioning libraries a breeze: a single framework to download and
link against (this is different from what rdmd does). On OSX,
frameworks appear as a single file in Finder but are actually
directories; here we could support either a single file or a
directory.
Finally, regarding point 4), a simple command line switch (e.g.
dmd --pretty-print myfun.di) would pretty-print the AST to stdout,
omitting the implementation of templates and auto functions for
brevity, so that the output reads like a simple di file (further
options could filter AST nodes for IDE use, etc.).
Thanks for your comments!