D Compiler as a Library

Roman D. Boiko rb at d-coding.com
Thu Apr 19 02:24:22 PDT 2012


On Friday, 13 April 2012 at 09:57:49 UTC, Ary Manzana wrote:
> Having a D compiler available as a library will (at least) give 
> these benefits:
>
>   1. Can be used by an IDE: D is statically typed and so an IDE 
> can benefit a lot from this. The features Descent had, as far 
> as I remember, were:
>     1.1. Outline
>     1.2. Autocompletion
>     1.3. Type Hierarchy
>     1.4. Syntax and semantic errors, showing not only the line 
> number but also column numbers if it makes sense
>     1.5. Automatic import inclusion (say, typing writefln and 
> getting a list of modules that provide that symbol)
>     1.6. Compile-time view: replace auto with the inferred 
> type, insert mixins into scope, rewrite operator overloads and 
> other lowerings (but I'm not sure this point is really useful)
>     1.7. Determine, given a set of versions and flags, which 
> branches of static ifs are used/unused
>     1.8. Open declaration
>     1.9. Show implementations (of an interface, of an 
> interface's method, of abstract methods, or method overrides).
>     1.10. Propose to override a method (you type some letters 
> and then hit some key combination and get a list of methods to 
> override)
>     1.11. Get the code of a template when instantiated.
>  2. Can be used to build better doc generators: one that shows 
> known subclasses or interface implementation, shows inherited 
> methods, type hierarchy.
>  3. Can be used for lints and other such tools.
>
> As you can see, a simple lexer/parser built into an IDE, doc 
> generator or lint will just give basic features but will never 
> achieve something exceptionally good if it lacks the full 
> semantic knowledge of the code.
>
> I'll write a list of things I'd like this compiler-as-library 
> to have, but please help me make it bigger :-)
>
>  * Don't use global variables (DMD is just thought to be run 
> once, so when used as a library it can just be used, well, once)
>  * Provide a lexer which gives line numbers and column numbers 
> (beginning, end)
>  * Provide a parser with the same features
>  * The semantic phase should not discard any information found 
> while parsing. For example when DMD resolves a type it 
> recursively resolves aliasing and keeps the last one. An 
> example:
>
>   alias int foo;
>   alias foo* bar;
>
>   bar something() { ... }
>
>   It would be nice if "bar", after semantic analysis is done, 
> carries the information that bar is "foo*" and that "foo" is 
> "int". Also that something's return type is "bar", not "int*".
>  * Provide errors and warnings that have line numbers as well 
> as column numbers.
>  * Allow parsing only the top-level definitions of a module. By 
> this I mean skipping function bodies. At least Descent first 
> built the outline of the whole project by doing this. This 
> mode should also allow specifying a location as a target, and 
> if that location falls inside a function body then its 
> contents are returned (useful when editing a file, so you can 
> get the outline as well as semantic info of the function 
> currently being edited, which will never affect semantic in 
> other parts of the module). This will dramatically speed up the 
> editor.
>  * Don't stop parsing on errors (I think DMD already does this).
>  * Provide a visitor class. If possible, use visitors to 
> implement semantic analysis. The visitor will make it super 
> easy to implement lints and to generate documentation.

By the way, I also started a project called The D Compiler Tools 
(DCT) about a month ago (https://github.com/roman-d-boiko/DCT). It 
is provided under the Boost license, and its goal is to enable 
building third-party tools with functionality that would include 
what is described above. I'm trying to build an LLVM-based codegen 
and also to reuse the frontend in a separate project with basic IDE 
functionality for D.

I have never implemented a compiler before, and probably should 
have called my project SDC if only that name had not already been 
taken ;) The goals are very similar to those of SDC (especially 
now, after its re-licensing). But I can't commit to ever finishing 
the project, because my free time is very limited :(.

There is SIGNIFICANTLY less functionality implemented at this 
moment than in SDC. Currently, only primitive lexing is in place 
(I follow the KISS principle where possible), plus parsing of auto 
declarations (auto i = 3 * (2 + 8), etc.) with stubs for most 
other cases. (Please note that the project's Readme file is 
outdated.) The parser is top-down recursive descent, and it follows 
the specification very closely, except for some differences needed 
to simplify the implementation (like using loops to implement the 
left-recursive rules in the specification).
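
To illustrate the last point, here is a minimal sketch of replacing a 
left-recursive rule with a loop in a recursive descent parser. It is 
not DCT's actual code: the Parser struct and its method names are 
made up for this example, it evaluates the expression directly instead 
of building an AST, and it handles only integers, + - * / and 
parentheses.

```d
import std.ascii : isDigit, isWhite;

// Hypothetical mini-parser for illustration only (not taken from DCT).
struct Parser
{
    string src;
    size_t pos;

    void skipWhite()
    {
        while (pos < src.length && isWhite(src[pos]))
            ++pos;
    }

    // Consume character c if it is next (after whitespace).
    bool accept(char c)
    {
        skipWhite();
        if (pos < src.length && src[pos] == c)
        {
            ++pos;
            return true;
        }
        return false;
    }

    // Left-recursive in the grammar:
    //   AddExpression: MulExpression
    //                | AddExpression ('+' | '-') MulExpression
    // Calling parseAdd() first would recurse forever, so parse one
    // operand and then fold further operands in a loop. This keeps
    // the operators left-associative.
    long parseAdd()
    {
        long value = parseMul();
        for (;;)
        {
            if (accept('+'))
                value += parseMul();
            else if (accept('-'))
                value -= parseMul();
            else
                return value;
        }
    }

    // Same transformation for the multiplicative level.
    long parseMul()
    {
        long value = parsePrimary();
        for (;;)
        {
            if (accept('*'))
                value *= parsePrimary();
            else if (accept('/'))
                value /= parsePrimary();
            else
                return value;
        }
    }

    // Primary: integer literal or parenthesized expression.
    long parsePrimary()
    {
        if (accept('('))
        {
            long value = parseAdd();
            if (!accept(')'))
                throw new Exception("expected ')'");
            return value;
        }
        skipWhite();
        if (pos >= src.length || !isDigit(src[pos]))
            throw new Exception("expected a number");
        long value = 0;
        while (pos < src.length && isDigit(src[pos]))
            value = value * 10 + (src[pos++] - '0');
        return value;
    }
}

void main()
{
    auto p = Parser("3 * (2 + 8)");
    assert(p.parseAdd() == 30);

    auto q = Parser("10 - 4 - 3");
    assert(q.parseAdd() == 3); // (10 - 4) - 3, not 10 - (4 - 3)
}
```

The second assert is the reason for using a loop rather than right 
recursion: a rule rewritten as "MulExpression '-' AddExpression" would 
parse, but it would group 10 - 4 - 3 as 10 - (4 - 3).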

Anyone interested in discussing DCT or participating in 
development would be welcome!


More information about the Digitalmars-d mailing list