Let's stop parser Hell

Jonathan M Davis jmdavisProg at gmx.com
Sat Jul 7 10:53:56 PDT 2012


On Saturday, July 07, 2012 13:05:29 Jacob Carlborg wrote:
> On 2012-07-07 03:12, Jonathan M Davis wrote:
> > Now, the issue of a "strong, dependable formalization of D's syntax" is
> > another thing entirely. Porting dmd's lexer and parser to Phobos would
> > keep the Phobos implementation in line with dmd much more easily and avoid
> > inconsistencies in the language definition and the like. However, if we
> > write a new lexer and parser specifically for Phobos which _doesn't_ port
> > the lexer or parser from dmd, then that _would_ help drive making the
> > spec match the compiler (or vice versa). So, I agree that could be a
> > definite argument for writing a lexer and parser from scratch rather than
> > porting the one from dmd, but I don't buy the bit about it smothering
> > parser generators at all. I think that the use cases are completely
> > different.
> 
> I think the whole point of having a compiler as a library is that the
> compiler should use the library as well. Otherwise the two will get out
> of sync.
> 
> Just look at Clang, LLVM, LLDB and Xcode, they took the correct
> approach. Clang and LLVM (and I think LLDB) are available as libraries.
> Then the compiler, debugger (lldb) and IDE uses these libraries as part
> of their implementation. They don't have their own implementation that
> is similar to the libraries, making it "easy" to stay in sync. They
> _use_ the libraries as libraries.
> 
> This is what DMD and Phobos should do as well. If it's too complicated
> to port the lexer/parser to D then it would be better, at least as a
> first step, to modify DMD as needed. Create a C API for DMD and then
> create D bindings to be put into Phobos.

There are multiple issues here. The one that Andrei is raising is the fact 
that D isn't properly formalized. Essentially, the compiler _is_ the spec, and 
while the online spec _mostly_ follows it, it doesn't entirely, and the online 
spec isn't always as precise as it needs to be regardless. With a fully 
formalized spec, it should be possible to fully implement a D compiler from 
the spec alone, and I don't think that that's currently possible.

Writing a D lexer and parser (if not a full-blown compiler) from scratch would 
help highlight the places in the spec which are lacking, and having it be part 
of Phobos would arguably increase Walter's incentive to make sure that the 
spec is in line with the compiler (and vice versa) so that stuff _other_ than 
the compiler which is based on the spec would be able to match the compiler.

Clang is in a _completely_ different situation. It's a C/C++ compiler, and both 
C and C++ already have official, formalized specs which Clang conforms to (or is 
supposed to anyway). Clang has no control over the spec at all. It merely 
implements it. So, there is no issue of trying to keep other tools or 
compilers in line with Clang due to it being the language's spec like we 
effectively have with dmd. It may help the tools which use Clang to be fully in 
line with Clang and not have to worry about whether Clang implements the spec 
slightly differently, but in theory, if they all follow the spec correctly, 
that isn't an issue (though obviously in practice it can be).

In D's case, all of the major D compilers use the same frontend, which helps 
compatibility but harms the spec, because there's less incentive to keep it 
precise and up-to-date. So, from the perspective of the spec, implementing 
the D lexer and parser for Phobos separately from dmd would be of great 
benefit.

IMHO, the reason that porting dmd's lexer and parser would be of great benefit 
is primarily maintenance. It makes it much easier to keep Phobos' lexer and 
parser in line with dmd, making discrepancies less likely, but it arguably 
reduces the incentive to improve the spec.

The benefits of having a lexer and parser as a library (regardless of whether 
it's from scratch or a port from dmd) are primarily derived from the fact that 
it makes it much easier to create tools which use them. Such tools no longer 
have to write their own lexers or parsers.

If the compiler uses the same library, it has the added benefit of making it so 
that any tool using the library will be in sync with the compiler, but if the 
spec is properly formalized and up-to-date, and the library is kept up-to-date 
with _it_, that shouldn't really be a problem. You still have the debate as to 
whether it's better to have a separate implementation based on the spec 
(thereby making it more likely that the spec is correct) or whether it's 
better to have the compiler share the implementation so that the library is 
guaranteed to match the compiler (though not necessarily the spec), but I 
think that that's a separate debate from whether we should have the lexer and 
parser as a library.

In all honesty though, I would be surprised if you could convince Walter to 
switch dmd's frontend to Phobos' lexer and parser even once they've been 
implemented. So, while I agree that there are benefits in doing so, I'm not 
sure how much chance you have of ever getting any traction with that.

Another thing to consider is that that might make it so that gdc and ldc 
couldn't share the same frontend with dmd (assuming that they have to keep 
their frontends in C or C++ - I don't know if they do) - but if so, that 
would increase the incentive for the spec to be correct if dmd ever started 
using a D frontend.

- Jonathan M Davis