Let's stop parser Hell
Jonathan M Davis
jmdavisProg at gmx.com
Sat Jul 7 10:53:56 PDT 2012
On Saturday, July 07, 2012 13:05:29 Jacob Carlborg wrote:
> On 2012-07-07 03:12, Jonathan M Davis wrote:
> > Now, the issue of a "strong, dependable formalization of D's syntax" is
> > another thing entirely. Porting dmd's lexer and parser to Phobos would
> > keep the Phobos implementation in line with dmd much more easily and avoid
> > inconsistencies in the language definition and the like. However, if we
> > write a new lexer and parser specifically for Phobos which _doesn't_ port
> > the lexer or parser from dmd, then that _would_ help drive making the
> > spec match the compiler (or vice versa). So, I agree that could be a
> > definite argument for writing a lexer and parser from scratch rather than
> > porting the one from dmd, but I don't buy the bit about it smothering
> > parser generators at all. I think that the use cases are completely
> > different.
>
> I think the whole point of having a compiler as a library is that the
> compiler should use the library as well. Otherwise the two will get out
> of sync.
>
> Just look at Clang, LLVM, LLDB and Xcode, they took the correct
> approach. Clang and LLVM (and I think LLDB) are available as libraries.
> Then the compiler, debugger (lldb) and IDE uses these libraries as part
> of their implementation. They don't have their own implementation that
> is similar to the libraries, making it "easy" to stay in sync. They
> _use_ the libraries as libraries.
>
> This is what DMD and Phobos should do as well. If it's too complicated
> to port the lexer/parser to D then it would be better, at least as a
> first step, to modify DMD as needed. Create a C API for DMD and then
> create D bindings to be put into Phobos.
There are multiple issues here. The one that Andrei is raising is the fact
that D isn't properly formalized. Essentially, the compiler _is_ the spec, and
while the online spec _mostly_ follows it, it doesn't entirely, and the online
spec isn't always as precise as it needs to be regardless. With a fully
formalized spec, it should be possible to fully implement a D compiler from
the spec alone, and I don't think that that's currently possible.
Writing a D lexer and parser (if not a full-blown compiler) from scratch would
help highlight the places in the spec which are lacking, and having it be part
of Phobos would arguably increase Walter's incentive to make sure that the
spec is in line with the compiler (and vice versa) so that stuff _other_ than
the compiler which is based on the spec would be able to match the compiler.
Clang is in a _completely_ different situation. It's a C/C++ compiler, and both
C and C++ already have official, formalized specs which Clang conforms to (or is
supposed to anyway). Clang has no control over the spec at all. It merely
implements it. So, there is no issue of trying to keep other tools or
compilers in line with Clang due to it being the language's spec like we
effectively have with dmd. It may help the tools which use Clang to be fully in
line with Clang and not have to worry about whether Clang implements the spec
slightly differently, but in theory, if they all follow the spec correctly,
that isn't an issue (though obviously in practice it can be).
In D's case, all of the major D compilers use the same frontend, which helps
compatibility but harms the spec, because there's less incentive to keep it
precise and up-to-date. So, from the perspective of the spec, implementing
the D lexer and parser for Phobos separately from dmd would be of great
benefit.
IMHO, the reason that porting dmd's lexer and parser would be of great benefit
is primarily maintenance. It makes it much easier to keep Phobos' lexer and
parser in line with dmd, making discrepancies less likely, but it arguably
reduces the incentive to improve the spec.
The benefits of having a lexer and parser as a library (regardless of whether
it's from scratch or a port from dmd) are primarily derived from the fact that
it makes it much easier to create tools which use them. Such tools no longer
have to write their own lexers or parsers.
If the compiler uses the same library, it has the added benefit of making it so
that any tool using the library will be in sync with the compiler, but if the
spec is properly formalized and up-to-date, and the library is kept up-to-date
with _it_, that shouldn't really be a problem. You still have the debate as to
whether it's better to have a separate implementation based on the spec
(thereby making it more likely that the spec is correct) or whether it's
better to have the compiler share the implementation so that the library is
guaranteed to match the compiler (though not necessarily the spec), but I
think that that's a separate debate from whether we should have the lexer and
parser as a library.
In all honesty though, I would be surprised if you could convince Walter to
switch dmd's frontend to Phobos' lexer and parser even once they've been
implemented. So, while I agree that there are benefits in doing so, I'm not
sure how much chance you have of ever getting any traction with that.
Another thing to consider is that doing so might mean that gdc and ldc
couldn't share the same frontend with dmd (assuming that they have to keep
their frontends in C or C++ - I don't know if they do) - but if so, that
would increase the incentive for the spec to be correct if dmd ever started
using a D frontend.
- Jonathan M Davis