DDMD and such.
Jonathan M Davis
jmdavisProg at gmx.com
Wed Sep 28 14:23:14 PDT 2011
On Wednesday, September 28, 2011 14:03 Nick Sabalausky wrote:
> "Jonathan M Davis" <jmdavisProg at gmx.com> wrote in message
> news:mailman.271.1317243599.26225.digitalmars-d at puremagic.com...
>
> > On Wednesday, September 28, 2011 13:43 Nick Sabalausky wrote:
> >> "Jonathan M Davis" <jmdavisProg at gmx.com> wrote in message
> >> news:mailman.261.1317239287.26225.digitalmars-d at puremagic.com...
> >>
> >> > I would point out that there is an intention to eventually get a D
> >> > lexer and parser into Phobos so that tools can take advantage of them.
> >> > Those could eventually lead to a frontend in D but would provide
> >> > benefits far beyond simply having the compiler in D.
> >>
> >> Is the interest more in a D-specific lexer/parser or a generalized one?
> >> Or is it more of a split vote? I seem to remember interest both ways,
> >> but I don't know whether there's any consensus among the DMD/Phobos crew.
> >>
> >> A generalized lexer is nothing more than a regex engine that has more
> >> than one distinct accept state (which then gets run over and over until
> >> EOF). And the FSM is made simply by doing a combined regex
> >> "(regexForToken1 | regexForToken2 | regexForToken3 | ... )", and then
> >> each of those parts just gets its own accept state. Which makes me
> >> wonder...
> >>
> >> There was a GSoC project to overhaul Phobos's regex engine, wasn't
> >> there? Is that done? Is it designed in a way that the stuff above
> >> wouldn't be real hard to add?
> >>
> >> And what about the algorithm? Is it a Thompson NFA (i.e., it traverses
> >> the NFA as if it were a DFA, effectively "creating" the DFA on-the-fly)?
> >> Or does it just traverse the NFA as an NFA? Or does it create an actual
> >> DFA and traverse that? An actual DFA would probably be best for a lexer.
> >> If a DFA, is it an optimized DFA? In my (limited) tests, it didn't seem
> >> like DFA-optimization would yield a notable benefit on typical
> >> programming-language tokens. It seems to be more suited to pathological
> >> cases.
> >
> > There is some desire to have a lexer and parser in Phobos which basically
> > have the same implementation as dmd (only in D instead of C++). That way,
> > they're very close to the actual compiler, and it's easy to port fixes and
> > improvements between the two.
>
> The lexer seems like something that would change only on rare occasions. Am
> I wrong?
Once it's appropriately stable, _most_ of the compiler should be changing on
relatively rare occasions, save for bug fixes. Granted, the lexer is probably
less likely to have bugs than many other parts of the compiler, so it's that
much less likely to change, but I'm not sure that the rate of change was really
the point.

The primary advantages of having the compiler's lexer and Phobos' lexer be
near-identical implementation-wise are that their behavior is that much closer
to being guaranteed to be the same and that it's easier to port fixes and
changes between them. Yes, if you have to port changes more often, the
similarities between the two are that much more critical, but even without
having to make such changes often, it's easier to make such changes when you
_do_ have to make them, and the simple fact that the behavior is almost
guaranteed to be the same is definitely valuable.

It may be that in the long run, the implementation could be changed to be less
dmd-compatible while keeping the same API - especially if dmd's frontend
ever changes to D (since then there would be no need to keep it in sync with
the C++ implementation) - but in the short term, it's definitely beneficial to
have them be near-identical. And Walter isn't going to want to use a generated
lexer or parser for dmd's frontend anyway (he's always used and prefers
hand-rolled ones), so if you want to try to get dmd's frontend to be in D, then
porting dmd's C++ frontend to D is your best bet anyway.
- Jonathan M Davis
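
For illustration only, the "generalized lexer" idea from the quoted message (one combined regex, with each alternative acting as its own accept state, run repeatedly until EOF) might be sketched roughly as below. This is a minimal sketch in Python rather than D, and the token set, the names TOKEN_SPEC and MASTER, and the tokenize function are all hypothetical, not anything from dmd or Phobos. Note also that Python's re module is a backtracking engine that picks the first alternative that matches rather than the longest match, so it only approximates the DFA-based design discussed above.

```python
import re

# Hypothetical token set for illustration; not taken from dmd or Phobos.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=]"),
    ("SKIP", r"\s+"),
]

# One combined pattern "(tok1 | tok2 | ...)". Each named group stands in
# for a distinct accept state: the group that matched identifies the token.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Run the combined regex over and over until the end of the input."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(
                f"unexpected character {text[pos]!r} at offset {pos}"
            )
        # lastgroup is the name of the alternative that matched.
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    for kind, lexeme in tokenize("x = 42 + y1"):
        print(kind, lexeme)
```

A hand-rolled lexer of the kind dmd uses would replace the combined pattern with explicit character-by-character code, which is the trade-off being discussed in this thread.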