DDMD and such.
Nick Sabalausky
a at a.a
Wed Sep 28 13:43:57 PDT 2011
"Jonathan M Davis" <jmdavisProg at gmx.com> wrote in message
news:mailman.261.1317239287.26225.digitalmars-d at puremagic.com...
>
> I would point out that there is an intention to eventually get a D lexer and
> parser into Phobos so that tools can take advantage of them. Those could
> eventually lead to a frontend in D but would provide benefits far beyond
> simply having the compiler in D.
>
Is the interest more in a D-specific lexer/parser or a generalized one? Or
is it more of a split vote? I seem to remember interest both ways, but I
don't know whether there's any consensus among the DMD/Phobos crew.

A generalized lexer is nothing more than a regex engine that has more than
one distinct accept state (which then gets run over and over until EOF). And
the FSM is made simply by doing a combined regex "(regexForToken1 |
regexForToken2 | regexForToken3 | ... )", and then each of those parts just
gets its own accept state. Which makes me wonder...
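
To make the combined-regex idea concrete, here is a minimal sketch in Python
(not D, and not anything from Phobos). The token names and patterns are made
up for illustration. Each alternative is a named group, and the name of the
group that matched stands in for the accept state. Note that Python's `re` is
a backtracking engine that takes the first alternative that matches rather
than the longest one, so this only approximates what a real DFA-based lexer
would do.

```python
import re

# Hypothetical token set, chosen only for illustration.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=]"),
    ("SKIP", r"\s+"),
]

# One combined regex "(tok1 | tok2 | ...)". Each alternative is a named group,
# and the name of the group that matched plays the role of the accept state.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Run the combined regex over and over until end of input."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "SKIP":  # whitespace is matched but not emitted
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Calling `tokenize("x = 42 + y1")` yields one (kind, text) pair per token, with
the kind taken from whichever alternative accepted.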

There was a GSoC project to overhaul Phobos's regex engine, wasn't there? Is
that done? Is it designed in a way that the stuff above wouldn't be real
hard to add?

And what about the algorithm? Is it a Thompson NFA (i.e., it traverses the NFA
as if it were a DFA, effectively "creating" the DFA on the fly)? Or does it
just traverse the NFA as an NFA? Or does it create an actual DFA and
traverse that? An actual DFA would probably be best for a lexer. If a DFA,
is it an optimized DFA? In my (limited) tests, it didn't seem like
DFA-optimization would yield a notable benefit on typical
programming-language tokens. It seems to be more suited to pathological
cases.
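
For anyone unsure what the difference between those traversal strategies looks
like, here is a rough Python sketch under my own assumptions. It uses a
hand-built NFA for (a|b)*abb with no epsilon edges. It says nothing about how
the Phobos engine actually works, and the names are invented. `match_nfa`
recomputes the set of live states at every character. `LazyDFA` does the same
walk but treats each state set as a single DFA state and caches its
transitions, so the DFA only gets "created" as far as the input requires. An
actual up-front DFA would fill that same table for every reachable state set
before reading any input.

```python
# Hypothetical hand-built NFA for the regex (a|b)*abb, with no epsilon edges.
NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START, ACCEPT = 0, 3


def step(states, ch):
    """Advance every live NFA state at once over one input character."""
    nxt = set()
    for s in states:
        nxt |= NFA.get((s, ch), set())
    return frozenset(nxt)


def match_nfa(text):
    """Traverse the NFA as an NFA: recompute the state set per character."""
    states = frozenset({START})
    for ch in text:
        states = step(states, ch)
    return ACCEPT in states


class LazyDFA:
    """Treat each NFA state set as one DFA state and cache its transitions,
    so DFA states are only built when the input actually reaches them."""

    def __init__(self):
        self.cache = {}  # (state set, char) -> next state set

    def match(self, text):
        state = frozenset({START})
        for ch in text:
            key = (state, ch)
            if key not in self.cache:
                self.cache[key] = step(state, ch)
            state = self.cache[key]
        return ACCEPT in state
```

Both matchers give the same answers. The cached version just stops paying
for the set arithmetic once a given transition has been seen, which is the
usual argument for it in a lexer that hits the same handful of states over
and over.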