DDMD and such.
Dmitry Olshansky
dmitry.olsh at gmail.com
Wed Sep 28 14:40:23 PDT 2011
On 29.09.2011 0:43, Nick Sabalausky wrote:
> "Jonathan M Davis"<jmdavisProg at gmx.com> wrote in message
> news:mailman.261.1317239287.26225.digitalmars-d at puremagic.com...
>>
>> I would point out that there is an intention to eventually get a D lexer
>> and
>> parser into Phobos so that tools can take advantage of them. Those could
>> eventually lead to a frontend in D but would provide benefits far beyond
>> simply
>> having the compiler in D.
>>
>
> Is the interest more in a D-specific lexer/parser or a generalized one? Or
> is it more of a split vote? I seem to remember interest both ways, but I
> don't know whether there's any consensus among the DMD/Phobos crew.
>
> A generalized lexer is nothing more than a regex engine that has more than
> one distinct accept state (which then gets run over and over until EOF). And
> the FSM is made simply by doing a combined regex "(regexForToken1 |
> regexForToken2 | regexForToken3 | ... )", and then each of those parts just
> get their own accept state. Which makes me wonder...
>
> There was a GSoC project to overhaul Phobos's regex engine, wasn't there? Is
> that done? Is it designed in a way that the stuff above wouldn't be real
> hard to add?
That was me. And yes, that's exactly the kind of thing I'm pursuing
right now. A lexer with a compile-time generated FSM looks very
promising, e.g. for DSELs.
In fact, I think I already have a good idea of how to reuse the
existing code to get there.
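The combined-regex idea Nick describes above (one big alternation, one distinct accept state per token kind, run repeatedly until EOF) can be sketched in Python, with named groups standing in for the per-token accept states. This is a hypothetical illustration of the technique, not Phobos or std.regex code, and the token names are made up:

```python
import re

# One alternative per token kind; each named group plays the role of a
# distinct accept state in the combined FSM.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))

def tokenize(src):
    # Run the combined regex over and over until EOF; lastgroup tells us
    # which alternative (i.e. which accept state) fired.
    for m in MASTER.finditer(src):
        if m.lastgroup != "SKIP":
            yield (m.lastgroup, m.group())

print(list(tokenize("x = 42 + y")))
# -> [('IDENT', 'x'), ('OP', '='), ('NUMBER', '42'), ('OP', '+'), ('IDENT', 'y')]
```

A compile-time variant of the same idea would build the automaton from the token specs at CTFE time instead of at run time.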
>
> And what about the algorithm? Is it a Thompson NFA (i.e., it traverses the NFA
> as if it were a DFA, effectively "creating" the DFA on-the-fly)?
As for the implementation, there are two algorithms operating on the
same IR. The default one is a Thompson NFA, so yes, with some work I can
make it generate code for a DFA instead of executing the IR directly.
It's just that until now I've generated simpler structures from it, e.g.
a bit-parallel prefix searcher.
> Or does it
> just traverse the NFA as an NFA? Or does it create an actual DFA and
> traverse that? An actual DFA would probably be best for a lexer. If a DFA,
> is it an optimized DFA? In my (limited) tests, it didn't seem like
> DFA-optimization would yield a notable benefit on typical
> programming-language tokens. It seems to be more suited to pathological
> cases.
>
I'm also dubious whether DFA optimization techniques would be worth the
hassle. Not a top priority.
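For reference, "traversing the NFA as if it were a DFA" means tracking a *set* of NFA states at once: each input character maps the current set to a new set, so each set effectively is a DFA state materialized on the fly, without ever constructing the DFA up front. A minimal hand-built sketch of that state-set simulation (a hypothetical toy NFA for the pattern "ab*", nothing to do with std.regex's actual IR):

```python
# NFA encoded as plain dicts:
#   trans[(state, char)] -> set of target states (labeled transitions)
#   eps[state]           -> set of target states (epsilon transitions)
# Hand-built Thompson-style NFA for the regex "ab*":
#   0 --a--> 1, 1 --eps--> 2, 2 --b--> 3, 3 --eps--> 2; accepting state: 2
trans = {(0, "a"): {1}, (2, "b"): {3}}
eps = {1: {2}, 3: {2}}
START, ACCEPT = 0, {2}

def eps_closure(states):
    # All states reachable from `states` via epsilon transitions alone.
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)

def matches(s):
    # The current frozenset of NFA states plays the role of one DFA state.
    cur = eps_closure({START})
    for ch in s:
        cur = eps_closure({t for st in cur for t in trans.get((st, ch), ())})
    return bool(cur & ACCEPT)

print(matches("abb"), matches("b"))  # -> True False
```

Caching the `frozenset -> frozenset` transitions as they are discovered is exactly what turns this into a lazily built DFA; generating code for the full table instead is the "generate code for a DFA" step mentioned above.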
--
Dmitry Olshansky