DDMD and such.

Wed Sep 28 23:28:49 PDT 2011

On 2011-09-28 22:59, Jonathan M Davis wrote:
> On Wednesday, September 28, 2011 13:43 Nick Sabalausky wrote:
>> "Jonathan M Davis"<jmdavisProg at gmx.com>  wrote in message
>> news:mailman.261.1317239287.26225.digitalmars-d at puremagic.com...
>>
>>> I would point out that there is an intention to eventually get a D lexer
>>> and parser into Phobos so that tools can take advantage of them. Those
>>> could eventually lead to a frontend in D but would provide benefits far
>>> beyond simply having the compiler in D.
>>
>> Is the interest more in a D-specific lexer/parser or a generalized one? Or
>> is it more of a split vote? I seem to remember interest both ways, but I
>> don't know whether there's any consensus among the DMD/Phobos crew.
>>
>> A generalized lexer is nothing more than a regex engine that has more than
>> one distinct accept state (which then gets run over and over until EOF).
>> And the FSM is made simply by doing a combined regex "(regexForToken1 |
>> regexForToken2 | regexForToken3 | ... )", and then each of those parts
>> just gets its own accept state. Which makes me wonder...
>>
>> There was a GSoC project to overhaul Phobos's regex engine, wasn't there?
>> Is that done? Is it designed in a way that the stuff above wouldn't be
>> real hard to add?
>>
>> And what about the algorithm? Is it a Thompson NFA (i.e., it traverses the
>> NFA as if it were a DFA, effectively "creating" the DFA on-the-fly)? Or
>> does it just traverse the NFA as an NFA? Or does it create an actual DFA
>> and traverse that? An actual DFA would probably be best for a lexer. If a
>> DFA, is it an optimized DFA? In my (limited) tests, it didn't seem like
>> DFA-optimization would yield a notable benefit on typical
>> programming-language tokens. It seems to be more suited to pathological
>> cases.
>
> There is some desire to have a lexer and parser in Phobos which basically have
> the same implementation as dmd (only in D instead of C++). That way, they're
> very close to the actual compiler, and it's easy to port fixes and
> improvements between the two.
>
> However, we definitely also want a more general lexer/parser generator which
> takes advantage of D's metaprogramming capabilities. Andrei was pushing more
> for that and doesn't really like the idea of the other, since it would reduce
> the desire to produce the more general solution. So, there _is_ some
> dissension on the matter. But there's definitely room for both. It's just a
> question of time and manpower.
>
> - Jonathan M Davis

I would rather have a D-specific lexer/parser than a general
lexer/parser generator.
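
For reference, the generalized lexer Nick describes above (one combined
regex, with each alternative acting as its own accept state, run repeatedly
until EOF) can be sketched roughly as follows. This is only an illustration
in Python, not Phobos or dmd code: the token names and patterns are made up,
and Python's re module is a backtracking engine that picks the first matching
alternative rather than the longest match, so it only approximates the
multi-accept-state DFA under discussion.

```python
import re

# Hypothetical token set, for illustration only (not taken from dmd or Phobos).
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]

# One combined pattern "(tok1 | tok2 | ...)". Each alternative is a named
# group, which stands in for a distinct accept state.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Run the combined pattern over and over until the end of the input."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(
                f"unexpected character {text[pos]!r} at offset {pos}"
            )
        # m.lastgroup names the alternative that matched, i.e. which
        # "accept state" was reached.
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Calling tokenize("x = 3.14 * (y + 2)") yields (kind, text) pairs such as
("IDENT", "x") and ("NUMBER", "3.14"). Because alternatives are tried in
order, a real lexer built this way would need keywords and longer operators
listed before their prefixes.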

-- 
/Jacob Carlborg