Let's stop parser Hell
Dmitry Olshansky
dmitry.olsh at gmail.com
Sat Jul 7 14:08:41 PDT 2012
On 08-Jul-12 00:50, Roman D. Boiko wrote:
> On Saturday, 7 July 2012 at 20:29:26 UTC, Dmitry Olshansky wrote:
>> And given the backtracking nature of PEGs you'll do your distributed
>> thing many times over or ... spend a lot of RAM to remember not to
>> redo it. I recall lexing takes even more than parsing itself.
>
> I think that your conclusions are about statistical evidence of PEG
> misuse and poor PEG parser implementations. My point was that there is
> nothing fundamentally worse about having the lexer integrated with the
> parser, and there are performance advantages to having fewer possible
> cases to check when structural information is available (so that
> lexSmth could be called only when Smth is expected, thus requiring
> fewer switch branches if a switch is used).
You may have misunderstood me as well; the point is still the same:
there are two separate things, notation and implementation. The fact
that the lexer is integrated into the notation, as in PEGs, is
unrelated to the fact that PEGs in their classic definition never use
the term "token" and do backtracking parsing essentially at the
character level.
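To make the character-level point concrete, here is a minimal sketch (in Python, with hypothetical rule names of my own) of what PEG ordered choice does without any token layer: alternatives sharing a prefix force a re-scan from the same position, and packrat memoization is exactly the "spend a lot of RAM to remember not to redo it" trade-off:

```python
calls = 0  # counts how many times a literal rule is actually evaluated

def match_literal(text, pos, lit):
    """Try to match `lit` at `pos`; return the new pos, or None (backtrack)."""
    global calls
    calls += 1
    end = pos + len(lit)
    return end if text[pos:end] == lit else None

def choice(text, pos, alternatives):
    """PEG ordered choice: try alternatives in order, rewinding to `pos`
    on each failure -- backtracking at the character level."""
    for alt in alternatives:
        result = alt(text, pos)
        if result is not None:
            return result
    return None

memo = {}

def memoized(rule):
    """Packrat memoization: cache (rule, pos) -> result, so each rule is
    evaluated at most once per input position, at the cost of RAM."""
    def wrapper(text, pos):
        key = (id(rule), pos)
        if key not in memo:
            memo[key] = rule(text, pos)
        return memo[key]
    return wrapper

# Two alternatives sharing the prefix "aaa": matching "aaac" first scans
# and rejects 'aaab', then rescans the same characters for 'aaac'.
result = choice("aaac", 0, [lambda t, p: match_literal(t, p, "aaab"),
                            lambda t, p: match_literal(t, p, "aaac")])

# With memoization, repeating a failed rule at the same position does
# not re-scan the input; the cached failure is returned instead.
rule = memoized(lambda t, p: match_literal(t, p, "aaab"))
rule("aaac", 0)
rule("aaac", 0)  # hits the memo table; `calls` does not increase
```

This is only a toy recognizer, not how any particular PEG library is implemented, but it shows why "pure" PEG either redoes work on backtracking or pays memory to avoid it.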
> As for lexing multiple times, simply use a free list of terminals (aka
> tokens). I still assume that the grammar is properly defined, so that
> there is only one way to split the source into tokens.
>
Tokens... there is no such term in use if we are talking about 'pure' PEG.
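For what it's worth, the caching Roman describes can be sketched like this (names and token set are mine, purely illustrative): each source position is lexed at most once, and a backtracking parser re-uses the cached token instead of re-lexing characters:

```python
import re

# Hypothetical token set for illustration: numbers, identifiers, operators.
TOKEN_RE = re.compile(r"\s*(?:(?P<num>\d+)|(?P<id>[A-Za-z_]\w*)|(?P<op>[+*()]))")

token_cache = {}  # pos -> (kind, text, next_pos)
lex_count = 0     # counts how many times we actually ran the lexer

def token_at(src, pos):
    """Return the token starting at `pos`, lexing it only on first request.
    Since there is only one way to split the source into tokens, the
    cached result is valid no matter which rule asks for it."""
    global lex_count
    if pos in token_cache:
        return token_cache[pos]
    lex_count += 1
    m = TOKEN_RE.match(src, pos)
    if not m:
        return None
    kind = m.lastgroup
    tok = (kind, m.group(kind), m.end())
    token_cache[pos] = tok
    return tok

src = "12 + x"
first = token_at(src, 0)   # lexes "12"
again = token_at(src, 0)   # backtracking parser asks again: cache hit
second = token_at(src, first[2])  # lexes "+" starting after "12"
```

Of course, once you cache tokens per position you have introduced a token layer, which is exactly the departure from "pure" character-level PEG being discussed.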
--
Dmitry Olshansky
More information about the Digitalmars-d
mailing list