Let's stop parser Hell

Dmitry Olshansky dmitry.olsh at gmail.com
Sat Jul 7 14:35:47 PDT 2012


On 08-Jul-12 01:24, Roman D. Boiko wrote:
> On Saturday, 7 July 2012 at 21:08:43 UTC, Dmitry Olshansky wrote:
>> You may misunderstood me as well, the point is still the same:
>> there are 2 things - notation and implementation, the fact that lexer
>> is integrated in notation like in PEGs is not related to the fact that
>> PEGs in their classic definition never use term Token and do
>> backtracking parsing essentially on character level.
>
> But PEG itself doesn't require backtracking parsing, does it?

It does. Specifically the algorithm is to try option A, try option B, 
try option C...

it's obvious how backtracking comes in ... whan say A fails somewhere in 
the middle of it. Of course there is a fair amount of bookkeeping that 
reduces doing work all over again but it does ... allocate.

This could be avoided via disambiguation a-la LL(k) but that won't be 
true PEG anymore ;)

So that's
> an implementation detail, or a specific algorithm.
No

Lexer separation
> seems to be independent of this.
>
Yes

>>> As for lexing multiple times, simply use a free list of terminals
>>> (aka tokens). I still assume that grammar is properly defined, so
>>> that there is only one way to split source into tokens.
>>>
>>
>> Tokens.. there is no such term in use if we talk about 'pure' PEG.
>
> Terminal symbols.
>
Characters.

-- 
Dmitry Olshansky




More information about the Digitalmars-d mailing list