D parsing

Timothee Cour thelastmammoth at gmail.com
Sun Nov 3 10:08:24 PST 2013


On Sun, Nov 3, 2013 at 1:13 AM, Philippe Sigaud
<philippe.sigaud at gmail.com>wrote:

>
> On Sun, Nov 3, 2013 at 2:45 AM, Timothee Cour <thelastmammoth at gmail.com>wrote:
>
>> 1)
>> The main issue I see with pegged is PEG grammars don't support left
>> recursion, so for example will fail on foo[1].bar(2).fun().
>> Unless there's a plan to accomodate those, I sense a dead end.
>> One can eliminate left recursion but this has issues.
>>
>
> Indeed. Eliminating left-recursion can be done, but it might disrupt the
> parse tree.
> There is no discarding the fact that PEG are intrinsically made to go with
> top-down recursive parsers.
>
>
>>
>> 2)
>> There is some material on extending PEG to support those, eg "Left
>> Recursion in Parsing Expression Grammars", or code
>> https://github.com/orlandohill/peg-left-recursion but I don't know how
>> well they work in practice.
>>
>
> I'll have a look, thanks. The idea seems similar to what I wanted to do,
> with a growing seed.
>

Great, please let us know how that works.


>
>
>
>>
>> 3)
>> Finally, a parser should be as fast as possible; I'm not sure how well
>> pegged performs compared to dmd frontend.
>>
>
>
> Oh, for D it works (it's even the biggest grammar I know), but it's too
> slow to be of practical interest. But that just means the generated parser
> is not top-notch, which is reasonable: I'm not a parsing pro, just a guy
> doing this during his free time :)
>
>
>
>> Other promising options are using lemon, a LALR(1) parser generator.
>>
>
> My current plan is to write different engines, and letting either the user
> select them at compile-time, or to have the parser decide which one to use,
> depending on the grammar. I'm pretty sure the 'Type 3' parts of a grammar
> (regular expressions) could be bone by using std.regex, for example.
>

even lexing can't be done with regex, eg nesting comments : /+ ... +/
Also, although it may seem cleaner at first to combine lexing and parsing
in 1 big grammar (as done in pegged), it usually is faster do feed a
(separate) lexer output into parser.


>
> I guess I'll have to write a CT-compatible LALR(1) engine...
>
>
>
>
>>  Growler:
>>>>
>>>> I'm using it for parsing C and Markup in two different projects.
>>>> I've also just started working on an Octave (well ok, Matlab) M-file
>>>> parser. It will probably never see the light of day but with
>>>> Pegged is great fun to play with.
>>>>
>>>> Please don't lose interest in it !!
>>>>
>>>
> OK guys, I'm hearing you. Thanks for the nice words growler! I tried to
> have Pegged simple to use, compared to other generator I know and I'm
> pleased to see it seems to work. If you have new grammars, you can send
> them to me, I'll put them in the examples.
>
>
>
>>
>>> Yes. Probably the most significant D project at this time.
>>>
>>> Andrei
>>>
>>
>>
> That's nothing compared to vide.d! But I guess it indeed demonstrates what
> can be done with the generative capabilities of D.
> Thanks for the kind words.
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.puremagic.com/pipermail/digitalmars-d/attachments/20131103/1f996bb5/attachment-0001.html>


More information about the Digitalmars-d mailing list