D Parsing (again)/ D grammar

Philippe Sigaud via Digitalmars-d digitalmars-d at puremagic.com
Thu Oct 2 11:20:27 PDT 2014


> Chances are that I will be able to get the original GLL parser generator
> from one of algorithm authors (Adrian Johnstone). He's really helpful here.
> From that point, I will only have to add a frontend for generating a
> concrete parser, starting with Python - it already has a fully working
> grammar. Hopefully, I will also be able to find a suitable grammar for D, it
> is always a pleasure to play with the language.
>
> Otherwise, I will continue my current effort - implementing the GLL parser
> generator in D.

I did that during this summer, almost to the point it was
self-sustained (that is, the GLL parser generator was used to generate
a parser for grammars files). I chose a range interface: input is a
string, the parse forest is output as a range of parse trees.

I loved to see it generate the different possible parse trees on
ambiguous grammars, and accepting left- and right-recursing grammars!
GLL is a very noce algorithm.

Halas, even with some shortcuts on Kleene stars it was quite slow. I
tried to use threads (spawning worker threads on alternatives), but
that did not change the timings much.

I could make it generate a parser for JSON and compared it to the
jsonx module that Sönke presented here. Bah, it was 1000 times slower
(which is all relative: it still takes only a few ms to parse a big
JSON file. But still, far away from the microseconds it should need).
Pegged was faster that this GLL generator, but of course still far
slower than jsonx.

[Of course, Pegged can cheat: it can parse your file at compile-time,
resulting in 0-µs parsing time at runtime :-)]

Also, the way I coded it I hit CTFE limits and could not have it parse
at compile-time. A shame, really.

> This is one of the reasons I prefer LL/RD parsers :-) They are easy to
> follow.

I would like to try a LALR compile-time generator and compile-time
parser. I'm pretty sure LALR tables could be expanded directly as D
code instead of being read during parsing.

Philippe



More information about the Digitalmars-d mailing list