Let's stop parser Hell

Jonathan M Davis jmdavisProg at gmx.com
Sun Jul 8 14:03:15 PDT 2012


On Sunday, July 08, 2012 22:15:26 Roman D. Boiko wrote:
> On Sunday, 8 July 2012 at 20:06:07 UTC, Jonathan M Davis wrote:
> > most of the focus right now from people interested in parsing
> > seems to be on pegged and parser generators (which are very
> > cool and in some ways much more interesting, but I seriously
> > question that that's the performant way to go if you're looking
> > to parse D specifically).
> 
> Can you provide a *specific* example of performance optimization
> which a custom D compiler would have, but parser generator would
> be unlikely to catch up.

It's been too long since I was actively working on parsers to give any 
details, but it is my understanding that because a hand-written parser is 
optimized for a specific grammar, it's going to be faster. Also, look at dmd 
and dmc vs other compilers. They use hand-written parsers and are generally 
much faster than their competitors.

One thing to remember about hand-written parsers vs generative ones though is 
that they usually are completely different in terms of the type of parser that 
you write (e.g. hand-written parsers are generally recursive-descent parsers, 
whereas generated ones are usually bottom-up parsers). So, that could have a 
large impact on performance as well (in either direction).
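
For anyone who hasn't written one, here is a minimal sketch of what 
"recursive-descent" means, in Python purely for brevity. The toy arithmetic 
grammar and all of the names (tokenize, Parser, evaluate) are made up for 
illustration; this is not D's grammar and says nothing about how dmd does it. 
The point is only that each grammar rule becomes one function, which is what 
makes this style easy to tune by hand for a specific grammar.

```python
import re

# Hypothetical toy grammar (NOT D's grammar), one method per rule:
#   expr   := term (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')'

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # Look at the current token without consuming it.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        # Consume and return the current token (None at end of input).
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            rhs = self.factor()
            # Integer division keeps the toy evaluator in whole numbers.
            value = value * rhs if op == "*" else value // rhs
        return value

    def factor(self):
        tok = self.advance()
        if tok == "(":
            value = self.expr()  # recursion mirrors the nesting in the grammar
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is not None and tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")


def evaluate(text):
    """Parse and evaluate an expression, rejecting trailing input."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input {parser.peek()!r}")
    return value
```

A bottom-up parser from a generator would instead be driven by generated 
tables and an explicit stack, so the two styles end up with quite different 
code even for the same grammar.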

To be clear though, I have _no_ problem with having a generative parser in 
Phobos (or having other 3rd party options available). Parsers like pegged are 
extremely cool and extremely useful. It's just that it's my understanding that 
well-written hand-written parsers are faster than generated ones, so I think 
that it would be beneficial to have a hand-written parser for D in Phobos _in 
addition_ to a general, generative solution.

But to fully prove that a hand-written one would be faster, we'd of course 
have to have actual solutions to compare. And if the API for a D-specific 
parser in Phobos is designed well enough, and it somehow proved that a 
generative solution was faster, then the hand-written one could be replaced by 
the generative one underneath the hood.

- Jonathan M Davis

