Lexer and parser generators using CTFE

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Wed Feb 29 08:41:22 PST 2012


On 2/28/12 7:16 PM, Christopher Bergqvist wrote:
> I agree that the current direction of D in this area is impressive.
> However, I fail to see a killer-feature in generating a lexer-parser
> generator at compile-time instead of run-time.
>
> A run-time generator would benefit from not having to execute within the
> limited CTFE environment and would always be on-par in that respect. A
> compile time generator would internalize the generation and compilation
> of the result (with possible glue-code), simplifying the build process
> somewhat.
>
> What am I failing to pick up on?

Barrier to entry and granularity of approach, I think.

Currently if one wants to parse some simple grammar, there are options 
such as (a) do it by hand, (b) use boost::spirit, or (c) use lex/yacc.

Parsing by hand has the obvious disadvantages. Using boost::spirit 
involves a steep learning curve, tends to produce very contorted 
grammar representations full of notational noise, and scales very 
poorly. Using lex/yacc is ham-fisted: there's an additional build step, 
generated files to deal with, and the related logistics, which make 
lex/yacc a viable choice only for "big" grammars.

An efficient, integrated parser generator would lower the barrier to 
entry dramatically. If we play our cards right, even a sprintf 
specifier string could be parsed more simply and faster using an 
embedded grammar instead of painfully writing the recognizer by hand. 
Parsing config files, XML, JSON, CSV, various custom file formats, and 
many others would all be a few lines away. Ideally, a user who has a 
basic understanding of grammars should have an easier time using a 
small grammar to parse simple custom formats than writing the parsing 
code by hand.
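
To give a rough sense of what "writing the recognizer by hand" costs 
for even a sprintf-style specifier, here is a minimal sketch. It is 
written in C++17, with constexpr standing in for compile-time 
evaluation; it is not D and not CTFE. The names Spec, is_digit and 
parse_spec are hypothetical, and the grammar in the comment is a 
simplified subset assumed for illustration, not the full printf 
specification.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical result type for one printf-style conversion specification.
struct Spec {
    bool ok = false;          // true if the input matched the grammar
    bool left_align = false;  // '-' flag seen
    bool zero_pad = false;    // '0' flag seen
    int width = 0;            // 0 means "no width given"
    int precision = -1;       // -1 means "no precision given"
    char conv = '\0';         // conversion character, e.g. 'd' or 'f'
    std::size_t length = 0;   // number of characters consumed
};

constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }

// Assumed, simplified grammar (a subset of real printf):
//   spec := '%' flag* digits? ('.' digits?)? conv
//   flag := '-' | '0'
//   conv := 'd' | 'u' | 'x' | 'f' | 's' | 'c'
constexpr Spec parse_spec(std::string_view s) {
    Spec r{};
    std::size_t i = 0;

    // Leading '%'.
    if (i >= s.size() || s[i] != '%') return r;
    ++i;

    // Flags.
    while (i < s.size() && (s[i] == '-' || s[i] == '0')) {
        if (s[i] == '-') r.left_align = true;
        else r.zero_pad = true;
        ++i;
    }

    // Optional width.
    while (i < s.size() && is_digit(s[i])) {
        r.width = r.width * 10 + (s[i] - '0');
        ++i;
    }

    // Optional precision; a bare '.' means precision 0.
    if (i < s.size() && s[i] == '.') {
        ++i;
        r.precision = 0;
        while (i < s.size() && is_digit(s[i])) {
            r.precision = r.precision * 10 + (s[i] - '0');
            ++i;
        }
    }

    // Mandatory conversion character.
    if (i >= s.size()) return r;
    switch (s[i]) {
        case 'd': case 'u': case 'x': case 'f': case 's': case 'c':
            r.conv = s[i];
            break;
        default:
            return r;
    }
    ++i;

    r.ok = true;
    r.length = i;
    return r;
}

// Evaluated by the compiler, so a malformed specifier is a build error.
static_assert(parse_spec("%-08.3f").ok);
static_assert(!parse_spec("%q").ok);
```

The grammar itself is three lines in the comment, while the recognizer 
that implements it takes roughly forty. An embedded generator would 
aim to let the user write only the former.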


Andrei

