Lexer and parser generators using CTFE

Ken kennethuil at gmail.com
Fri Jun 1 08:39:12 PDT 2012


On Tuesday, 28 February 2012 at 07:59:16 UTC, Andrei Alexandrescu 
wrote:
> I'm starting a new thread on this because I think the matter is 
> of strategic importance.
>
> We all felt for a long time that there's a lot of potential in 
> CTFE, and potential applications have been discussed more than 
> a few times, ranging from formatting strings parsed to DSLs and 
> parser generators.
>
> Such feats are now approaching fruition because a number of 
> factors converge:
>
> * Dmitry Olshansky's regex library (now in Phobos) generates 
> efficient D code straight from regexen.
>
> * The scope and quality of CTFE has improved enormously, making 
> more advanced uses possible and even relatively easy (thanks 
> Don!)
>
> * Hisayuki Mima implemented a parser generator in only 3000 
> lines of code (sadly, no comments or documentation yet :o))
>
> * On the occasion of that announcement we also found out that 
> Philippe Sigaud already has a competing design and 
> implementation of a parser generator.
>
> This is the kind of stuff I've had an eye on for the longest 
> time. I'm saying it's of strategic importance because CTFE 
> technology, though not new and already available with some 
> languages, has unique powers when combined with other features 
> of D. With CTFE we get to do things that are quite literally 
> impossible to do in other languages.
>
> We need to have an easy-to-use, complete, seamless, and 
> efficient lexer-parser generator combo in Phobos, pronto. The 
> lexer itself could use a character-level PEG or a classic 
> automaton, and emit tokens for consumption by a parser 
> generator. The two should work in perfect tandem (no need for 
> glue code). At the end of the day, defining a complete 
> lexer+parser combo for a language should be just a few lines 
> longer than the textual representation of the grammar itself.
>
> What do you all think? Let's get this project off the ground!
>
>
> Thanks,
>
> Andrei

Great! So what's the next step? Do we wait for the maintainers 
of one of the CTFE parser generator packages to submit it to the 
Phobos Review Queue, or do we write a new implementation for 
Phobos?

We could attack this in pieces. Start with a lexer/FSA generator 
(like Ragel, but using CTFE). That would be quite useful on its 
own: for starters, it would make it much easier to consume many 
wire protocols (I found it very easy to make a simple HTTP client 
using Ragel). Then extend it into a lexer for a parser generator.
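
To make that first step a bit more concrete, here is a minimal 
sketch of a table-driven automaton whose transition table is built 
by CTFE. It is not taken from any of the packages mentioned above. 
The names State, buildTable and classify are made up for 
illustration, and the table is filled in by hand here, whereas a 
real generator would derive it from a Ragel-style specification.

```d
// Hypothetical states for a tiny token classifier (illustration only).
enum State : ubyte { start, ident, number, error }

// Builds one 256-entry row per non-error state. Because the result
// initializes a module-level immutable, the compiler runs this via CTFE.
State[256][3] buildTable()
{
    State[256][3] t;

    // Default: every transition leads to the error state.
    foreach (s; 0 .. 3)
        foreach (c; 0 .. 256)
            t[s][c] = State.error;

    foreach (c; 0 .. 256)
    {
        immutable ch = cast(char) c;
        immutable isAlpha = (ch >= 'a' && ch <= 'z')
                         || (ch >= 'A' && ch <= 'Z') || ch == '_';
        immutable isDigit = ch >= '0' && ch <= '9';

        if (isAlpha)
        {
            t[State.start][c] = State.ident;
            t[State.ident][c] = State.ident;
        }
        if (isDigit)
        {
            t[State.start][c]  = State.number;
            t[State.ident][c]  = State.ident;
            t[State.number][c] = State.number;
        }
    }
    return t;
}

// The table is computed at compile time and stored in the binary.
immutable State[256][3] table = buildTable();

// Runs the automaton over the input and returns the final state.
State classify(string input)
{
    State s = State.start;
    foreach (char c; input)
    {
        if (s == State.error)
            break;              // no row exists for the error state
        s = table[s][c];
    }
    return s;
}

// The same function also works at compile time.
static assert(classify("foo42") == State.ident);
static assert(classify("123")   == State.number);
static assert(classify("12ab")  == State.error);

void main()
{
    import std.stdio : writeln;
    writeln(classify("foo42"));
}
```

The point of the sketch is only that the table construction costs 
nothing at run time, and that the same classify function can be 
checked with static assert. Emitting tokens, handling longest-match 
rules and so on are left out.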

