Lexer and parser generators using CTFE

Hisayuki Mima youxkei at gmail.com
Wed Feb 29 08:03:41 PST 2012


(2012/02/28 23:58), Hisayuki Mima wrote:
> (2012/02/28 16:59), Andrei Alexandrescu wrote:
>> I'm starting a new thread on this because I think the matter is of
>> strategic importance.
>>
>> We all felt for a long time that there's a lot of potential in CTFE, and
>> potential applications have been discussed more than a few times,
>> ranging from formatting strings parsed to DSLs and parser generators.
>>
>> Such feats are now approaching fruition because a number of factors
>> converge:
>>
>> * Dmitry Olshansky's regex library (now in Phobos) generates efficient D
>> code straight from regexen.
>>
>> * The scope and quality of CTFE has improved enormously, making more
>> advanced uses possible and even relatively easy (thanks Don!)
>>
>> * Hisayuki Mima implemented a parser generator in only 3000 lines of
>> code (sadly, no comments or documentation yet :o))
>>
>> * With the occasion of that announcement we also find out Philippe
>> Sigaud has already a competing design and implementation of a parser
>> generator.
>>
>> This is the kind of stuff I've had an eye on for the longest time. I'm
>> saying it's of strategic importance because CTFE technology, though not
>> new and already available with some languages, has unique powers when
>> combined with other features of D. With CTFE we get to do things that
>> are quite literally impossible to do in other languages.
>>
>> We need to have a easy-to-use, complete, seamless, and efficient
>> lexer-parser generator combo in Phobos, pronto. The lexer itself could
>> use a character-level PEG or a classic automaton, and emit tokens for
>> consumption by a parser generator. The two should work in perfect tandem
>> (no need for glue code). At the end of the day, defining a complete
>> lexer+parser combo for a language should be just a few lines longer than
>> the textual representation of the grammar itself.
>>
>> What do you all think? Let's get this project off the ground!
>>
>>
>> Thanks,
>>
>> Andrei
>
> Hello Andrei,
>
> Certainly, I don't write yet the documentation of my library, ctpg.
> (But a few examples available here:
> https://github.com/youkei/ctpg/tree/master/example)
> So, I'd like to describe it here.
>
> To be honest, ctpg is inspired by one module of Scala's standard
> library, Parser Combinators.
> One of the differences between Parser Combinators and ctpg is that
> Parser Combinators generates parsers in run-time, but ctpg generates
> parsers in compile-time by the power of CTFE and mixin.
>
> A parser generated by ctpg is a recursive descent parser, so it does
> lexical analysis and parsing at a time. And the type of input which it
> can accept is string, wstring, dstring and ForwardRange whose element
> type is char, wchar or dchar.
> So, dividing lexical analysis and parsing into two processes is
> difficult in ctpg.
>
> Importantly, a parser generated by ctpg is statically typed as one of
> the examples, "parsing simple arithmetic expression" shows.
> Generally a parser generated by other tool and accepting tokens returns
> the abstract syntax tree, but it return the evaluated value in the example.
> In other words, it does lexical analysis, parsing and (type) converting
> at a time.
> If you want simply abstract syntax tree, it may be a little pain to use
> ctpg.
>
> That's all I'd like to say about ctpg here.
>
> By the way, I wholeheartedly agree with Andrei's opinion. But currently,
> I think, CTFE is limited because of this issue:
> http://d.puremagic.com/issues/show_bug.cgi?id=6498 .
> Without dealing with this issue, We couldn't bring out the full
> potential of CTFE.
>
> Thanks,
> Hisayuki Mima

A minimum of documentation is now available here:
https://github.com/youkei/ctpg/wiki/Home-en

Hisayuki Mima


More information about the Digitalmars-d mailing list