Lexer and parser generators using CTFE

Tue Feb 28 13:03:42 PST 2012

On Tue, 28 Feb 2012 21:02:24 +0100, H. S. Teoh <hsteoh at quickfur.ath.cx>  
wrote:

> On Tue, Feb 28, 2012 at 07:46:04PM +0100, Martin Nowak wrote:
> [...]
>> I wrote a generic lexer generator some time ago.
>> It already led to some compiler O(N^2) optimizations, because the token
>> declarations sneak into the mangling :(.
>> I also finally added a workaround for a remaining CTFE bug (#6815).
>>
>> https://gist.github.com/1255439 - lexer generator
>> https://gist.github.com/1262321 - complete and fast D lexer
>
> Cool! I'll have to take a look at this sometime.
>
>
> [...]
>> <PERSONAL OPINION
>> The hassle of providing good error messages and synthesizing parse results
>> in a generic parser outweighs the benefit of a declarative grammar.
>> /PERSONAL OPINION>
>
> But things like lex/yacc have been useful throughout the years. With D's
> delegates, lexer/parser action rules should be even cleaner, no?
>
Yacc does work, but at the price of an additional build step and total
obfuscation of the automaton.
And even at that price, the results are still hard-to-maintain Klingon
sources.
http://supercollider.git.sourceforge.net/git/gitweb.cgi?p=supercollider/supercollider;a=blob;f=lang/LangSource/Bison/lang11d

I won't deny that the combination of CTFE text processing and static
introspection could improve on this. It could be made more feasible by
adopting some conventions, e.g. that parse results always use structs or
classes and built-in arrays.

----

class Module
{
     this(Declaration[]);
}

class StructDeclaration : Declaration
{
     enum _enbf = "struct $1=Identifier { $2=Declaration* }";

     this(Identifier, Declaration[]);
}

...

Parser!(Module, StructDeclaration, ...) parser;
Module m = parser.parse(read("file.d"));
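
----

To make the introspection half of the idea a bit more concrete, here is a
minimal sketch, not a working parser generator. It only checks at compile
time that the "$n=" placeholders in a node's grammar string line up with
that node's constructor parameters. The helper countPlaceholders, the
CtorArgs alias and the field layout of the classes are assumptions made
for illustration; only the _enbf string and the constructor signature are
taken from the sketch above.

```d
import std.traits : Parameters;

// Hypothetical minimal AST node types, just enough to compile.
class Identifier
{
    string name;
    this(string name) { this.name = name; }
}

class Declaration {}

class StructDeclaration : Declaration
{
    // Grammar string as spelled in the sketch above.
    enum _enbf = "struct $1=Identifier { $2=Declaration* }";

    Identifier id;
    Declaration[] members;

    this(Identifier id, Declaration[] members)
    {
        this.id = id;
        this.members = members;
    }
}

// Hypothetical helper: counts "$<digit>" placeholders in a rule.
// It is an ordinary function, so it can also be evaluated via CTFE.
size_t countPlaceholders(string rule)
{
    size_t n;
    foreach (i, c; rule)
    {
        if (c == '$' && i + 1 < rule.length
            && rule[i + 1] >= '0' && rule[i + 1] <= '9')
            ++n;
    }
    return n;
}

// Static introspection: the constructor's parameter types as a tuple.
alias CtorArgs = Parameters!(StructDeclaration.__ctor);

// Compile-time checks tying the grammar string to the constructor.
static assert(countPlaceholders(StructDeclaration._enbf) == CtorArgs.length);
static assert(is(CtorArgs[0] == Identifier));
static assert(is(CtorArgs[1] == Declaration[]));

void main()
{
    // The node can be built by hand the same way a generated parser
    // would presumably build it from its parse results.
    auto s = new StructDeclaration(new Identifier("S"), []);
    assert(s.id.name == "S" && s.members.length == 0);
}
```

A real generator would additionally have to parse the rule text itself in
CTFE and map each "$n=Type" to the matching sub-parser. The point here is
only that the convention "parse results are constructor arguments" is
something the compiler can verify.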