Lexer and parser generators using CTFE

Nick Sabalausky a at a.a
Tue Feb 28 21:15:46 PST 2012


"H. S. Teoh" <hsteoh at quickfur.ath.cx> wrote in message 
news:mailman.215.1330472867.24984.digitalmars-d at puremagic.com...
> On Tue, Feb 28, 2012 at 10:03:42PM +0100, Martin Nowak wrote:
> [...]
>> I won't deny that the combination of CTFE text processing and static
>> introspection could improve on this. It could be made more feasible by
>> some conventions, e.g. parse result always uses structs or classes and
>> built-in arrays.
>
> Excellent idea, I like this.
>
>
>> class Module
>> {
>>     this(Declaration[]);
>> }
>>
>> class StructDeclaration : Declaration
>> {
>>     enum _enbf = "struct $1=Identifier { $2=Declaration* }";
>>
>>     this(Identifier, Declaration[]);
>> }
>>
>> ...
>>
>> Parser!(Module, StructDeclaration, ...) parser;
>> Module m = parser.parse(read("file.d"));
>
> I like this! Definitely an improvement over yacc syntax.
>

In Goldie, I've taken an inverted approach, which IMHO is easier to use: The 
types are automatically generated from the grammar, not the other way 
around. So applying that approach to the above code, it'd be more like this:

mixin genGrammar!("myGrammar", `
    Identifier = [a-zA-Z_][a-zA-Z_0-9]*
    Module = Declaration+
    Declaration = StructDeclaration
    StructDeclaration = 'struct' Identifier '{' Declaration* '}'
`);

Which generates these classes:

Parser!"myGrammar"
Symbol!("myGrammar.Identifier")
Symbol!("myGrammar.Module")
Symbol!("myGrammar.Declaration")
Symbol!("myGrammar.StructDeclaration")

and/or these:

Parser_myGrammar
Symbol_myGrammar!"Identifier"
Symbol_myGrammar!"Module"
Symbol_myGrammar!"Declaration"
Symbol_myGrammar!"StructDeclaration"

which could then be aliased by the user however they wanted:

alias Symbol_myGrammar MySym;

And there can still be hooks (delegates, subclassing, whatever) to add 
customized behavior/functionality.
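
To make the "types from the grammar" direction concrete, here's a rough sketch of the mechanism in plain D. This is not Goldie's actual code: ruleNames, trim, grammarText and this cut-down Symbol are hypothetical names made up for illustration, and the "parsing" only picks out the left-hand side of each rule. It just shows that a grammar string can be processed via CTFE and the results used to instantiate string-keyed types:

```d
import std.stdio : writeln;

// Hypothetical helper: strip leading/trailing blanks without library calls,
// so it is trivially usable at compile time.
string trim(string s)
{
    while (s.length && (s[0] == ' ' || s[0] == '\t' || s[0] == '\r'))
        s = s[1 .. $];
    while (s.length && (s[$ - 1] == ' ' || s[$ - 1] == '\t' || s[$ - 1] == '\r'))
        s = s[0 .. $ - 1];
    return s;
}

// Hypothetical helper: collect the name left of the first '=' on each line.
// It is an ordinary function, so the compiler can run it via CTFE.
string[] ruleNames(string grammar)
{
    string[] names;
    size_t lineStart = 0;
    foreach (i; 0 .. grammar.length + 1)
    {
        if (i == grammar.length || grammar[i] == '\n')
        {
            string line = grammar[lineStart .. i];
            lineStart = i + 1;
            foreach (j, c; line)
            {
                if (c == '=')
                {
                    names ~= trim(line[0 .. j]);
                    break;
                }
            }
        }
    }
    return names;
}

// Hypothetical stand-in for a generated symbol type, keyed by a string.
class Symbol(string name)
{
    enum symbolName = name;
}

enum grammarText = `
    Identifier = [a-zA-Z_][a-zA-Z_0-9]*
    Module = Declaration+
    Declaration = StructDeclaration
    StructDeclaration = 'struct' Identifier '{' Declaration* '}'
`;

// Initializing an enum forces ruleNames to be evaluated at compile time.
enum names = ruleNames(grammarText);
static assert(names == ["Identifier", "Module", "Declaration", "StructDeclaration"]);

// The compile-time result can then name a type, which the user may alias.
alias Symbol!("myGrammar." ~ names[0]) MySym;
static assert(MySym.symbolName == "myGrammar.Identifier");

void main()
{
    writeln(names);
}
```

A real generator would of course build the full parse tables and one type per symbol (e.g. from a mixin template), but the compile-time string processing step would look much the same.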



