Lexer and parser generators using CTFE

Wed Feb 29 11:35:08 PST 2012

On 29.02.2012 20:47, Andrei Alexandrescu wrote:
> On 2/29/12 3:48 AM, Dmitry Olshansky wrote:
>> Someway of stacking grammars is bonus points for the project. As Andrei
>> started with lexer+parser, I see no problem with it being
>> lexer+parser(+parser)*.
>
> What does stacking grammars mean?
>

In general it should be read as using one parser as lexer for another 
parser. But there is some nice ways to integrate another parser within 
recursive descent parser.
Let's assume simple recursive descent parser that uses operator 
precedence grammar. Suppose we have the main grammar:
Declaration = Type Identifier;
Statement = 'if' '(' Expr ')' | Expr
Type = int | float | ...

Where Expr is connected to the second grammar expressed as ...
Ok, let's sketch it, skipping semantic actions:
operatorGrammar!q{
Expr
Unit: Constant | Identifier //assume we have lexer
Operators:
	Not, !, prefix, 4
	UnMinus, -, prefix, 4
	Mul, *, binary, 3
	Mul, /, binary, 3
	Plus, +, binary, 2
	BiMinus, -, binary, 2
	Index, [], brace-like postfix
	Braces, (), brace-like
};

Effectively operator grammar for expressions parses "token" Expr. It's 
easy in recursive descent because it can just try to call this parser, 
then try the other alternative if it fails.
I'm not sure this stacking does add that much, but it's something to 
keep in mind.

> Anyhow, one thing that would become important is tree walkers - e.g. you
> have the tree and you define code for evaluating it etc.
>

Yes, it's a must have if AST is generated generically via annotations.

>> Such a framework relying on mixins is bound to produce awful error
>> messages at compile-time, unless explicitly stuffed up to watterline
>> with some kind of static assert(__traits(compiles,...),"Error x + info");
>
> We need to figure out ways to make that simpler.
>
>
> Andrei

-- 
Dmitry Olshansky