D parsing
Martin Nowak
code at dawg.eu
Tue Nov 5 10:57:16 PST 2013
On 10/30/2013 11:39 PM, jgetner wrote:
> Is there a function in D for evaluating expressions before compile time?.
I want to throw two more ideas in the parser generator topic which
I carry around for quite some time. I think combined they'd enable AST
generation from clean and simple grammars. Maybe someone find this
interesting enough to give it a try, I hardly will ever get to this.
There is an interesting paper [MTP] that describes how a slight
modification of the ENBF grammar can be used to generate a usable AST
from the grammar definition. Each AST node is named like it's
production. This is possible because the modified grammar disallows to
mix alternates and sequences which forces one to name all productions.
The other paper is [GLL] parsing which is a new general parser
generation technique. The main advantage over GLR, it's a top-down
parser. This allows to modularize grammars, e.g. you can define a
grammar that reuses/imports an existing grammar for expressions.
The reason to choose generalized parsers instead of say LALR(1)
is maintainability of the grammars. The needed modifications to make
a context free grammar LR(1) or LALR(1)-compatible often make them
extremely hard to read and change.
Parse forests (many parse trees) are not a problem I think because most
CFG only have local ambiguities so they resolve to a single parse tree.
### A motivating example (desiderata)
grammar.d
```
import number;
Identifier = "[_a-zA-Z][a-zA-Z]*";
// This still needs some syntax to annotate precedence and associativity
MulExpr = Expr "*" Expr;
DivExpr = Expr "/" Expr;
AddExpr = Expr "+" Expr;
SubExpr = Expr "-" Expr;
PowExpr = Expr "^^" Expr;
Expr = MulExpr | DivExpr | AddExpr | SubExpr | PowExpr | Number | Identifer;
Arguments = Expr % ",";
ExprStmt = Expr ";";
CallStmt = Identifier "(" Arguments ")";
Stmt = ExprStmt | CallStmt;
Stmts = Stmt*;
```
```d
auto p = genParser(import("grammar.d"));
auto stmts = p.parseStmts(input);
autp expr = p.parseExpr(input);
foreach (arg; p.parseArguments(input))
{}
final switch (expr)
{
case MulExpr.tag:
case DivExpr.tag:
case AddExpr.tag:
case SubExpr.tag:
case PowExpr.tag:
}
static assert(typeof(MulExpr == Tuple!(Expr, "expr0", Expr, "expr1")));
static assert(typeof(Stmt == Variant!(ExprStmt, CallStmt)));
static assert(typeof(Identifier) == string));
```
[MTP]: http://babel.ls.fi.upm.es/research/mtp/
http://babel.ls.fi.upm.es/~angel/papers/finalcopy-2005prole.pdf
[GLL]: http://www.rhul.ac.uk/computerscience/research/csle/gllparsers.aspx
http://dotat.at/tmp/gll.pdf
for implementers - Modelling GLL Parser Implementations
(http://dx.doi.org/10.1007/978-3-642-19440-5_4)
More information about the Digitalmars-d
mailing list