Enhanced LALR(1) Parser Generator for D

Tue Apr 9 19:28:34 PDT 2013

As an exercise to help me learn D, I've implemented an enhanced version 
of the LALR(1) parser generator described in Aho, Sethi and Ullman's 
dragon book.  The enhancements include:

1. a built in lexical analyser (where a flex like pattern is required as 
part of the definition of a token),

2. the ability to resolve reduce/reduce conflicts using predicates 
attached to grammar rules (inspired by the work of Ganapathi and Fischer 
in the 80s), and

3. for "literal" tokens the option of using the literal pattern in 
grammar rules instead of the token name e.g. if you had a token named 
LESSOREQUAL with the pattern "<=" then you can "<=" in grammar rules 
instead of LESSOREQUAL.

The code for this tool resides in the dunnart repository at GitHub.

Unfortunately, at this stage, the documentation is not very good but a 
small example that illustrates all features is attached.

As I said at the start, this was an exercise in learning D so there may 
be room for improvement in some of the code.  Suggestions are welcome.

Cheers
Peter
-------------- next part --------------
%{
import std.stdio;
double[string] variables;
enum Errors { undefinedVariables = 1, divideByZero = 2, syntaxError = 3 };
uint errors;

void report_errors()
{
    auto report = "Errors:";
    if (errors & Errors.undefinedVariables) {
        report ~= " \"Undefined Variables\"";
    }
    if (errors & Errors.divideByZero) {
        report ~= " \"Divide by Zero\"";
    }
    if (errors & Errors.syntaxError) {
        report ~= " \"Syntax Errors\"";
    }
    stderr.writeln(report);
}
%}

%field  double value
%field  string id

%token          EOL     (\n)
%token          PLUS    "+"
%token          MINUS   "-"
%token          TIMES   "*"
%token          DIVIDE  "/"
%token          ASSIGN  "="
%token  <value> NUMBER  ([0-9]+(\.[0-9]+){0,1})
%token  <id>    ID      ([a-zA-Z]+)
%token          LPR     "("
%token          RPR     ")"

%skip   ([\t\r ]+)

%right  UMINUS
%left   TIMES DIVIDE
%left   PLUS MINUS
%left   EOL

%%
line: setup expr ?(errors > 0?) !{report_errors();!}
    | setup expr !{writeln($2.value);!}
    | setup ID "=" expr ?(errors == 0?) !{variables[$2.id] = $4.value;!}
    | setup ID "=" expr !{report_errors();!}
    | line EOL line
    | line EOL
    .

setup: !{errors = 0;!}.

expr: expr "+" expr ?($1.value == 0?) !{$$.value = $3.value;!}
    | expr "+" expr ?($3.value == 0?) !{$$.value = $1.value;!}
    | expr "+" expr !{$$.value = $1.value + $3.value;!}
    | expr "-" expr ?($1.value == 0?) !{$$.value = -$3.value;!}
    | expr "-" expr ?($3.value == 0?) !{$$.value = $1.value;!}
    | expr "-" expr !{$$.value = $1.value - $3.value;!}
    | expr "*" expr ?($1.value == 0 || $3.value == 0?) !{$$.value = -$3.value;!}
    | expr "*" expr ?($1.value == 1?) !{$$.value = $3.value;!}
    | expr "*" expr ?($3.value == 1?) !{$$.value = $1.value;!}
    | expr "*" expr !{$$.value = $1.value * $3.value;!}
    | expr "/" expr ?($3.value == 1?) !{$$.value = $1.value;!}
    | expr "/" expr ?($3.value == 0?) !{errors |= Errors.divideByZero;!}
    | expr "/" expr ?($1.value == 0?) !{$$.value = 0;!}
    | expr "/" expr !{$$.value = $1.value / $3.value;!}
    | "(" expr ")" !{$$.value = $2.value;!}
    | "-" expr %prec UMINUS !{$$.value = $2.value;!}
    | NUMBER !{$$.value = $1.value;!}
    | ID ?($1.id in variables?) !{$$.value = variables[$1.id];!}
    | ID !{errors |= Errors.undefinedVariables; $$.value = 0;!}
    | %error !{errors |= Errors.syntaxError;!}
    .