Pegged, a Parsing Expression Grammar (PEG) generator in D

Alex Rønne Petersen xtzgzorex at gmail.com
Sun Mar 11 08:13:12 PDT 2012


On 11-03-2012 16:02, Alex Rønne Petersen wrote:
> On 11-03-2012 00:28, Philippe Sigaud wrote:
>> Hello,
>>
>> I created a new Github project, Pegged, a Parsing Expression Grammar
>> (PEG) generator in D.
>>
>> https://github.com/PhilippeSigaud/Pegged
>>
>> docs: https://github.com/PhilippeSigaud/Pegged/wiki
>>
>> PEG: http://en.wikipedia.org/wiki/Parsing_expression_grammar
>>
>> The idea is to give the generator a PEG with the standard syntax. From
>> this grammar definition, a set of related parsers will be created, to be
>> used at runtime or compile time.
>>
>> Usage
>> -----
>>
>> To use Pegged, just call the `grammar` function with a PEG and mix it
>> in. For example:
>>
>>
>> import pegged.grammar;
>>
>> mixin(grammar("
>> Expr <- Factor AddExpr*
>> AddExpr <- ('+'/'-') Factor
>> Factor <- Primary MulExpr*
>> MulExpr <- ('*'/'/') Primary
>> Primary <- Parens / Number / Variable / '-' Primary
>>
>> Parens <- '(' Expr ')'
>> Number <~ [0-9]+
>> Variable <- Identifier
>> "));
>>
>>
>>
>> This creates the `Expr`, `AddExpr`, `Factor` (and so on) parsers for
>> basic arithmetic expressions with operator precedence ('*' and '/' bind
>> stronger than '+' or '-'). `Identifier` is a pre-defined parser
>> recognizing your basic C-style identifier. Recursive or mutually
>> recursive rules are OK (no left recursion for now).
>>
>> To use a parser, use the `.parse` method. It will return a parse tree
>> containing the calls to the different rules:
>>
>> import std.stdio; // needed for writeln below
>>
>> // Parsing at compile-time:
>> enum parseTree1 = Expr.parse("1 + 2 - (3*x-5)*6");
>>
>> pragma(msg, parseTree1.capture);
>> writeln(parseTree1);
>>
>> // And at runtime too:
>> auto parseTree2 = Expr.parse(" 0 + 123 - 456 ");
>> assert(parseTree2.capture == ["0", "+", "123", "-", "456"]);
>>
>>
>>
>> Features
>> --------
>>
>> * The complete set of PEG operators is implemented.
>> * Pegged can parse its input at compile time and generate a complete
>> parse tree at compile time. In a word: compile-time string (read: D
>> code) transformation and generation.
>> * You can parse at runtime also, you lucky you.
>> * Use a standard and readable PEG syntax as a DSL, not a bunch of
>> templates that hide the parser in noise.
>> * But you can use expression templates if you want, as parsers are all
>> available as such. Pegged is implemented as an expression template, and
>> what's good for the library writer is sure OK for the user too.
>> * Some useful additional operators are there too: a way to discard
>> matches (thus dumping them from the parse tree), to push captures on a
>> stack, and to accept matches that are equal to another match.
>> * Adding new parsers is easy.
>> * Grammars are composable: you can put different
>> `mixin(grammar(rules));` in a module and then grammars and rules can
>> refer to one another. That way, you can have utility grammars providing
>> their functionalities to other grammars.
>> * That's why Pegged comes with some pre-defined grammars (JSON, etc).
>> * Grammars can be dumped in a file to create a D module.
>>
>> More advanced features, outside the standard PEG perimeter, are there to
>> bring more power to the mix:
>>
>> * Parametrized rules: `List(E, Sep) <- E (Sep E)*` is possible. The
>> previous rule defines a parametrized parser taking two other parsers
>> (namely, `E` and `Sep`) to match a `Sep`-separated list of `E`'s.
>> * Named captures: any parser can be named with the `=` operator. The
>> parse tree generated by the parser (so, also its matches) is delivered
>> to the user in the output. Other parsers in the grammar see the named
>> captures too.
>> * Semantic actions can be added to any rule in a grammar. Once a rule
>> has matched, its associated action is called on the rule output, and the
>> action's result is passed as the final result to other parsers further
>> up the grammar. Do what you want to the parse tree. If the passed
>> actions are delegates, they can access external variables.
>>
>>
>> Philippe
>>
>
> By the way, bootstrap.d seems to fail to build at the moment:
>
> .../pegged/utils/bootstrap.d(1433): found ':' when expecting ')' following template argument list
> .../pegged/utils/bootstrap.d(1433): members expected
> .../pegged/utils/bootstrap.d(1433): { } expected following aggregate declaration
> .../pegged/utils/bootstrap.d(1433): semicolon expected, not '!'
> .../pegged/utils/bootstrap.d(1433): Declaration expected, not '!'
> .../pegged/utils/bootstrap.d(1466): unrecognized declaration
>

Also, I have sent a pull request to fix the build on 64-bit: 
https://github.com/PhilippeSigaud/Pegged/pull/1
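
On the parametrized rule quoted above, `List(E, Sep) <- E (Sep E)*`: for anyone who has not seen the idea before, here is a rough hand-written sketch of what such a rule means, in plain D. It does not use Pegged and is not how Pegged implements it. `Result`, `lit`, `digits` and `list` are names made up for this illustration, and the result type is a simplified stand-in for a real parse tree.

```d
import std.ascii : isDigit;

// Hypothetical result type for this sketch (not Pegged's parse tree).
struct Result
{
    bool success;      // did the parser match?
    string[] matches;  // matched substrings, in order
    string rest;       // unconsumed input
}

// Matches the literal string `s` at the start of the input.
Result lit(string s)(string input)
{
    if (input.length >= s.length && input[0 .. s.length] == s)
        return Result(true, [s], input[s.length .. $]);
    return Result(false, null, input);
}

// Matches one or more ASCII digits, similar in spirit to `Number <~ [0-9]+`.
Result digits(string input)
{
    size_t n = 0;
    while (n < input.length && isDigit(input[n]))
        ++n;
    if (n == 0)
        return Result(false, null, input);
    return Result(true, [input[0 .. n]], input[n .. $]);
}

// List(E, Sep) <- E (Sep E)*
// The two parsers E and Sep are passed as template alias parameters.
Result list(alias E, alias Sep)(string input)
{
    auto first = E(input);
    if (!first.success)
        return Result(false, null, input);

    auto matches = first.matches;
    auto rest = first.rest;
    while (true)
    {
        auto sep = Sep(rest);
        if (!sep.success)
            break;
        auto next = E(sep.rest);
        if (!next.success)
            break; // (Sep E) failed as a whole: the separator is not consumed
        matches ~= sep.matches;
        matches ~= next.matches;
        rest = next.rest;
    }
    return Result(true, matches, rest);
}

void main()
{
    // A comma-separated list of numbers, followed by unrelated input.
    auto r = list!(digits, lit!",")("1,22,333;tail");
    assert(r.success);
    assert(r.matches == ["1", ",", "22", ",", "333"]);
    assert(r.rest == ";tail");
}
```

The point of the sketch is that `E` and `Sep` are themselves parsers handed to another parser, so the same `list` works for any element and separator. A trailing separator with no element after it is left in `rest`, which matches the usual PEG reading of `(Sep E)*`.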

-- 
- Alex

