Pegged, a Parsing Expression Grammar (PEG) generator in D

Alex Rønne Petersen xtzgzorex at gmail.com
Sat Mar 10 15:34:19 PST 2012


On 11-03-2012 00:28, Philippe Sigaud wrote:
> Hello,
>
> I created a new Github project, Pegged, a Parsing Expression Grammar
> (PEG) generator in D.
>
> https://github.com/PhilippeSigaud/Pegged
>
> docs: https://github.com/PhilippeSigaud/Pegged/wiki
>
> PEG: http://en.wikipedia.org/wiki/Parsing_expression_grammar
>
> The idea is to give the generator a PEG with the standard syntax. From
> this grammar definition, a set of related parsers will be created, to be
> used at runtime or compile time.
>
> Usage
> -----
>
> To use Pegged, just call the `grammar` function with a PEG and mix it
> in. For example:
>
>
> import pegged.grammar;
>
> mixin(grammar("
> Expr <- Factor AddExpr*
> AddExpr <- ('+'/'-') Factor
> Factor <- Primary MulExpr*
> MulExpr <- ('*'/'/') Primary
> Primary <- Parens / Number / Variable / '-' Primary
>
> Parens <- '(' Expr ')'
> Number <~ [0-9]+
> Variable <- Identifier
> "));
>
>
>
> This creates the `Expr`, `AddExpr`, `Factor` (and so on) parsers for
> basic arithmetic expressions with operator precedence ('*' and '/' bind
> stronger than '+' or '-'). `Identifier` is a pre-defined parser
> recognizing your basic C-style identifier. Recursive or mutually
> recursive rules are OK (no left recursion for now).
>
> To use a parser, use the `.parse` method. It will return a parse tree
> containing the calls to the different rules:
>
> // Parsing at compile-time:
> enum parseTree1 = Expr.parse("1 + 2 - (3*x-5)*6");
>
> pragma(msg, parseTree1.capture);
> writeln(parseTree1);
>
> // And at runtime too:
> auto parseTree2 = Expr.parse(" 0 + 123 - 456 ");
> assert(parseTree2.capture == ["0", "+", "123", "-", "456"]);
>
>
>
> Features
> --------
>
> * The complete set of PEG operators are implemented
> * Pegged can parse its input at compile time and generate a complete
> parse tree at compile time. In a word: compile-time string (read: D
> code) transformation and generation.
> * You can parse at runtime also, you lucky you.
> * Use a standard and readable PEG syntax as a DSL, not a bunch of
> templates that hide the parser in noise.
> * But you can use expression templates if you want, as parsers are all
> available as such. Pegged is implemented as an expression template, and
> what's good for the library writer is sure OK for the user too.
> * Some useful additional operators are there too: a way to discard
> matches (thus dumping them from the parse tree), to push captures on a
> stack, to accept matches that are equal to another match
> * Adding new parsers is easy.
> * Grammars are composable: you can put different
> `mixin(grammar(rules));` in a module and then grammars and rules can
> refer to one another. That way, you can have utility grammars providing
> their functionalities to other grammars.
> * That's why Pegged comes with some pre-defined grammars (JSON, etc).
> * Grammars can be dumped in a file to create a D module.
>
> More advanced features, outside the standard PEG perimeter are there to
> bring more power in the mix:
>
> * Parametrized rules: `List(E, Sep) <- E (Sep E)*` is possible. The
> previous rule defines a parametrized parser taking two other parsers
> (namely, `E` and `Sep`) to match a `Sep`-separated list of `E`'s.
> * Named captures: any parser can be named with the `=` operator. The
> parse tree generated by the parser (so, also its matches) is delivered
> to the user in the output. Other parsers in the grammar see the named
> captures too.
> * Semantic actions can be added to any rule in a grammar. Once a rule
> has matched, its associated action is called on the rule output and
> passed as final result to other parsers further up the grammar. Do what
> you want to the parse tree. If the passed actions are delegates, they
> can access external variables.
>
>
> Philippe
>

Admittedly I have not heard of PEGs before, so I'm curious: Is this 
powerful enough to parse a language such as C?

-- 
- Alex


More information about the Digitalmars-d-announce mailing list