Pegged, a Parsing Expression Grammar (PEG) generator in D

Alex Rønne Petersen xtzgzorex at gmail.com
Sun Mar 11 18:29:00 PDT 2012


On 12-03-2012 02:27, Alex Rønne Petersen wrote:
> On 11-03-2012 00:28, Philippe Sigaud wrote:
>> Hello,
>>
>> I created a new Github project, Pegged, a Parsing Expression Grammar
>> (PEG) generator in D.
>>
>> https://github.com/PhilippeSigaud/Pegged
>>
>> docs: https://github.com/PhilippeSigaud/Pegged/wiki
>>
>> PEG: http://en.wikipedia.org/wiki/Parsing_expression_grammar
>>
>> The idea is to give the generator a PEG with the standard syntax. From
>> this grammar definition, a set of related parsers will be created, to be
>> used at runtime or compile time.
>>
>> Usage
>> -----
>>
>> To use Pegged, just call the `grammar` function with a PEG and mix it
>> in. For example:
>>
>>
>> import pegged.grammar;
>>
>> mixin(grammar("
>> Expr <- Factor AddExpr*
>> AddExpr <- ('+'/'-') Factor
>> Factor <- Primary MulExpr*
>> MulExpr <- ('*'/'/') Primary
>> Primary <- Parens / Number / Variable / '-' Primary
>>
>> Parens <- '(' Expr ')'
>> Number <~ [0-9]+
>> Variable <- Identifier
>> "));
>>
>>
>>
>> This creates the `Expr`, `AddExpr`, `Factor` (and so on) parsers for
>> basic arithmetic expressions with operator precedence ('*' and '/' bind
>> stronger than '+' or '-'). `Identifier` is a pre-defined parser
>> recognizing your basic C-style identifier. Recursive or mutually
>> recursive rules are OK (no left recursion for now).
>>
>> To use a parser, use the `.parse` method. It will return a parse tree
>> containing the calls to the different rules:
>>
>> // Parsing at compile-time:
>> enum parseTree1 = Expr.parse("1 + 2 - (3*x-5)*6");
>>
>> pragma(msg, parseTree1.capture);
>> writeln(parseTree1);
>>
>> // And at runtime too:
>> auto parseTree2 = Expr.parse(" 0 + 123 - 456 ");
>> assert(parseTree2.capture == ["0", "+", "123", "-", "456"]);
>>
>>
>>
>> Features
>> --------
>>
>> * The complete set of PEG operators are implemented
>> * Pegged can parse its input at compile time and generate a complete
>> parse tree at compile time. In a word: compile-time string (read: D
>> code) transformation and generation.
>> * You can parse at runtime also, you lucky you.
>> * Use a standard and readable PEG syntax as a DSL, not a bunch of
>> templates that hide the parser in noise.
>> * But you can use expression templates if you want, as parsers are all
>> available as such. Pegged is implemented as an expression template, and
>> what's good for the library writer is sure OK for the user too.
>> * Some useful additional operators are there too: a way to discard
>> matches (thus dumping them from the parse tree), to push captures on a
>> stack, to accept matches that are equal to another match
>> * Adding new parsers is easy.
>> * Grammars are composable: you can put different
>> `mixin(grammar(rules));` in a module and then grammars and rules can
>> refer to one another. That way, you can have utility grammars providing
>> their functionalities to other grammars.
>> * That's why Pegged comes with some pre-defined grammars (JSON, etc).
>> * Grammars can be dumped in a file to create a D module.
>>
>> More advanced features, outside the standard PEG perimeter are there to
>> bring more power in the mix:
>>
>> * Parametrized rules: `List(E, Sep) <- E (Sep E)*` is possible. The
>> previous rule defines a parametrized parser taking two other parsers
>> (namely, `E` and `Sep`) to match a `Sep`-separated list of `E`'s.
>> * Named captures: any parser can be named with the `=` operator. The
>> parse tree generated by the parser (so, also its matches) is delivered
>> to the user in the output. Other parsers in the grammar see the named
>> captures too.
>> * Semantic actions can be added to any rule in a grammar. Once a rule
>> has matched, its associated action is called on the rule output and
>> passed as final result to other parsers further up the grammar. Do what
>> you want to the parse tree. If the passed actions are delegates, they
>> can access external variables.
>>
>>
>> Philippe
>>
>
> Hm, since ' is used in the grammar of Pegged, how do I express it in my
> grammar spec? Is there a predefined rule for it, or is \' supposed to work?
>

Ah, Quote it is. I also noticed there is DoubleQuote. Is " used for 
anything in the Pegged grammar?

-- 
- Alex


More information about the Digitalmars-d-announce mailing list