[GSoC’11] Lexing and parsing
Robert Jacques
sandford at jhu.edu
Wed Mar 23 17:57:02 PDT 2011
On Wed, 23 Mar 2011 13:31:04 -0400, Ilya Pupatenko <pupatenko at gmail.com>
wrote:
>> I'm not qualified to speak on Spirit's internal architecture; I've only
>> used it once for something very simple and ran into a one-liner bug
>> which remains unfixed 7+ years later. But the basic API of Spirit would
>> be wrong for D. “it is possible to write a highly-integrated
>> lexer/parser generator in D without resorting to additional tools” does
>> not mean "the library should allow programmer to write grammar directly
>> in D (ideally, the syntax should be somehow similar to EBNF)"; it means
>> that the library should allow you to write a grammar in EBNF and then
>> through a combination of templates, string mixins and compile-time
>> function evaluation generate the appropriate (hopefully optimal) parser.
>> D's compile-time programming abilities are strong enough to do the code
>> generation job usually left to separate tools. Ultimately a user of the
>> library should be able to declare a parser something like this:
>>
>> // Declare a parser for Wikipedia's EBNF sample language
>> Parser!`
>> (* a simple program syntax in EBNF − Wikipedia *)
>> program = 'PROGRAM' , white space , identifier , white space ,
>> 'BEGIN' , white space ,
>> { assignment , ";" , white space } ,
>> 'END.' ;
>> identifier = alphabetic character , { alphabetic character | digit } ;
>> number = [ "-" ] , digit , { digit } ;
>> string = '"' , { all characters − '"' } , '"' ;
>> assignment = identifier , ":=" , ( number | identifier | string ) ;
>> alphabetic character = "A" | "B" | "C" | "D" | "E" | "F" | "G"
>> | "H" | "I" | "J" | "K" | "L" | "M" | "N"
>> | "O" | "P" | "Q" | "R" | "S" | "T" | "U"
>> | "V" | "W" | "X" | "Y" | "Z" ;
>> digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
>> white space = ? white space characters ? ;
>> all characters = ? all visible characters ? ;
>> ` wikiLangParser;
>
> Ok, it sounds good. But in most cases we are interested in more than
> whether the input text matches the specified grammar. We want to perform
> some semantic actions while parsing, for example build some kind of AST,
> evaluate an expression and so on. But I have no idea how I can ask this
> parser to perform user-defined actions, for example for the 'string' and
> 'number' "nodes" in this case.
I don't have any experience with using parser generators, but using arrays
of delegates works really well for GUI libraries. For example:
// Convert a single matched digit character to its integer value.
wikiLangParser.digit ~= (ref wikiLangParser.Token digit) {
    auto tokens = digit.tokens;
    assert(tokens.length == 1);
    digit.value = 0 + (tokens.front.value.get!string.front - '0');
};

// Fold the digit values into one integer, honouring an optional leading "-".
wikiLangParser.number ~= (ref wikiLangParser.Token number) {
    auto tokens = number.tokens;
    assert(!tokens.empty);
    bool negative = false;
    if(tokens.front.value.get!string == "-") {
        negative = true;
        tokens.popFront();
    }
    int value = 0;
    foreach(token; tokens) {
        value = value * 10 + token.value.get!int;
    }
    if(negative)
        value = -value;
    number.value = value;
};

// Handlers run in the order they were appended, so this one sees the value.
debug {
    wikiLangParser.number ~= (ref wikiLangParser.Token number) {
        writeln("Parsed number (",number.value,")");
    };
}
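
To show that the `~=` hookup above can work without any parser existing yet, here is a minimal self-contained sketch. Everything in it is an assumption made for illustration: `Token`, `Rule`, `Handler` and `fire` are hypothetical names, not part of any existing library, and the token carries a plain `int` value instead of a variant to keep it short.

```d
import std.stdio : writeln;

// Hypothetical stand-in for the token type a generated parser might expose.
struct Token {
    string text;     // matched source text
    int value;       // semantic value filled in by handlers
    Token[] tokens;  // child tokens of this node
}

alias Handler = void delegate(ref Token);

// Hypothetical per-rule hook: an appendable list of semantic actions.
struct Rule {
    Handler[] handlers;

    // Enables `rule ~= handler;`
    void opOpAssign(string op : "~")(Handler h) {
        handlers ~= h;
    }

    // An imaginary parser would call this once the rule has matched.
    void fire(ref Token t) {
        foreach (h; handlers)
            h(t);
    }
}

void main() {
    Rule number;

    // Semantic action: compute the integer value of the matched number.
    number ~= (ref Token n) {
        auto parts = n.tokens;
        bool negative = false;
        if (parts.length && parts[0].text == "-") {
            negative = true;
            parts = parts[1 .. $];
        }
        int value = 0;
        foreach (d; parts)
            value = value * 10 + (d.text[0] - '0');
        n.value = negative ? -value : value;
    };

    // Second action on the same rule; runs after the first one.
    number ~= (ref Token n) {
        writeln("Parsed number (", n.value, ")");
    };

    // Simulate the parser having matched "-42".
    auto tok = Token("-42", 0, [Token("-"), Token("4"), Token("2")]);
    number.fire(tok);
    assert(tok.value == -42);
}
```

The point of the sketch is only that several independent actions can be attached to one rule and run in append order; how a real generated parser would expose its rules is left open.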