Writing a Parser - aPaGeD comments
Alan Knowles
alan at akbkhome.com
Tue Jan 8 16:10:52 PST 2008
I did spend some time looking at aPaGeD for this yesterday - here's some
feedback that may help (and if you can answer some of the questions, that
would help me a lot as well).
- It works (yes, but you would be amazed: for the life of me I could
not get antlr to do anything - java write once, run nowhere...) - so
being able to generate code and working programs from the examples was
great, and a real +100 for the project.
- Syntax - on the face of it looks reasonable, and easy to understand -
similar enough to antlr.
(That's the end of the really good stuff)
- Documentation
While I know it's a pain to write, the docs you already have tend to
focus on how the parser is built, and are biased towards someone who
understands the internals and phraseology involved in parsers, rather
than an end user - who just wants to know: if I'm looking for this, then
put this, and the result is available in these variables.
Specifically, I've no idea what these mean, and they are rather
critical to the docs:
"Terminal" "Non-Terminal"
- Regexes
While I can see the benefits, I'd much rather the compiler built them
for me - part of the beauty of the BNF format is that it's easy to
read, and explains regex-style situations a lot better. Otherwise, the
docs need to explain how regexes can deal with the classic situations
(see below).
- How to handle classic situations
This is the key to success for the documentation (and what is seriously
missing) - as most people will probably have come from a lex/yacc
background...
These are a few classic examples that the documentation could do with.
* Top level parser starts.
Most grammars start with a top level statement, eg.
Program:
Statements;
In which case the application should only start by solving Statements -
the biggest problem I found was that I had no idea how to stop it
matching any of the condition rules (that were only relevant to a
specific state - eg. see next example)
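
As a rough illustration of the behaviour being asked for here (this is
plain Python, not aPaGeD syntax or its generated code), a hand-written
recursive descent parser gets a single entry point for free: parsing
starts at the Program rule, and every other rule is only tried when a
rule above it calls it. The Statement rule (IDENT followed by ";") and
all names below (tokenize, Parser, parse) are assumptions made up for
the sketch.

```python
import re

# Hypothetical token set: identifiers and semicolons, whitespace skipped.
TOKEN_RE = re.compile(r"\s*(?:(?P<IDENT>[A-Za-z_]\w*)|(?P<SEMI>;))")


def tokenize(text):
    """Split text into (kind, value) pairs; raise on anything else."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos][0]
        return "EOF"

    def expect(self, kind):
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, got {self.peek()}")
        value = self.tokens[self.pos][1]
        self.pos += 1
        return value

    # Program: Statements;   <- the only rule the caller ever starts from
    def parse_program(self):
        statements = self.parse_statements()
        if self.peek() != "EOF":
            raise SyntaxError(f"unexpected {self.peek()} at top level")
        return statements

    # Statements: (Statement)*;   <- only reachable through Program
    def parse_statements(self):
        result = []
        while self.peek() == "IDENT":
            result.append(self.parse_statement())
        return result

    # Statement: IDENT SEMI;   <- only reachable through Statements
    def parse_statement(self):
        name = self.expect("IDENT")
        self.expect("SEMI")
        return name


def parse(text):
    return Parser(tokenize(text)).parse_program()
```

Calling parse("a; b;") walks Program, then Statements, then Statement
twice; input that does not begin a Statement is rejected at the top
level instead of being handed to some unrelated rule.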
* Parse a string
This is a common pattern but it's quite difficult to see how to
implement it. And as above, when I tried, the parser started matching
DoubleQuotedStringChars at the start of the file (even though it's only
used in DoubleQuotedString).
DoubleQuotedString:
QUOTE DoubleQuotedStringChars QUOTE;
DoubleQuotedStringChars:
(DoubleQuotedStringChar)*;
DoubleQuotedStringChar:
"\" ANYCHAR |
^QUOTE;
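
For comparison, the same three rules can be sketched by hand in Python.
This is not what aPaGeD generates, and the function names are made up
for the example; it only shows the intended behaviour, where the
per-character rule is tried solely between an opening and a closing
quote. It assumes a backslash simply makes the next character literal
(no translation of sequences such as \n).

```python
QUOTE = '"'


# DoubleQuotedStringChar: "\" ANYCHAR | ^QUOTE
# Returns (char, new_pos), or None if no string character starts at pos.
def parse_double_quoted_string_char(text, pos):
    if pos >= len(text):
        return None
    if text[pos] == "\\":
        if pos + 1 >= len(text):
            raise SyntaxError("backslash at end of input")
        # Assumption: the escaped character is taken literally.
        return text[pos + 1], pos + 2
    if text[pos] != QUOTE:
        return text[pos], pos + 1
    return None


# DoubleQuotedString: QUOTE DoubleQuotedStringChars QUOTE
# Returns (string_value, new_pos) or raises SyntaxError.
def parse_double_quoted_string(text, pos=0):
    if pos >= len(text) or text[pos] != QUOTE:
        raise SyntaxError(f"expected opening quote at offset {pos}")
    pos += 1

    # DoubleQuotedStringChars: (DoubleQuotedStringChar)*
    chars = []
    while True:
        step = parse_double_quoted_string_char(text, pos)
        if step is None:
            break
        ch, pos = step
        chars.append(ch)

    if pos >= len(text) or text[pos] != QUOTE:
        raise SyntaxError("unterminated string")
    return "".join(chars), pos + 1
```

The caller decides when a string is expected and invokes
parse_double_quoted_string at that position, so the character rule
never gets a chance to match at the start of the file.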
* Classic groupings:
(.....)* ie. zero or more of these matches..
(.....)+ ie. one or more of these matches..
(.....)? ie. one or none of these matches..
(.....)=> ... ie. if forward lookup (lookahead) succeeds on (...), try
to match what follows.
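
To pin down what each of these groupings is expected to do, here is a
small Python sketch using parser functions. It is only an illustration
of the semantics, not aPaGeD or antlr code, and every name in it
(literal, seq, zero_or_more, one_or_more, optional, predicate) is
invented for the example. Each parser takes (text, pos) and returns
(matched_values, new_pos), or None on failure.

```python
def literal(s):
    """Match the exact string s."""
    def parse(text, pos):
        if text.startswith(s, pos):
            return [s], pos + len(s)
        return None
    return parse


def seq(*parsers):
    """Match each parser in order; fail if any one of them fails."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            matched, pos = result
            values += matched
        return values, pos
    return parse


def zero_or_more(p):
    """(...)*  never fails; collects as many matches as it can."""
    def parse(text, pos):
        values = []
        while True:
            result = p(text, pos)
            # Stop on failure, or if p matched without consuming input.
            if result is None or result[1] == pos:
                return values, pos
            matched, pos = result
            values += matched
    return parse


def one_or_more(p):
    """(...)+  like (...)* but the first match is required."""
    def parse(text, pos):
        first = p(text, pos)
        if first is None:
            return None
        rest, end = zero_or_more(p)(text, first[1])
        return first[0] + rest, end
    return parse


def optional(p):
    """(...)?  match p if possible, otherwise match nothing."""
    def parse(text, pos):
        result = p(text, pos)
        return result if result is not None else ([], pos)
    return parse


def predicate(guard, p):
    """(guard)=> p  try p only if guard would match here; the guard
    itself consumes no input."""
    def parse(text, pos):
        if guard(text, pos) is None:
            return None
        return p(text, pos)
    return parse
```

For example, seq(literal("("), zero_or_more(literal("a")), literal(")"))
accepts "()", "(a)", "(aa)" and so on, which is the kind of one-line
worked example the documentation could give for each grouping.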
Regards
Alan
Jascha Wetzel wrote:
> Dan wrote:
>> I've been messing with how to write a parser, and so far I've played
>> with numerous patterns before eventually wanting to cry.
>>
>> At the moment, I'm trying recursive descent parsing.
>>
>> The problem is that I've realized I'm duplicating huge volumes of code
>> to cope with the tristate decision of { unexpected, allow, require }
>> for any given token.
>>
>> For example, to consume a for loop, you consume something similar to
>> /for\s*\((.*?)\)\s*\{(.*?)\}/
>>
>> I have it doing that, but my soul feels heavy with the masses of
>> looped switches it's doing. Is there any way to ease the pain?
>
> a parser generator :)
> writing a parser or scanner manually is a bit like writing any program
> in assembler - tedious, error-prone and not well maintainable. there's a
> lot of stuff in a parser that can be automatically generated.
> even if you want to write the parser all by yourself, i'd rather suggest
> you write a simple parser generator to do that tedious part for you.