Writing a Parser

Fri Jan 4 20:22:23 PST 2008

"Dan" <murpsoft at hotmail.com> wrote in message 
news:flmtrv$2jrn$1 at digitalmars.com...
>
> I've been messing with how to write a parser, and so far I've played with 
> numerous patterns before eventually wanting to cry.
>
> At the moment, I'm trying recursive descent parsing.
>
> The problem is that I've realized I'm duplicating huge volumes of code to 
> cope with the tristate decision of { unexpected, allow, require } for any 
> given token.
>
> For example, to consume a for loop, you consume something similar to
> /for\s*\((.*?)\)\s*\{(.*?)\}/
>
> I have it doing that, but my soul feels heavy with the masses of looped 
> switches it's doing.  Is there any way to ease the pain?

Separate tokenization from syntax parsing?  It makes things a hell of a lot 
easier.  You don't even have to tokenize the entire source before parsing; 
just have a lexer that produces tokens from the source on demand.  The 
parser is then unencumbered by the raw source and only has to do things 
like "expect 'for', expect left-paren, expect (your condition), expect 
right-paren", etc.
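
To make that concrete, here is a rough sketch in Python (picked only for 
brevity; the same shape carries over to D or anything else).  Everything in 
it is made up for illustration: the toy token set and the names Token, lex, 
Parser, accept, expect, collect_until and parse_for.  The lexer is a 
generator, so tokens are only produced when the parser asks for the next 
one.  accept() covers the "allow" case, expect() covers "require", and 
anything else that turns up is "unexpected" and becomes an error.

```python
import re
from typing import Iterator, List, NamedTuple, Optional


class Token(NamedTuple):
    kind: str   # "ident", "number", "eof", or the keyword/punctuation itself
    text: str


# Toy token set (an assumption for this sketch, not a real language).
TOKEN_RE = re.compile(r"""
    (?P<ws>\s+)
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<number>\d+)
  | (?P<punct>[(){};])
""", re.VERBOSE)

KEYWORDS = {"for"}


def lex(source: str) -> Iterator[Token]:
    """Yield tokens lazily; nothing is scanned until the parser asks."""
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = m.end()
        kind, text = m.lastgroup, m.group()
        if kind == "ws":
            continue                      # whitespace never reaches the parser
        if kind == "punct" or (kind == "ident" and text in KEYWORDS):
            kind = text                   # keywords and punctuation are their own kind
        yield Token(kind, text)
    yield Token("eof", "")


class Parser:
    def __init__(self, source: str) -> None:
        self._tokens = lex(source)
        self.current = next(self._tokens)     # one token of lookahead

    def advance(self) -> Token:
        tok = self.current
        if tok.kind != "eof":
            self.current = next(self._tokens)
        return tok

    def accept(self, kind: str) -> Optional[Token]:
        """'allow': consume the token if it is there, otherwise return None."""
        return self.advance() if self.current.kind == kind else None

    def expect(self, kind: str) -> Token:
        """'require': consume the token or fail."""
        tok = self.accept(kind)
        if tok is None:
            raise SyntaxError(f"expected {kind!r}, got {self.current.text!r}")
        return tok

    def collect_until(self, kind: str) -> List[str]:
        """Placeholder for real sub-parsers: gather token texts up to `kind`.
        Does not handle nested parens or braces."""
        texts = []
        while self.current.kind not in (kind, "eof"):
            texts.append(self.advance().text)
        return texts

    def parse_for(self):
        self.expect("for")
        self.expect("(")
        header = self.collect_until(")")
        self.expect(")")
        self.expect("{")
        body = self.collect_until("}")
        self.expect("}")
        return ("for", header, body)
```

Usage would be something like Parser("for (i; n; step) { body; }").parse_for().  
In a real parser, collect_until would be replaced by calls to the 
expression and statement parsing functions, which use the same 
accept/expect pair, so the looped switches over raw characters stay inside 
lex() and nowhere else.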