Let's stop parser Hell

Philippe Sigaud philippe.sigaud at gmail.com
Tue Jul 31 23:12:21 PDT 2012


On Wed, Aug 1, 2012 at 7:48 AM, Dmitry Olshansky <dmitry.olsh at gmail.com> wrote:
>> Well,
>>
>> - for a lexer lookahead is sometimes useful (the Dragon book cites the
>> FORTRAN grammar, for which keywords are not reserved and so when you
>> encounter IF, you don't know if (!) it's a function call or a 'real'
>> if)
>
>
> Well, while lookahead will help, there are simpler ways, e.g.
> regex ("IF\ZYOUR_LOOKAHEAD");
>
> \Z means that you are capturing text only up to \Z. It's still regular. I
> think Boost regex does support this. We could have supported this cleanly
> but have chosen the ECMA-262 standard.
>
>
> The critical difference is that lookahead can be used like this:
>
> regex("blah(?=lookahead)some_other_stuff_that_lookahead_also_checks");
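
To make that difference concrete, here is a small sketch. It uses Python's
`re` module as a stand-in, not std.regex, and the sample strings are made up
for illustration. A lookahead asserts without consuming, so whatever follows
it in the pattern has to match the same characters again:

```python
import re

# The lookahead asserts that "lookahead" follows "blah" without consuming it,
# so the rest of the pattern must match those same characters again.
pattern = re.compile(r"blah(?=lookahead)lookahead_and_more")

full = pattern.match("blahlookahead_and_more")
assert full is not None
matched_text = full.group(0)  # the whole input, lookahead text included

# With nothing after the lookahead, the match stops right before the
# asserted text, which is the behaviour a lexer wants for trailing context.
prefix_only = re.match(r"blah(?=lookahead)", "blahlookahead_and_more")
prefix_text = prefix_only.group(0)
prefix_end = prefix_only.end()
```

The second form is roughly what a trailing-context operator would give you:
the token ends at `prefix_end` and lexing resumes from there.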

In the FORTRAN case, you indeed *need* to re-lex the stuff after IF,
with another regex, once you've determined it's an IF instruction and
not some moron who used IF as an identifier.
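
A rough sketch of that two-step approach, again in Python's `re` rather than
std.regex. The disambiguation rule, the toy token set, and the names
`IF_KEYWORD`, `TOKEN` and `lex_statement` are all assumptions made for this
example: it treats IF as a keyword when the parenthesized group after it is
not followed by `=`, and it assumes there are no nested parentheses. Real
FORTRAN needs more care than this.

```python
import re

# Hypothetical rule: IF is a keyword when the parenthesized group after it
# is followed by something other than '='. Assumes no nested parentheses.
IF_KEYWORD = re.compile(r"IF(?=\s*\([^()]*\)\s*[^=\s])")

# Second regex, used to re-lex the remaining text (toy token set:
# dotted operators like .GT., identifiers, integers, some punctuation).
TOKEN = re.compile(r"\s*(\.[A-Z]+\.|[A-Z][A-Z0-9]*|\d+|[()=,])")


def lex_statement(line):
    """Return ('IF_STMT', tokens after IF) or ('OTHER', all tokens)."""
    m = IF_KEYWORD.match(line)
    # The lookahead consumed nothing, so the text after IF is still unlexed.
    rest = line[m.end():] if m else line
    kind = "IF_STMT" if m else "OTHER"
    return kind, TOKEN.findall(rest)
```

With this, `lex_statement("IF (X .GT. 0) GOTO 10")` classifies the line as an
IF statement and re-lexes everything after the keyword, while
`lex_statement("IF (I) = 1")` falls through and lexes IF as an ordinary
identifier.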

You know, as a first step, I'd be happy to get ctRegex to recognize the \Z flag.

