Overlapping functionality: IFTI, templates, is-expressions

BCS BCS at pathlink.com
Thu Mar 20 17:12:22 PDT 2008


Russell Lewis wrote:
 >
> I have an idea that didn't work with my design, but it probably would 
> with yours: you should be able to do a hybrid parser with bottom-up 
> parsings of unambiguous strings.  Basically, the concept is that you 
> look for certain tokens (or strings of tokens) which are only used in a 
> certain reduction.  Then, you build a low-level scanner which scans the 
> lexer's output (utterly without any understanding of the larger 
> contexts) and collapses any patterns that it recognizes.  You can 
> perform multiple passes, and collapse even strings of things that 
> include nonterminals, if you get lucky.  Then, once you run out of 
> bottom-up parsings, you run the top-down parser.  Things that were 
> bottom-up parsed are effectively memoized.
> 

Actually this could be implemented under underneath, beside or on top of 
mine (I'm not sure what the correct way to describe it it). The lexer 
object could be passed into a different function in the same struct as 
the parser that does what you describe. Because each of the non 
terminals has a publicly callable function for it, the mixed parser idea 
would be easy. (automating the extraction of the available rules would 
be a trick though)

> There are other tricks you can use, too, by mixing the bottom-up and 
> top-down parsers.  For instance, in D the token ? is (as far as I 
> remember) only used as part of a conditional expression.  Your parser 
> library could detect that in the grammar, and say, "whenever I see ? in 
> the bottom-up parser scan, I will run the top-down parser to parse an 
> 'expression' nonterminal as the next thing, then look for a ':' token 
> and then do it again."  Likewise, '&' only comes before an expression, 
> as do all of the assignment operators and most of the binary operators. 
>  In most of these cases, you don't have any efficient way to bottom-up 
> parse the stuff to the *left* of the key-token,

that's no problem, just use an RR parser rather than an LL one. The only 
issue would be that a lexer that can also walk backwards is more 
complicated.

> but you can at least 
> memoize the stuff on the right.
> 
> Of course, it only works on grammars that happen to have unambiguous 
> opportunities, but in D, there are a lot of those.



More information about the Digitalmars-d mailing list