Current D grammar

Nick Sabalausky via Digitalmars-d digitalmars-d at puremagic.com
Tue Jun 16 00:03:12 PDT 2015


On 06/14/2015 08:05 PM, Manfred Nowak wrote:
> With my favorite LALR-CompilerCompiler I analyzed the current D grammar.
> After preliminary eliminating the RR-conflicts around 1800 states remained,
> from which around 100 states are still contaminated with SR-conflicts.
>
> Two possibilities:
> 1) Those ambiguities are not in DMD but introduced by excerpting the
> grammar from DMD
> 2) Those ambiguities do exist in DMD.
>
> Which one do you prefer? Is the grammar usable?
>

Although I don't recall details, when I attempted it a few years back, I 
came to the conclusion that getting a correct grammar for D (and likely 
other non-trivial C-family languages) that was realistically useful is 
essentially impossible in LALR(1). I'm sure you'd be able to do it if 
you upgraded to GLF though (or maybe LALR(k) might work, though I'm not 
sure).

The problem I found with LR is that totally separate branches of the 
grammar will easily conflict with each other (whereas in LL separate 
branches are completely independent). That leads to tree patterns that 
work fine in LL but make LR/LALR barf unless you do major contortions to 
get around the conflicts.

GLR looked like it should be able to overcome those conflicts with ease 
(although I don't know how the efficiency would compare to LL).

With LR and LALR, you can *theoretically* you can get around all those 
conflicts for the academic sake of "Does this input satisfy the grammar 
or not? Yes or no?". But again, it requires contorting the grammar. And 
unfortunately, the more you have to contort it, the less useful the 
parse tree becomes for real-world purposes beyond just "The input 
satisfies/doesn't satisfy the grammar".



More information about the Digitalmars-d mailing list