D grammar as LL(1) grammar ( OT)

Sun Sep 16 10:43:43 PDT 2007

BLS wrote:
> Sorry for beeing a bit off topic, anyway:
> 
> Do you see a chance to describe D grammar as LL(1) grammar..
> 
> Please notice that I am a more DB developer than a language-designer 
> means keep your answere as simple as possible :)
> 
> all hints are welcome.
> Thanks Bjoern
> 
> Well, at least one guy did it for *Erlang*, as you can see here:
> http://platform.netbeans.org/articles/nbm_interview_caoyuan.html

D needs arbitrary lookahead, that means you can write D code that can be 
parsed with LL(1), some that needs LL(173) and so on. the parser has to 
be able to parse any of that and thus may not have a fixed length lookahead.
For LL parsers that generally means that you have to use backtracking. 
Since DMD uses a hand-written parser, it can optimize the lookahead, 
such that not the whole parser has to work it's way until the point 
where it can decide what to parse, but only a simplified function that 
quickly scans for the relevant token.
E.g. there is a function called peekPastParen (or similar) that just 
finds the first token after a parenthesis and is therefore really fast.

If you generate an LL parser automatically, it's very difficult to apply 
such optimizations. The resulting parser will be considerably slower.
That's the reason i chose to write an LR parser generator to parse D. LR 
parsers use the information provided by a lookahead symbol much more 
efficiently. That is, LR(1) can parse more languages than LL(1), 
although both are just using a lookahead of length 1. Furthermore, LR 
parsers consider multiple potential parse trees at the same time, 
throwing away the wrong guesses, once the necessary tokens get 
available. LL parsers have to check each potential parse tree separately.