std.d.lexer: pre-voting review / discussion

H. S. Teoh hsteoh at quickfur.ath.cx
Thu Sep 12 07:08:22 PDT 2013


On Thu, Sep 12, 2013 at 07:00:09AM +0200, deadalnix wrote:
> On Thursday, 12 September 2013 at 03:37:55 UTC, H. S. Teoh wrote:
> >I still don't understand why backtracking is necessary in the first
> >place. I would've thought a modern parser should be well able to
> >encode seen tokens in its state so that backtracking is never
> >necessary.  Or does D grammar have tricky bits that cannot be handled
> >this way, that I'm unaware of?
> >
> 
> The problem is that it can cause a exponential (and I literally mean
> exponential here) amount of complexity.
> 
> In some cases, the complexity is manageable, but in other that don't
> make any sense (it has to be noted that even full lexing don't make
> any sens here). For instance :
> 
> int foo()() {}
>        ^
> 
> When you are at the caret position, you don't know if you face a
> function declaration or a template declaration. You could go for
> some ambiguous parsing, but each template argument can itself be a
> type, an expression or a symbol.
[...]

This can be handled by using an intermediate grammar rule. Reduce all
(...) into an intermediate type, say ArgList, so the reduction
happens something like this:

	int  foo   ()      ()      {}
	Type Ident ArgList ArgList ^

Then have the rule:

	CompileTimeArgs ::= ArgList
	RuntimeArgs ::= ArgList
	TemplateDecl ::= Type Ident CompileTimeArgs RuntimeArgs '{' ...
	FuncDecl ::= Type Ident RuntimeArgs '{' ...

So first, all (...) gets parsed to ArgList, but it's not yet fixed
whether they are compile-time arguments or runtime arguments. It's only
after you see the next '(' or '{' that you decide whether ArgList should
reduce to CompileTimeArgs or RuntimeArgs.

ArgList itself, of course, will accept all possible parameters (both
runtime and compile-time): types, expressions, symbols. Then when you
reduce it to RuntimeArgs, you reject stuff that can't be interpreted as
parameter declarations.


T

-- 
There are two ways to write error-free programs; only the third one works.


More information about the Digitalmars-d mailing list