[Issue 1466] Spec claims maximal munch technique always works: not for "1..3"

Mon Sep 10 03:26:20 PDT 2007

Matti Niemenmaa wrote:
> Jascha Wetzel wrote:
>> Jascha Wetzel wrote:
>>> d-bugmail at puremagic.com wrote:
>>>>         // thinks it's [0 ... 1], no maximal munch taking place
>>>>         assert (Foo[0... 1] == 0);
>>>> }
>>> this *is* maximal munch taking place. because of the ".." lexeme,
>>> float literals are not lexemes. they are context free production rules
>>> consisting of multiple lexemes. therefore "0." consists of two lexemes
>>> and "..." wins the max munch over ".".
>> this was formulated poorly. float literals *may* be considered context
>> free to solve this problem...
> 
> Exactly. But the way I read the spec, float literals are considered tokens in
> and of themselves. Maybe I misunderstand, but it could use some clarification in
> that case: lex.html specifically says that the lexer splits the code into
> tokens, one of which is "0.", with maximal munch.
> 
> This isn't a /problem/ per se. In the extreme case, of course it is possible to
> parse D with maximal munch by considering every character a lexeme of its own
> and figuring everything else out thereafter. I'm just saying that the spec seems
> to contradict itself in saying that maximal munch should be used, and that it
> should match (among other things) "0." as one token. If you do things that way,
> it doesn't work the way DMD currently does it. If you match "0." as "0" and "."
> and construct a float literal from that later, it works.

i agree that it's not clear. one can argue, though, that if you consider
the lookahead part of the lexeme specification, the maximal munch
property remains intact: "0." can then only match if it is not followed
by another ".", so there is no contradiction with max munch. but that
should be stated in the spec, of course.
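
to make the lookahead argument concrete, here is a minimal sketch in
python (not D, and not how DMD is actually implemented). the token
names, the rule tables and the lex() function are all made up for
illustration, and the token set is cut down to integers, "0."-style
floats and the dot tokens. the only difference between the two rule
tables is the negative lookahead on the float rule.

```python
import re

# Hypothetical, heavily reduced token set for illustration only;
# this is neither DMD's lexer nor the grammar from lex.html.
COMMON_RULES = [
    ("int", re.compile(r"\d+")),
    ("dotdotdot", re.compile(r"\.\.\.")),
    ("dotdot", re.compile(r"\.\.")),
    ("dot", re.compile(r"\.")),
]

# Float rule as a plain lexeme: digits, a dot, optional digits.
NAIVE_RULES = [("float", re.compile(r"\d+\.\d*"))] + COMMON_RULES

# Same rule, but the lookahead is part of the lexeme specification:
# the dot must not be followed by another dot.
LOOKAHEAD_RULES = [("float", re.compile(r"\d+\.(?!\.)\d*"))] + COMMON_RULES


def lex(source, rules):
    """Maximal munch: at each position take the longest matching rule."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        best = None
        for kind, pattern in rules:
            m = pattern.match(source, pos)
            # Strictly longer wins; on a tie the earlier rule is kept.
            if m and (best is None or m.end() > best[1].end()):
                best = (kind, m)
        if best is None:
            raise ValueError(f"no token matches at offset {pos}")
        tokens.append((best[0], best[1].group()))
        pos = best[1].end()
    return tokens


print(lex("1..3", NAIVE_RULES))
# [('float', '1.'), ('dot', '.'), ('int', '3')]
print(lex("1..3", LOOKAHEAD_RULES))
# [('int', '1'), ('dotdot', '..'), ('int', '3')]
print(lex("0... 1", LOOKAHEAD_RULES))
# [('int', '0'), ('dotdotdot', '...'), ('int', '1')]
```

both tables are lexed with the same longest-match loop, so max munch
itself is untouched. only the definition of what counts as a float
lexeme changes, and "0." on its own still comes out as one float token.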