[Issue 1466] New: Spec claims maximal munch technique always works: not for "1..3"

Jascha Wetzel "[firstname]" at mainia.de
Sat Sep 1 10:54:07 PDT 2007


BCS wrote:
> Reply to d-bugmail at puremagic.com,
> 
>> http://d.puremagic.com/issues/show_bug.cgi?id=1466
>>
>> Summary: Spec claims maximal munch technique always works: not for "1..3"
>> Product: D
>> Version: 1.020
>> Platform: All
>> URL: http://digitalmars.com/d/1.0/lex.html
>> OS/Version: All
>> Status: NEW
>> Keywords: spec
>> Severity: minor
>> Priority: P3
>> Component: www.digitalmars.com
>> AssignedTo: bugzilla at digitalmars.com
>> ReportedBy: deewiant at gmail.com
>> A snippet from http://digitalmars.com/d/1.0/lex.html:
>>
>> "The source text is split into tokens using the maximal munch
>> technique, i.e., the lexical analyzer tries to make the longest token
>> it can."
>>
>> Relevant parts of the grammar:
>>
>> Token:
>>     FloatLiteral
>>     ..
>> FloatLiteral:
>>     Float
>> Float:
>>     DecimalFloat
>> DecimalFloat:
>>     DecimalDigits .
>>     . Decimal
>> DecimalDigits:
>>     DecimalDigit
>> DecimalDigit:
>>     NonZeroDigit
>> Decimal:
>>     NonZeroDigit
>>
>> Based on the above, if a lexer encounters "1..3", for instance in a
>> slice: "foo[1..3]", it should, using the maximal munch technique, make
>> the longest possible token from "1..3": this is the Float "1.". Next,
>> it should come up with the Float ".3".
>>
>> Of course, this isn't currently happening, and would be problematic if
>> it did. But, according to the grammar, that's what should happen,
>> unless I'm missing something.
>>
>> Either some exception needs to be made, or the "DecimalDigits ."
>> possibility needs to be removed from the grammar and the compiler.
>>
> 
> or make it "DecimalDigits . [^.]", where the [^.] part is non-consuming.
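
To illustrate the non-consuming check suggested above, here is a minimal sketch in Python. It assumes the rule can be rendered as a regular expression with a negative lookahead; the names TRAILING_DOT_FLOAT and float_prefix are made up for this example and are not taken from the spec or from any compiler. One difference to note: [^.] requires some character to follow, whereas the lookahead used here also accepts end of input.

```python
import re

# Hypothetical rendering of "DecimalDigits . [^.]": digits, a dot, and a
# negative lookahead that inspects the next character without consuming it.
TRAILING_DOT_FLOAT = re.compile(r"\d+\.(?!\.)")

def float_prefix(text):
    """Return the trailing-dot float at the start of text, or None."""
    m = TRAILING_DOT_FLOAT.match(text)
    return m.group() if m else None

print(float_prefix("1..3"))   # None: the dot is followed by another dot
print(float_prefix("1.+2"))   # 1.
```

With this rule, "1." is no longer a candidate at the start of "1..3", so the lexer is free to produce the integer "1" followed by "..".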

It is possible to parse D using a maximal munch lexer; see the seatd
grammar for an example. It's a matter of exactly which lexemes you
choose. In this particular case, the float lexemes need to be split
so that floats with a trailing dot are not matched by a single
lexeme.
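
A minimal sketch of the idea in Python, under stated assumptions: munch, SPEC_LEXEMES and SPLIT_LEXEMES are names invented for this example, the patterns are heavily simplified (no exponents, suffixes or hex floats), and this is not the seatd grammar itself. The first table treats "1." as a single float lexeme, as the quoted spec grammar reads; the second drops that case, so a trailing-dot float would have to be assembled later from an integer and a dot.

```python
import re

def munch(text, lexemes):
    """Maximal munch: at each offset, take the longest match of any lexeme."""
    tokens, pos = [], 0
    while pos < len(text):
        best = None
        for kind, pattern in lexemes:
            m = re.compile(pattern).match(text, pos)
            # Keep the longest non-empty match; earlier entries win ties.
            if m and m.end() > pos and (best is None or m.end() > best[1].end()):
                best = (kind, m)
        if best is None:
            raise ValueError(f"no lexeme matches at offset {pos}")
        tokens.append((best[0], best[1].group()))
        pos = best[1].end()
    return tokens

COMMON = [
    ("Integer", r"\d+"),
    ("DotDot", r"\.\."),
    ("Dot", r"\."),
    ("Punct", r"[\[\]]"),
    ("Identifier", r"[A-Za-z_]\w*"),
]

# As the spec grammar reads: digits plus a trailing dot form one float.
SPEC_LEXEMES = [("Float", r"\d+\.\d*|\.\d+")] + COMMON

# Split variant (assumption): no single lexeme covers digits plus a trailing dot.
SPLIT_LEXEMES = [("Float", r"\d+\.\d+|\.\d+")] + COMMON

print([t for _, t in munch("foo[1..3]", SPEC_LEXEMES)])
# ['foo', '[', '1.', '.3', ']']
print([t for _, t in munch("foo[1..3]", SPLIT_LEXEMES)])
# ['foo', '[', '1', '..', '3', ']']
```

Both runs use the same longest-match loop; only the choice of lexemes changes the result for the slice.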

