More lexer questions

Timon Gehr timon.gehr at gmx.ch
Sat Feb 11 12:16:29 PST 2012


On 02/11/2012 07:42 PM, H. S. Teoh wrote:
> According to the online specs, the lexer tries to tokenize by maximal
> matching (except for one exception in the case of ranges like "1..2").
> The fact that this exception is stated seems to indicate that it's
> permitted to have two literals side-by-side without an intervening
> space.
>
> So does that mean "1e2" should be tokenized as (float lit: 1e2) and

Yes.

> "1f2" should be tokenized as (int lit: 1)(identifier: f2)?
>

No. Maximal munch applies:

(float lit: 1f)(int lit: 2)
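
As a rough sketch of what maximal munch means here (Python, not DMD's actual lexer; the token patterns and the name tokenize are simplified assumptions that cover only the cases in this thread), the lexer tries every pattern at the current position and keeps the longest match:

```python
import re

# Simplified, assumed token patterns. They are NOT the full D grammar and
# ignore suffixes, underscores, binary literals, the "1..2" exception, etc.
TOKEN_PATTERNS = [
    ("int lit", re.compile(r"0[xX][0-9a-fA-F]+")),           # hex integer
    ("float lit", re.compile(
        r"[0-9]+\.[0-9]+(?:[eE][+-]?[0-9]+)?[fFL]?"           # 1.5, 1.5e3f
        r"|[0-9]+[eE][+-]?[0-9]+[fFL]?"                       # 1e2
        r"|[0-9]+[fF]")),                                     # 1f
    ("int lit", re.compile(r"[0-9]+")),                       # decimal integer
    ("identifier", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("token", re.compile(r"\.")),
]

def tokenize(src):
    """Split src into (kind, text) pairs, always taking the longest match."""
    pos = 0
    tokens = []
    while pos < len(src):
        if src[pos].isspace():
            pos += 1
            continue
        best = None
        for kind, pattern in TOKEN_PATTERNS:
            m = pattern.match(src, pos)
            # Keep the longest match; on a tie the earlier pattern wins.
            if m and (best is None or m.end() > best[1].end()):
                best = (kind, m)
        if best is None:
            raise ValueError("unexpected character at offset %d" % pos)
        tokens.append((best[0], best[1].group()))
        pos = best[1].end()
    return tokens
```

With these patterns, tokenize("1f2") gives (float lit: 1f)(int lit: 2), and tokenize("0x123abcdefg") gives (int lit: 0x123abcdef)(identifier: g).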


> Or, for that matter, "123abcdefg" should be tokenized as (int lit:
> 123)(identifier: abcdefg)

Yes.

> whereas "0x123abcdefg" should be tokenized as
> (int lit: 0x123abcdef)(identifier: g)?
>
> Or worse, if we still allow octals, "0129" should be tokenized as (octal
> lit: 012)(int lit: 9)?
>

DMD treats 0129 as an error. Therefore, the best way to handle integer
literals with a leading 0 is to just parse them as decimal and to reject
them if their value exceeds 7.
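
A minimal sketch of that rule (Python; the function name check_leading_zero_literal is hypothetical, and the input is assumed to be a run of decimal digits that the lexer has already scanned as one integer literal):

```python
def check_leading_zero_literal(text):
    """Parse text as decimal; reject leading-zero literals whose value exceeds 7."""
    value = int(text, 10)  # base 10 accepts leading zeros, so "0129" parses as 129
    has_leading_zero = len(text) > 1 and text[0] == "0"
    if has_leading_zero and value > 7:
        raise ValueError("literal with leading 0 exceeds 7: " + text)
    return value
```

Under this rule "07" is accepted with value 7, while "0129" is rejected as a whole instead of being split into 012 and 9.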

> Or do we expect that any integer/float literal will always span the
> longest string that has characters permitted in any numerical literal,
> and then after the fact the lexer will give an error if the string
> cannot be interpreted as a legal literal? IOW, "0129" will first be
> scanned in its entirety as a numerical literal, then afterwards the
> lexer decides that '9' doesn't belong in an octal so it throws an error
> (as opposed to maximally matching "012" as an octal literal followed by
> a decimal literal "9").  Or, for that matter, "0123xel.u123" will be


(int lit: 0123)(identifier: xel)(token: '.')(identifier: u123)

> scanned as a numerical literal (since all the characters in it occur in
> some kind of numerical literal), and then an error generated after the
> fact when the lexer realizes that this string isn't a legal numerical
> literal?
>
>
> T
>

No. As an example, processing the code that way would reject the
valid token string q{0123xel.u123}.
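
To illustrate why, here is a sketch of the scan-first, validate-afterwards strategy being ruled out (Python; the character set, the notion of a legal literal, and the name scan_then_validate are all illustrative assumptions, not taken from the D grammar):

```python
import re

# Assumed set of characters that can occur in some numeric literal.
NUMERIC_CHARS = re.compile(r"[0-9a-fA-FxXlLuU._]+")

# Deliberately simplified: only hex and decimal integers count as legal.
LEGAL_LITERAL = re.compile(r"(?:0[xX][0-9a-fA-F]+|[0-9]+)")

def scan_then_validate(src):
    """Grab the longest run of numeric-looking characters, then validate it."""
    m = NUMERIC_CHARS.match(src)
    if m is None:
        raise ValueError("no numeric literal at start of input")
    lump = m.group()
    if LEGAL_LITERAL.fullmatch(lump) is None:
        raise ValueError("illegal numeric literal: " + lump)
    return lump
```

Given "0123xel.u123", this strategy swallows the whole string as one lump and raises an error, so the contents of q{0123xel.u123} would never be seen as the four tokens listed above.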


