So... let's document dmd
Basile B. via Digitalmars-d
digitalmars-d at puremagic.com
Tue Apr 5 19:59:50 PDT 2016
On Tuesday, 5 April 2016 at 21:37:09 UTC, Walter Bright wrote:
> On 4/5/2016 6:47 AM, Basile B. wrote:
>> Also, lexing numbers doesn't need to be as accurate as in the
>> front-end of the compiler (especially if the highlighter
>> doesn't have a token type for the illegal "lexeme").
>
> That is an interesting design point. If I was doing a
> highlighter, I'd highlight in red tokens that the compiler
> would reject, meaning I'd do the accurate number lexing.
>
> Lexing numbers correctly is not trivial, but since the compiler
> lexer's implementation can be cut/pasted, it is trivial in
> practice.
Even when the most naive lexer sees a number and consumes
characters until a blank, a symbol or an operator, it's clear
that this can be done:
http://i.imgur.com/ehjps04.png
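
A rough sketch of that naive approach, in Python rather than D to
keep it short. It is only an illustration under simplifying
assumptions: integer literals only (no floats), a simplified
pattern rather than the compiler's exact rules, and the names
scan_number and classify_number are hypothetical.

```python
import re

# Simplified shapes of D integer literals (assumption: floats and
# other corner cases are deliberately left out of this sketch).
_VALID_INT = re.compile(
    r"""^(?:
        0[xX][0-9a-fA-F_]*[0-9a-fA-F][0-9a-fA-F_]*
      | 0[bB][01_]*[01][01_]*
      | 0
      | [1-9][0-9_]*
    )(?:L[uU]?|[uU]L?)?$""",
    re.VERBOSE,
)


def scan_number(text, start):
    """Consume from `start` until a blank, a symbol or an operator.

    Returns the consumed word and the index just past it.
    """
    end = start
    while end < len(text) and (text[end].isalnum() or text[end] == "_"):
        end += 1
    return text[start:end], end


def classify_number(word):
    """Return "number" for a well-formed literal, "illegal" otherwise."""
    return "number" if _VALID_INT.match(word) else "illegal"


if __name__ == "__main__":
    source = "0x1F_FF + 0b102"
    word, pos = scan_number(source, 0)
    print(word, classify_number(word))
    word, pos = scan_number(source, 10)
    print(word, classify_number(word))
```

The scan stays naive; the validity check runs afterwards on the
whole word, so a highlighter can paint the word red when the
check fails.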
Actually, numbers are the only part of the D lexer where errors
can be detected.
There are no possible syntax errors otherwise.
But one thing I forgot to say in my previous post is that lexing
can be "multi-pass". The D front-end does everything in a single
pass, for example it directly detects tokPlusPlus or tokXorEqu,
but a multi-pass lexer can actually work in 3 sub-phases:
1/ split words
2/ detect token families in the words: identifier, keyword,
operator, etc.
3/ specialize tokens: tokOp.data == "++" -> tokPlusPlus
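
The three sub-phases above could be sketched as follows, again in
Python for brevity. This is a minimal illustration, not how the
D front-end is written: tokOp, tokPlusPlus and tokXorEqu come
from the description above, while every other name (Token,
split_words, classify, specialize, the keyword set, the operator
characters) is a hypothetical choice made for the example.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str
    data: str


# Tiny illustrative tables (assumption: far from the full D sets).
KEYWORDS = {"if", "else", "while", "return", "int"}
OPERATOR_CHARS = set("+-*/^=<>!&|%~")
SPECIALIZED = {"++": "tokPlusPlus", "^=": "tokXorEqu"}

# A word is a run of word characters, a run of operator
# characters, or any other single non-blank character.
_WORD = re.compile(r"\w+|[+\-*/^=<>!&|%~]+|\S")


def split_words(source: str) -> List[str]:
    """Phase 1: split the source into words, dropping blanks."""
    return _WORD.findall(source)


def classify(word: str) -> Token:
    """Phase 2: assign each word to a token family."""
    if word in KEYWORDS:
        return Token("tokKeyword", word)
    if word[0].isdigit():
        return Token("tokNumber", word)
    if word[0].isalpha() or word[0] == "_":
        return Token("tokIdentifier", word)
    if all(c in OPERATOR_CHARS for c in word):
        return Token("tokOp", word)
    return Token("tokSymbol", word)


def specialize(tok: Token) -> Token:
    """Phase 3: tokOp.data == "++" -> tokPlusPlus, and so on."""
    if tok.kind == "tokOp" and tok.data in SPECIALIZED:
        return Token(SPECIALIZED[tok.data], tok.data)
    return tok


def lex(source: str) -> List[Token]:
    return [specialize(classify(w)) for w in split_words(source)]


if __name__ == "__main__":
    for tok in lex("int a ^= b++;"):
        print(tok.kind, tok.data)
```

A highlighter can stop after phase 2, since it only needs the
token family to pick a color; phase 3 matters only to a consumer
that needs the exact operator. Note that the word split here is
deliberately crude: a run such as "+++" stays a generic tokOp
instead of being split into "++" and "+".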