std.d.lexer: pre-voting review / discussion

Dominikus Dittes Scherkl Dominikus.Scherkl at continental-corporation.com
Thu Sep 26 08:41:50 PDT 2013


Hello.

I'm not sure if this belongs here, but I think there is a bug at
the very start of the Lexer chapter:

Is U+001A really meant to end the source file?
According to the Unicode specification this is the "substitute"
control character, similar in spirit to the newer replacement
character U+FFFD. Or is it simply a typo, and U+0019 was intended
to end the source (this would fit, as it means "End of Medium")?

I don't know whether anybody has ever ended a source file that
way, or whether it was ever tested.

More important to me: none of the space characters beyond ASCII
are considered whitespace (starting with U+00A0 NBSP, the spaces
of various widths U+2000 to U+200B, up to the more exotic U+202F,
U+205F, U+2060, U+3000 and the famous U+FEFF). Why?
OK, the set is much larger, but for end-of-line the Unicode
variants (U+2028 and U+2029) are included. This seems
inconsistent to me.
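
To make the comparison concrete, here is a minimal sketch that looks up
the Unicode general category of each code point mentioned above. It is
written in Python and uses only the standard unicodedata module; it
says nothing about how std.d.lexer or the D specification actually
classify these characters. The names CANDIDATES, classify and
space_separators are made up for this illustration, and the results
depend on the Unicode database bundled with the interpreter.

```python
import unicodedata

# Code points named in the message: the non-ASCII "space" candidates,
# followed by the two Unicode line terminators U+2028 and U+2029.
CANDIDATES = [
    0x00A0,
    *range(0x2000, 0x200C),  # U+2000 .. U+200B inclusive
    0x202F, 0x205F, 0x2060, 0x3000, 0xFEFF,
    0x2028, 0x2029,
]


def classify(cp):
    """Return the Unicode general category (e.g. 'Zs') of a code point."""
    return unicodedata.category(chr(cp))


def space_separators(cps):
    """Keep only the code points whose category is Zs (space separator)."""
    return [cp for cp in cps if classify(cp) == "Zs"]


if __name__ == "__main__":
    for cp in CANDIDATES:
        name = unicodedata.name(chr(cp), "<unnamed>")
        print(f"U+{cp:04X}  {classify(cp)}  {name}")
```

One thing the lookup shows is that the list is not uniform: U+200B,
U+2060 and U+FEFF are format characters (category Cf) rather than
space separators (Zs), while U+2028 and U+2029 have their own
categories, Zl and Zp. A lexer that wanted to accept Unicode spaces
would therefore have to decide which of these categories to treat as
whitespace.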

