std.d.lexer requirements

Thu Aug 2 03:07:32 PDT 2012

On 8/2/2012 2:27 AM, Piotr Szturmaj wrote:
> Walter Bright wrote:
>> 1. It should accept as input an input range of UTF8. I feel it is a
>> mistake to templatize it for UTF16 and UTF32. Anyone desiring to feed it
>> UTF16 should use an 'adapter' range to convert the input to UTF8. (This
>> is what component programming is all about.)
>
> Why is it a mistake?

Because the lexer is large, and it would need a lot of special-case code
inserted here and there to make that work.
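
For illustration, the 'adapter' range mentioned above might look roughly like
the sketch below. It assumes `std.utf.byChar` as found in current Phobos, which
lazily re-encodes a range of characters as UTF8 code units. The helper name
`toUtf8Units` is hypothetical and is not part of any proposed lexer API.

```d
import std.array : array;
import std.utf : byChar;

// Hypothetical helper: re-encode UTF16 source text as UTF8 code units.
// byChar is lazy; array is called here only so the result is easy to inspect.
// A real lexer would consume the lazy range directly.
auto toUtf8Units(wstring source)
{
    return source.byChar.array;
}

void main()
{
    // ASCII text re-encodes to the same characters.
    assert(toUtf8Units("int x = 1;"w) == "int x = 1;");

    // U+00E9 is one UTF16 code unit but two UTF8 code units.
    assert(toUtf8Units("\u00E9"w).length == 2);
}
```

With an adapter like this, the lexer itself only ever sees UTF8 code units, so
none of its internals need to know about the caller's original encoding.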

> I think the lexer should parse any UTF range and return token strings of the
> matching type. That is, it should provide strings for UTF8 input,
> wstrings for UTF16 input, and so on.

Why? I've never seen any UTF16 or UTF32 D source in the wild.

Besides, if it is not templated, it doesn't need to be recompiled by every
user of it; it can exist as object code in the library.