std.d.lexer requirements

Walter Bright newshound2 at digitalmars.com
Thu Aug 2 00:30:55 PDT 2012


On 8/1/2012 11:49 PM, Jacob Carlborg wrote:
> On 2012-08-02 02:10, Walter Bright wrote:
>
>> 1. It should accept as input an input range of UTF8. I feel it is a
>> mistake to templatize it for UTF16 and UTF32. Anyone desiring to feed it
>> UTF16 should use an 'adapter' range to convert the input to UTF8. (This
>> is what component programming is all about.)
>
> I'm no expert on ranges but won't that prevent slicing? Slicing is one of the
> main reasons for why the Tango XML parser is so amazingly fast.
>

You don't want to use slicing in the lexer. Every slice points into the source
buffer, and source files can be huge, so the slices end up spread all over
memory and keep all of that memory retained and never released. What you want
is a compact representation after lexing. Compactness also helps a lot with
memory caching.
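
To make the idea concrete, here is a minimal sketch of what a compact
representation could look like. It is only an illustration under assumptions,
not the proposed std.d.lexer design: the names Token, Interner and lex are
hypothetical, and the "lexer" only separates identifiers from single
characters. The point is that a token holds a small integer instead of a slice,
so nothing in the token array refers back to the source text.

```d
import std.array : appender;
import std.ascii : isAlpha, isAlphaNum, isWhite;
import std.range.primitives : ElementType, isInputRange;
import std.utf : byCodeUnit;

// Hypothetical compact token: 8 bytes, no pointer into the source buffer.
struct Token
{
    enum Kind : ubyte { identifier, other }
    Kind kind;
    uint id; // index into Interner.names for identifiers, else the char code
}

// Hypothetical identifier table: each distinct spelling is stored once.
struct Interner
{
    uint[string] ids;
    string[] names;

    uint intern(const(char)[] s)
    {
        if (auto p = s in ids)
            return *p;                // already known, nothing is copied
        auto owned = s.idup;          // copy only the first occurrence
        auto id = cast(uint) names.length;
        names ~= owned;
        ids[owned] = id;
        return id;
    }
}

// Accepts any input range of UTF-8 code units; no slicing is required.
Token[] lex(R)(R input, ref Interner table)
    if (isInputRange!R && is(ElementType!R : char))
{
    Token[] tokens;
    auto scratch = appender!(char[])(); // reused for every identifier

    while (!input.empty)
    {
        immutable char c = input.front;
        if (isWhite(c))
        {
            input.popFront();
        }
        else if (isAlpha(c) || c == '_')
        {
            scratch.clear();
            do
            {
                scratch.put(input.front);
                input.popFront();
            } while (!input.empty
                     && (isAlphaNum(input.front) || input.front == '_'));
            tokens ~= Token(Token.Kind.identifier, table.intern(scratch.data));
        }
        else
        {
            tokens ~= Token(Token.Kind.other, c);
            input.popFront();
        }
    }
    return tokens;
}
```

Calling lex("foo = bar + foo".byCodeUnit, table) with an empty Interner gives
five tokens, and the two occurrences of foo share one id. Once lexing is done
the source buffer can be released, since only the token array and the
identifier table are still referenced.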


