std.d.lexer requirements

Thu Aug 2 04:52:15 PDT 2012

On 02/08/2012 09:30, Walter Bright wrote:
> On 8/1/2012 11:49 PM, Jacob Carlborg wrote:
>> On 2012-08-02 02:10, Walter Bright wrote:
>>
>>> 1. It should accept as input an input range of UTF8. I feel it is a
>>> mistake to templatize it for UTF16 and UTF32. Anyone desiring to feed it
>>> UTF16 should use an 'adapter' range to convert the input to UTF8. (This
>>> is what component programming is all about.)
>>
>> I'm no expert on ranges but won't that prevent slicing? Slicing is one of
>> the main reasons for why the Tango XML parser is so amazingly fast.
>>
>
> You don't want to use slicing on the lexer. The reason is that your
> slices will be spread all over memory, as source files can be huge, and
> all that memory will be retained and never released. What you want is a
> compact representation after lexing. Compactness also helps a lot with
> memory caching.
>

Tokens are not kept in memory. You usually consume them for further
processing and throw them away.

It isn't an issue.
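
To make the point above concrete, here is a minimal sketch of a lexer whose tokens are slices of the source and are consumed one at a time. The names Token, Lexer and countWords are hypothetical, invented for this illustration; they are not the proposed std.d.lexer API, and the tokenizing rule (ASCII alphanumeric runs versus single characters) is a deliberate simplification.

```d
import std.ascii : isAlphaNum, isWhite;
import std.stdio : writeln;

/// Hypothetical token type: `text` is a slice of the source, not a copy.
struct Token
{
    string text;
}

/// Hypothetical lazy lexer, exposed as an input range of tokens.
/// ASCII-only for brevity; a real lexer would decode UTF-8 properly.
struct Lexer
{
    private string src;    // remaining, not yet lexed input
    private Token current; // token returned by front
    private bool done;

    this(string source)
    {
        src = source;
        popFront(); // prime the first token
    }

    @property bool empty() const { return done; }
    @property Token front() { return current; }

    void popFront()
    {
        // Skip whitespace between tokens.
        while (src.length && isWhite(src[0]))
            src = src[1 .. $];
        if (src.length == 0)
        {
            done = true;
            return;
        }
        // A token is a run of alphanumerics, or one other character.
        size_t n = 1;
        if (isAlphaNum(src[0]))
            while (n < src.length && isAlphaNum(src[n]))
                ++n;
        current = Token(src[0 .. n]); // slice of the source, no allocation
        src = src[n .. $];
    }
}

/// Consumes tokens one at a time; only the running count outlives the loop.
size_t countWords(string source)
{
    size_t count;
    foreach (tok; Lexer(source))
        if (isAlphaNum(tok.text[0]))
            ++count;
    return count;
}

void main()
{
    writeln(countWords("int x = y + 42;"));
}
```

In this sketch no token outlives one loop iteration, so nothing keeps the source buffer alive once lexing finishes. The concern in the quoted message applies when a consumer stores tok.text somewhere long-lived (a symbol table, for instance): the slice then keeps the whole source buffer reachable, and copying the text with .idup before storing it is one way to avoid that.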