VLERange: a range in between BidirectionalRange and RandomAccessRange

Mon Jan 17 23:58:02 PST 2011

On 18/01/11 16:46, Andrei Alexandrescu wrote:
> On 1/17/11 9:48 PM, Michel Fortin wrote:
>> On 2011-01-17 17:54:04 -0500, Michel Fortin <michel.fortin at michelf.com>
>> said:
>>
>>> More seriously, you have four choices:
>>>
>>> 1. code unit
>>> 2. code point
>>> 3. grapheme
>>> 4. require the client to state explicitly which kind of 'character' he
>>> wants; 'character' being an overloaded word, it's reasonable to ask
>>> for disambiguation.
>>
>> This makes me think of what I did with my XML parser after you made code
>> points the element type for strings. Basically, the parser now uses
>> 'front' and 'popFront' whenever it needs to get the next code point, but
>> most of the time it uses 'frontUnit' and 'popFrontUnit' instead (which I
>> had to add) when testing for or skipping an ASCII character is
>> sufficient. This way I avoid a lot of unnecessary decoding of code
>> points.
>>
>> For this to work, the same range must let you skip either a unit or a
>> code point. If I were using a separate range with a call to toDchar or
>> toCodeUnit (or toGrapheme if I needed to check graphemes), it wouldn't
>> have helped much because the new range would essentially become a new
>> slice independent of the original, so you can't interleave "I want to
>> advance by one unit" with "I want to advance by one code point".
>>
>> So perhaps the best interface for strings would be to provide multiple
>> range-like interfaces that you can use at the level you want.
>>
>> I'm not sure if this is a good idea, but I thought I should at least
>> share my experience.
>
> Very insightful. Thanks for sharing. Code it up and make a solid proposal!
>
> Andrei

How does this differ from Steve Schveighoffer's string_t, minus the
indexing and slicing of code points, and plus a bidirectional grapheme range?
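
To make the comparison concrete, here is a rough sketch of the kind of range Michel describes: a single cursor that can advance either by code unit or by code point. This is only my reading of his description, not his parser's code. The type name UnitOrPointRange and the helper firstNonSpace are made up for illustration. frontUnit and popFrontUnit are the names from his message and are not part of Phobos. The only library call used is std.utf.decode.

```d
import std.utf : decode;

/// Hypothetical type, for illustration only: one cursor over UTF-8 text
/// that exposes both a code-unit view and a code-point view.
struct UnitOrPointRange
{
    string data;

    @property bool empty() { return data.length == 0; }

    // Code-unit level: no decoding, enough for testing or skipping ASCII.
    @property char frontUnit() { return data[0]; }
    void popFrontUnit() { data = data[1 .. $]; }

    // Code-point level: decodes one UTF-8 sequence starting at the cursor.
    @property dchar front()
    {
        size_t i = 0;
        return decode(data, i);
    }

    void popFront()
    {
        size_t i = 0;
        decode(data, i);      // advances i past one whole code point
        data = data[i .. $];
    }
}

/// Skips leading ASCII spaces one unit at a time, then decodes the next
/// code point. Assumes the input contains at least one non-space character.
dchar firstNonSpace(string s)
{
    auto r = UnitOrPointRange(s);
    while (!r.empty && r.frontUnit == ' ')
        r.popFrontUnit();     // cheap: no decoding for an ASCII test
    return r.front;           // full decode only where it is needed
}
```

Because both views share the same underlying slice, a popFrontUnit call can be followed by a popFront call on the same object, which is the interleaving a separate converted range would not allow. A grapheme-level view would presumably sit on top of this in the same way, but I have not sketched it.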