VLERange: a range in between BidirectionalRange and RandomAccessRange

Michel Fortin michel.fortin at michelf.com
Tue Jan 11 11:13:45 PST 2011


On 2011-01-11 11:36:54 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail at erdani.org> said:

> On 1/11/11 4:41 AM, Michel Fortin wrote:
>> For instance, say we have a conversion range taking a Unicode string and
>> converting it to ISO Latin 1. The best (lossy) conversion for "œ" is
>> "oe" (one character to two characters); in this case 'front' could
>> simply return "oe" (two characters) in one iteration, with stepSize
>> being the size of the "œ" code point. In the same conversion process,
>> encountering "e" followed by a combining "´" would return pre-combined
>> character "é" (two characters to one character).
> 
> In the design as I thought of it, the effective length of one logical 
> element is one or more representation units. My understanding is that 
> you are referring to a fractional number of representation units for 
> one logical element.

Your understanding is correct.

I think both cases (one becomes many & many becomes one) are important 
and must be supported. Your proposal only deals with the 
many-becomes-one case.

I proposed returning arrays so we can deal with the one-becomes-many 
case ("œ" becoming "oe"). Another idea would be to introduce 
"substeps". When checking for the next character, in addition to 
determining its step length you could also determine the number of 
substeps in it. "œ" would have two substeps, "o" and "e", and when 
there is no longer any substep you move to the next step.
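
To make the two ideas concrete, here is a rough sketch in Python (not D,
and not part of either proposal). All names in it (FALLBACKS,
to_latin1_steps, to_latin1_substeps) are hypothetical, and the fallback
table holds only the "œ" example from above. The first generator returns
a whole string per step along with the number of input code points it
consumed; the second flattens each step into substeps.

```python
import unicodedata

# Hypothetical lossy fallback table; only the example from the discussion.
FALLBACKS = {"\u0153": "oe"}  # "œ" -> "oe"


def _to_latin1(ch):
    """Map one code point to Latin-1 text, using '?' when nothing fits."""
    if ch in FALLBACKS:
        return FALLBACKS[ch]
    return ch if ord(ch) <= 0xFF else "?"


def to_latin1_steps(text):
    """Yield (output, step_size) pairs.

    output is a string of one or more Latin-1 characters; step_size is
    the number of input code points consumed to produce it.
    """
    i = 0
    while i < len(text):
        # Many-becomes-one: absorb any combining marks that follow.
        j = i + 1
        while j < len(text) and unicodedata.combining(text[j]):
            j += 1
        cluster = unicodedata.normalize("NFC", text[i:j])
        # One-becomes-many: a single code point may expand to several.
        yield "".join(_to_latin1(c) for c in cluster), j - i
        i = j


def to_latin1_substeps(text):
    """Same conversion, but exposing one output character per substep."""
    for output, _step_size in to_latin1_steps(text):
        for ch in output:
            yield ch
```

Either way the caller never sees a fractional step: the step size stays
a count of whole input units, and the one-to-many expansion is hidden
inside the step.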

All this said, I think how a range handles these cases should stay an 
implementation detail, as this leaves room for a variety of strategies. 
It also means that your proposed 'stepSize' and 'backstepSize' need to 
be implementation details too (because they won't make sense for the 
one-to-many case), so they can't really be part of a standard VLE 
interface.

As far as I know, all we really need to expose to algorithms is whether 
a range has elements of variable length, because this has an impact on 
your indexing capabilities. The rest seems unnecessary to me, or am I 
missing some use cases?
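
As a rough illustration of how little an algorithm might need, here is
a Python sketch (again not D, and not an existing API). The flag name
has_variable_length_elements, the two range classes and nth are all
hypothetical. The algorithm checks the single flag and either indexes
directly or walks the range from the front.

```python
from itertools import islice


class FixedRange:
    """Every element occupies exactly one unit, so indexing is direct."""
    has_variable_length_elements = False

    def __init__(self, units):
        self.units = units

    def __getitem__(self, index):
        return self.units[index]

    def __iter__(self):
        return iter(self.units)


class Utf8Range:
    """Elements are code points stored as 1 to 4 bytes (valid UTF-8 assumed)."""
    has_variable_length_elements = True

    def __init__(self, data):
        self.data = data

    def __iter__(self):
        i = 0
        while i < len(self.data):
            lead = self.data[i]
            # The lead byte tells how many bytes this element occupies.
            size = 1 if lead < 0x80 else 2 if lead < 0xE0 else 3 if lead < 0xF0 else 4
            yield self.data[i:i + size].decode("utf-8")
            i += size


def nth(r, n):
    """Return the n-th logical element, picking a strategy from the flag."""
    if r.has_variable_length_elements:
        # No constant-time indexing by element: walk from the front.
        return next(islice(iter(r), n, None))
    return r[n]
```

Note that nth never asks how long a step is; the range keeps that
knowledge to itself.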

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/


