VLERange: a range in between BidirectionalRange and RandomAccessRange

spir denis.spir at gmail.com
Tue Jan 11 18:03:45 PST 2011


On 01/12/2011 02:22 AM, Andrei Alexandrescu wrote:
>> IIUC, for the case of text, VLERange helps abstracting from the annoying
>> fact that a codepoint is encoded as a variable number of code units.
>> What I meant is issues like:
>>
>> auto text = "a\u0302"d;
>> writeln(text); // "â"
>> auto range = VLERange(text);
>> // extracts characters correctly?
>> auto letter = range.front(); // "a" or "â"?
>> // case yes: compares correctly?
>> assert(range.front() == "â"); // fail or pass?
>
> You should try text.front right now, you might be surprised :o).

Hm, right now text.front incorrectly returns "a", as expected. And indeed
	assert ("â" == "a\u0302");
incorrectly fails, also as expected.
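Both would work with legacy character sets such as Latin-1, where each 
character has exactly one code. The issue is new with UCS (Unicode), where 
a single character may be represented by several code points; it requires 
an additional level of abstraction (on top of the one required by the 
code point / code unit distinction!)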
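
To make the two failures concrete, here is a minimal sketch of what that extra level could look like. It assumes the std.uni module of present-day Phobos (normalize, NFC, byGrapheme), which was not available when this thread was written; sameText and characterCount are hypothetical helper names used only for illustration, not part of any library or of the Text type linked below.

```d
import std.range : walkLength;
import std.uni : byGrapheme, normalize, NFC;

/// Hypothetical helper: compares two strings as text, i.e. after bringing
/// both to the same normalization form (NFC), instead of comparing their
/// raw code point sequences.
bool sameText(const(dchar)[] a, const(dchar)[] b)
{
    return normalize!NFC(a) == normalize!NFC(b);
}

/// Hypothetical helper: counts user-perceived characters (grapheme
/// clusters) rather than code points.
size_t characterCount(const(dchar)[] s)
{
    return s.byGrapheme.walkLength;
}

unittest
{
    dstring decomposed  = "a\u0302"d; // 'a' followed by U+0302 COMBINING CIRCUMFLEX ACCENT
    dstring precomposed = "\u00E2"d;  // U+00E2, the single code point form of "â"

    // At the code point level the two spellings of "â" differ.
    assert(decomposed != precomposed);
    assert(decomposed.walkLength == 2);

    // One level up, they are the same text and each is one character.
    assert(sameText(decomposed, precomposed));
    assert(characterCount(decomposed) == 1);
    assert(characterCount(precomposed) == 1);
}
```

This can be tried with something like `dmd -unittest -main -run file.d`. Normalizing on every comparison is only the simplest way to show the idea; a dedicated text type could instead normalize once at construction.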

You may have a look at 
https://bitbucket.org/denispir/denispir-d/src/5ec6fe1e1065/Text.html for 
a rough implementation of a type that does the right thing, and at 
https://bitbucket.org/denispir/denispir-d/src/5ec6fe1e1065/U%20missing%20level%20of%20abstraction 
for a (far too long) explanation.
(I have tried to mention those problems a dozen times already, but for 
some reason nearly everybody seems deaf to them.)


Denis
_________________
vita es estrany
spir.wikidot.com


