VLERange: a range in between BidirectionalRange and RandomAccessRange

spir denis.spir at gmail.com
Thu Jan 13 02:46:23 PST 2011


On 01/13/2011 01:51 AM, Michel Fortin wrote:
> On 2011-01-12 19:45:36 -0500, Michel Fortin <michel.fortin at michelf.com>
> said:
>
>> A funny exercise to make a fool of an algorithm working only with code
>> points would be to replace the word "fortune" in a text containing the
>> word "fortuné". If the last "é" is expressed as two code points, as
>> "e" followed by a combining acute accent (this: é), replacing
>> occurrences of "fortune" by "expose" would also replace "fortuné" with
>> "exposé" because the combining acute accent remains as the code point
>> following the word. Quite amusing, but it doesn't really make sense
>> that it works like that.
>>
>> In the case of "é", we're lucky enough to also have a pre-combined
>> character to encode it as a single code point, so encountering "é"
>> written as two code points is quite rare. But not all combinations of
>> marks and characters can be represented as a single code point. The
>> correct thing to do is to treat "é" (single code point) and "é" ("e" +
>> combining acute accent) as equivalent.
>
> Crap, I meant to send this as UTF-8 with combining characters in it, but
> my news client converted everything to ISO-8859-1.
>
> I'm not sure it'll work, but here's my second attempt at posting real
> combining marks:
>
> Single code point: é
> e with combining mark: é
> t with combining mark: t̂
> t with two combining marks: t̂̃

Works :-) But your first post worked for me as well: for instance, <<"é"
("e" + combining acute accent)>> was displayed with "é" as a single accented
letter. I guess your email client did not convert to ISO-8859-1 on
sending, but on reading (mine is set to UTF-8).
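
For illustration, here is a small sketch of the "fortune" example quoted
above. It is written in Python rather than D, uses only the standard
unicodedata module, and the helper name canonically_equal is a hypothetical
one made up for this sketch. It shows a replace that works on code points
catching the decomposed "fortuné" but not the precomposed one, and then
compares the two spellings after normalization.

```python
import unicodedata

# "fortuné" spelled with "e" followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "fortune\u0301"
# The same word spelled with the precomposed U+00E9.
precomposed = "fortun\u00e9"

# A replace that only sees code points finds "fortune" inside the
# decomposed spelling; the combining accent is left behind and attaches
# to the last letter of the replacement, giving "exposé".
naive_decomposed = decomposed.replace("fortune", "expose")

# The precomposed spelling contains no "fortune" substring, so the same
# call leaves it unchanged.
naive_precomposed = precomposed.replace("fortune", "expose")


def canonically_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The two spellings differ as code point sequences but compare equal
# once both are normalized.
differ_raw = decomposed != precomposed
equal_normalized = canonically_equal(decomposed, precomposed)

# "t" with a combining circumflex has no precomposed form, so NFC keeps
# it as two code points: normalization alone does not turn every
# user-perceived character into a single code point.
t_circumflex = unicodedata.normalize("NFC", "t\u0302")
```

Normalizing both sides only makes equivalent spellings compare equal; it
would not, by itself, stop a search for "fortune" from matching the start
of a decomposed "fortuné". That needs matching on whole base-plus-marks
units, which this sketch does not attempt.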

Denis
_________________
vita es estrany
spir.wikidot.com


