VLERange: a range in between BidirectionalRange and RandomAccessRange
Michel Fortin
michel.fortin at michelf.com
Wed Jan 12 16:51:52 PST 2011
On 2011-01-12 19:45:36 -0500, Michel Fortin <michel.fortin at michelf.com> said:
> A funny exercise to make a fool of an algorithm working only with code
> points would be to replace the word "fortune" in a text containing the
> word "fortuné". If the last "é" is expressed as two code points, as "e"
> followed by a combining acute accent (this: é), replacing occurrences
> of "fortune" with "expose" would also turn "fortuné" into "exposé"
> because the combining acute accent remains as the code point following
> the word. Quite amusing, but it doesn't really make sense that it works
> like that.
>
> In the case of "é", we're lucky enough to also have a pre-combined
> character to encode it as a single code point, so encountering "é"
> written as two code points is quite rare. But not all combinations of
> marks and characters can be represented as a single code point. The
> correct thing to do is to treat "é" (single code point) and "é" ("e" +
> combining acute accent) as equivalent.
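
For illustration, here is a minimal Python sketch of the replacement pitfall and of the equivalence described in the quoted text. Python is used only for convenience (it is not the language discussed on this list), and the sample sentence and variable names are made up for the example. Normalizing to NFC happens to avoid the problem for "é" because a pre-combined code point exists; it would not help for combinations that have no pre-combined form.

```python
import unicodedata

# "fortuné" spelled with a decomposed é: "e" followed by
# U+0301 COMBINING ACUTE ACCENT. The sentence itself is a made-up sample.
text = "Il est fortune\u0301."

# A replacement that works on code points matches the "fortune" prefix
# and leaves the combining accent behind, now attached to "expose".
naive = text.replace("fortune", "expose")

# After NFC normalization the é becomes the single code point U+00E9,
# so "fortune" no longer matches inside "fortuné".
normalized_first = unicodedata.normalize("NFC", text).replace("fortune", "expose")

precomposed = "\u00e9"   # é as a single code point
decomposed = "e\u0301"   # "e" + combining acute accent

# The two spellings differ as code point sequences...
same_raw = precomposed == decomposed
# ...but compare equal once both are brought to the same normalization form.
same_nfc = (unicodedata.normalize("NFC", precomposed)
            == unicodedata.normalize("NFC", decomposed))

print(repr(naive), repr(normalized_first), same_raw, same_nfc)
```
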
Crap, I meant to send this as UTF-8 with combining characters in it,
but my news client converted everything to ISO-8859-1.
I'm not sure it'll work, but here's my second attempt at posting real
combining marks:
Single code point: é
e with combining mark: é
t with combining mark: t̂
t with two combining marks: t̂̃
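
To check what actually arrived, the code points of each sample can be listed. The following is a rough Python sketch, not part of the original post; it assumes the marks on "t" are a combining circumflex (U+0302) and a combining tilde (U+0303), which is only my reading of the rendered text, and the `describe` helper is a hypothetical name.

```python
import unicodedata

# The four samples, written with escapes so the encoding of this
# message cannot alter them. The marks on "t" are assumed, see above.
samples = {
    "single code point": "\u00e9",
    "e with combining mark": "e\u0301",
    "t with combining mark": "t\u0302",
    "t with two combining marks": "t\u0302\u0303",
}

def describe(s):
    """Return one 'U+XXXX NAME' entry per code point in s."""
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in s]

# Number of code points in each sample as written.
lengths = {label: len(s) for label, s in samples.items()}

# Number of code points after NFC: "e" + accent collapses to one,
# while "t" + circumflex has no pre-combined form and stays as is.
nfc_lengths = {label: len(unicodedata.normalize("NFC", s))
               for label, s in samples.items()}

for label, s in samples.items():
    print(label, lengths[label], nfc_lengths[label], describe(s))
```
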
--
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/