VLERange: a range in between BidirectionalRange and RandomAccessRange

Michel Fortin michel.fortin at michelf.com
Wed Jan 12 16:51:52 PST 2011


On 2011-01-12 19:45:36 -0500, Michel Fortin <michel.fortin at michelf.com> said:

> A funny exercise to make a fool of an algorithm working only with code 
> points would be to replace the word "fortune" in a text containing the 
> word "fortuné". If the last "é" is expressed as two code points, as "e" 
> followed by a combining acute accent (this: é), replacing occurrences 
> of "fortune" with "expose" would also replace "fortuné" with "exposé" 
> because the combining acute accent remains as the code point following 
> the word. Quite amusing, but it doesn't really make sense that it works 
> like that.
> 
> In the case of "é", we're lucky enough to also have a pre-combined 
> character to encode it as a single code point, so encountering "é" 
> written as two code points is quite rare. But not all combinations of 
> marks and characters can be represented as a single code point. The 
> correct thing to do is to treat "é" (single code point) and "é" ("e" + 
> combining acute accent) as equivalent.

Crap, I meant to send this as UTF-8 with combining characters in it, 
but my news client converted everything to ISO-8859-1.

I'm not sure it'll work, but here's my second attempt at posting real 
combining marks:

	Single code point: é
	e with combining mark: é
	t with combining mark: t̂
	t with two combining marks: t̂̃
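
To make the "fortune"/"fortuné" exercise concrete, here is a minimal sketch in Python (not D, and not a proposal for the range API). It uses only the standard unicodedata module. The helper names naive_replace and mark_aware_replace are made up for illustration. The "followed by a combining mark" test relies on the canonical combining class being nonzero, which is only an approximation of real grapheme segmentation.

```python
import unicodedata

ACUTE = "\u0301"       # COMBINING ACUTE ACCENT
CIRCUMFLEX = "\u0302"  # COMBINING CIRCUMFLEX ACCENT
TILDE = "\u0303"       # COMBINING TILDE


def naive_replace(text, old, new):
    """Code-point-level replace that knows nothing about combining marks."""
    return text.replace(old, new)


def mark_aware_replace(text, old, new):
    """Hypothetical helper: replace `old` only when the match is not
    immediately followed by a combining mark (approximated here by a
    nonzero canonical combining class)."""
    out = []
    i = 0
    while i < len(text):
        j = i + len(old)
        followed_by_mark = j < len(text) and unicodedata.combining(text[j]) != 0
        if text.startswith(old, i) and not followed_by_mark:
            out.append(new)
            i = j
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


decomposed = "fortune" + ACUTE   # "fortuné" as "e" + combining acute accent
precomposed = "fortun\u00e9"     # "fortuné" with the single code point U+00E9

# The naive version turns "fortuné" into "exposé": the accent is left behind.
print(naive_replace(decomposed, "fortune", "expose") == "expose" + ACUTE)

# The mark-aware version leaves "fortuné" alone but still replaces "fortune".
print(mark_aware_replace(decomposed, "fortune", "expose") == decomposed)
print(mark_aware_replace("fortune", "fortune", "expose") == "expose")

# Normalizing to NFC makes the two spellings of "fortuné" compare equal.
print(unicodedata.normalize("NFC", decomposed) == precomposed)

# Code point counts for the four samples listed above.
samples = ["\u00e9", "e" + ACUTE, "t" + CIRCUMFLEX, "t" + CIRCUMFLEX + TILDE]
print([len(s) for s in samples])
```

Normalization only helps where a pre-combined character exists: "t" with a circumflex has none, so it stays two code points even after NFC, and any algorithm that should treat it as one character has to group the marks with their base itself.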

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/


