VLERange: a range in between BidirectionalRange and RandomAccessRange

Ali Çehreli acehreli at yahoo.com
Mon Jan 17 21:11:00 PST 2011


Thanks to all who have contributed; I am also following this thread with 
great interest. :)

Michel Fortin wrote:
 > I mean, a grapheme is a slice of a string, can have multiple code points
 > (like a string), can be appended the same way as a string, can be
 > composed or decomposed using canonical normalization or compatibility
 > normalization (like a string), and should be sorted, uppercased, and
 > lowercased according to Unicode rules (like a string). Basically, a
 > grapheme is just a string that happens to contain only one grapheme.

I would like to stress the fact that Unicode knows nothing about 
sorting, uppercasing, or lowercasing.

Those operations are tied to the alphabet (or writing system) that a 
certain grapheme happens to belong to at a given time. For example, we 
cannot uppercase the letter i without knowing what alphabet we are 
dealing with. There are two possibilities: I and İ (I with dot above).
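
To make that concrete, here is a minimal Python sketch. The table 
SPECIAL_UPPER and the function upper_for are hypothetical names made up 
for this illustration, and the table lists only the dotted/dotless i rule 
used by Turkish and Azerbaijani; it is not a complete casing implementation.

```python
# Hypothetical tailoring table: language tag -> per-character overrides.
# Only the Turkish/Azerbaijani i rule is listed, as an illustration.
SPECIAL_UPPER = {
    "tr": {"i": "\u0130", "\u0131": "I"},  # i -> İ, ı -> I
    "az": {"i": "\u0130", "\u0131": "I"},
}


def upper_for(text, lang):
    """Uppercase text, applying the overrides for lang first."""
    overrides = SPECIAL_UPPER.get(lang, {})
    # Fall back to the language-neutral mapping for every other character.
    return "".join(overrides.get(ch, ch.upper()) for ch in text)


print(upper_for("i", "en"))  # I
print(upper_for("i", "tr"))  # İ
```

The same input character yields two different results, and nothing in the 
character itself tells us which one is wanted; the caller has to supply 
the language.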

It is the same issue with sorting.
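
A similar sketch for sorting, again in Python and again with made-up 
names (TURKISH_ALPHABET, RANK, turkish_key). It assumes lowercase input 
and simply ranks letters by their position in the Turkish alphabet, which 
is far less than what a real collation library does:

```python
# The 29 lowercase letters of the Turkish alphabet, in alphabet order.
TURKISH_ALPHABET = "abcçdefgğhıijklmnoöprsştuüvyz"
RANK = {ch: i for i, ch in enumerate(TURKISH_ALPHABET)}


def turkish_key(word):
    """Sort key that ranks letters by Turkish alphabet position."""
    # Characters outside the alphabet sort after all letters, by code point.
    return [(RANK.get(ch, len(RANK)), ch) for ch in word]


words = ["dağ", "çay", "cam"]
print(sorted(words))                   # ['cam', 'dağ', 'çay']
print(sorted(words, key=turkish_key))  # ['cam', 'çay', 'dağ']
```

Plain code point order puts ç after d, while the Turkish alphabet places 
it right after c.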

Ali

