VLERange: a range in between BidirectionalRange and RandomAccessRange

spir denis.spir at gmail.com
Mon Jan 17 08:28:45 PST 2011


On 01/14/2011 04:50 PM, Michel Fortin wrote:
>> This might be a good time to see whether we need to address graphemes
>> systematically. Could you please post a few links that would educate
>> me and others in the mysteries of combining characters?
>
> As usual, Wikipedia offers a good summary and a couple of references.
> Here's the part about combining characters:
> <http://en.wikipedia.org/wiki/Combining_character>.
>
> There's basically four ranges of code points which are combining:
> - Combining Diacritical Marks (0300–036F)
> - Combining Diacritical Marks Supplement (1DC0–1DFF)
> - Combining Diacritical Marks for Symbols (20D0–20FF)
> - Combining Half Marks (FE20–FE2F)
>
> A code point followed by one or more code points in these ranges is
> conceptually a single character (a grapheme).

Unfortunately, things are complicated by _prepend_ combining marks, which
occur in a code sequence _before_ the base character they attach to.
The Unicode algorithm is described here:
http://unicode.org/reports/tr29/ section 3 (human-readable ;-). See
especially the first table in section 3.1.
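
To make the limitation concrete, here is a minimal sketch in Python of the
simple rule quoted above (a code point followed by code points from the four
combining blocks forms one grapheme). The names COMBINING_BLOCKS, is_combining
and naive_graphemes are hypothetical, invented for illustration; this is not
the TR29 algorithm and it knows nothing about prepend marks.

```python
# Naive grapheme grouping based only on the four combining blocks
# listed in the quoted message. Hypothetical helper names; this is
# NOT the full segmentation algorithm from UAX #29.
COMBINING_BLOCKS = (
    (0x0300, 0x036F),  # Combining Diacritical Marks
    (0x1DC0, 0x1DFF),  # Combining Diacritical Marks Supplement
    (0x20D0, 0x20FF),  # Combining Diacritical Marks for Symbols
    (0xFE20, 0xFE2F),  # Combining Half Marks
)


def is_combining(ch):
    """Return True if ch falls in one of the four blocks above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in COMBINING_BLOCKS)


def naive_graphemes(text):
    """Split text into clusters: a code point plus any following marks."""
    clusters = []
    for ch in text:
        if clusters and is_combining(ch):
            # A combining mark attaches to the cluster before it.
            clusters[-1] += ch
        else:
            # Anything else starts a new cluster. This is exactly where
            # a prepend mark goes wrong: it is split from the base that
            # follows it.
            clusters.append(ch)
    return clusters
```

With this sketch, naive_graphemes("e\u0301a") groups the acute accent with
the "e". A mark that belongs to the character _after_ it would instead end up
in a cluster of its own, which is why the rules in TR29 section 3 are needed
for a correct implementation.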

Denis
_________________
vita es estrany
spir.wikidot.com


