VLERange: a range in between BidirectionalRange and RandomAccessRange

Nick Sabalausky a at a.a
Thu Jan 13 23:00:02 PST 2011


"Nick Sabalausky" <a at a.a> wrote in message 
news:igori7$1ovh$1 at digitalmars.com...
> "Andrei Alexandrescu" <SeeWebsiteForEmail at erdani.org> wrote in message 
> news:igoqrm$1n5r$1 at digitalmars.com...
>> On 1/13/11 10:26 PM, Nick Sabalausky wrote:
>> [snip]
>>> [ 'f', {u with the umlaut}, 'n', 'f' ]
>>>
>>> Or:
>>>
>>> [ 'f', 'u', {umlaut combining character}, 'n', 'f' ]
>>>
>>> Those *both* get rendered exactly the same, and both represent the same
>>> four-letter sequence. In the second example, the 'u' and the {umlaut
>>> combining character} combine to form one grapheme. The f's and n's just
>>> happen to be single-code-point graphemes.
>>>
>>> Note that while some characters exist in pre-combined form (such as the
>>> {u with the umlaut} above), legend has it there are others that can only
>>> be represented using a combining character.
>>>
>>> It's also my understanding, though I'm not certain, that sometimes
>>> multiple combining characters can be used together on the same "root"
>>> character.
>>
>> Thanks. One further question is: in the above example with u-with-umlaut, 
>> there is one code point that corresponds to the entire combination. Are 
>> there combinations that do not have a unique code point?
>>
>
> My understanding is "yes". At least that's what I've heard, and I've never 
> heard any claims of "no". I don't know of any specific ones offhand, 
> though. Actually, it might be possible to use any combining character with 
> any old letter or number (like maybe a 7 with an umlaut), though I'm not 
> certain.
>
> FWIW, the Wikipedia article might help, or at least link to other things 
> that might help: http://en.wikipedia.org/wiki/Combining_character
>
> Michel or spir might have better links though.
>

Heh, as if that weren't bad enough, there are also digraphs which, from what I 
can tell, can be single code points that represent more than one 
glyph/character/grapheme:

http://en.wikipedia.org/wiki/Digraph_(orthography)#Digraphs_in_Unicode
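
To make the digraph case concrete, here is a minimal sketch using Python's 
standard unicodedata module (Python only because it ships the Unicode tables; 
this says nothing about how D or Phobos should handle it). Picking U+01C6 as 
the example digraph is my own choice, not something from the article above:

```python
import unicodedata

# U+01C6 is a single code point that displays as the two letters "dž".
digraph = "\u01c6"

# Look up its official Unicode character name.
name = unicodedata.name(digraph)

# Canonical normalization (NFC) leaves the digraph alone...
canonical = unicodedata.normalize("NFC", digraph)

# ...but compatibility normalization (NFKC) splits it into separate letters.
split = unicodedata.normalize("NFKC", digraph)

print(name)
print(len(digraph), len(canonical), len(split))
print([f"U+{ord(c):04X}" for c in split])
```

So one code point can stand for two letters, and whether it gets split up 
depends on which normalization form is applied.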

This page may be helpful too:
http://en.wikipedia.org/wiki/Precomposed_character
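
As a rough illustration of the precomposed vs. combining forms quoted above, 
here is a small sketch using Python's standard unicodedata module (again, 
Python just for convenience). The variable names are mine, and the "7 with an 
umlaut" case is the one I guessed at earlier:

```python
import unicodedata

precomposed = "f\u00fcnf"   # u-with-umlaut as one code point (U+00FC)
decomposed = "fu\u0308nf"   # 'u' followed by U+0308 COMBINING DIAERESIS

# Same rendered text, but different code point counts and unequal raw strings.
lengths = (len(precomposed), len(decomposed))
raw_equal = precomposed == decomposed

# Canonical normalization converts one form into the other.
nfc_equal = unicodedata.normalize("NFC", decomposed) == precomposed
nfd_equal = unicodedata.normalize("NFD", precomposed) == decomposed

# '7' plus a combining diaeresis has no precomposed code point, so NFC
# has nothing to compose it into and it stays as two code points.
seven = unicodedata.normalize("NFC", "7\u0308")

print(lengths, raw_equal, nfc_equal, nfd_equal, len(seven))
```

The last case suggests the answer to Andrei's question is indeed "yes": some 
combinations only exist as a base character plus a combining character.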




