VLERange: a range in between BidirectionalRange and RandomAccessRange

spir denis.spir at gmail.com
Fri Jan 14 05:48:40 PST 2011


On 01/14/2011 07:44 AM, Nick Sabalausky wrote:
> "Andrei Alexandrescu"<SeeWebsiteForEmail at erdani.org>  wrote in message
> news:igoqrm$1n5r$1 at digitalmars.com...
>> On 1/13/11 10:26 PM, Nick Sabalausky wrote:
>> [snip]
>>> [ 'f', {u with the umlaut}, 'n', 'f' ]
>>>
>>> Or:
>>>
>>> [ 'f', 'u', {umlaut combining character}, 'n', 'f' ]
>>>
>>> Those *both* get rendered exactly the same, and both represent the same
>>> four-letter sequence. In the second example, the 'u' and the {umlaut
>>> combining character} combine to form one grapheme. The f's and n's just
>>> happen to be single-code-point graphemes.
>>>
>>> Note that while some characters exist in pre-combined form (such as the
>>> {u with the umlaut} above), legend has it there are others that can only
>>> be represented using a combining character.
>>>
>>> It's also my understanding, though I'm not certain, that sometimes
>>> multiple combining characters can be used together on the same "root"
>>> character.
>>
>> Thanks. One further question is: in the above example with u-with-umlaut,
>> there is one code point that corresponds to the entire combination. Are
>> there combinations that do not have a unique code point?
>>
>
> My understanding is "yes". At least that's what I've heard, and I've never
> heard any claims of "no". I don't know of any specific ones offhand, though.
> Actually, it might be possible to use any combining character with any old
> letter or number (like maybe a 7 with an umlaut), though I'm not certain.

The problem is then whether a font knows how to display it. My usual 
fonts (DejaVu series, pretty good with Unicode) show:
	7̈
meaning they do not know how to combine digits with diacritics (though
they handle other rather strange combinations well).
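
To make the question of "combinations without a unique code point"
concrete, here is a minimal sketch using Python's standard unicodedata
module (the choice of Python and the variable names are assumptions of
this example, not something from the thread). It checks that the two
spellings of "fünf" become equal after NFC normalization, while '7'
followed by the combining diaeresis stays as two code points because
Unicode defines no precomposed form for it.

```python
import unicodedata

# The two spellings quoted above: precomposed and decomposed u-umlaut.
precomposed = "f\u00fcnf"   # U+00FC LATIN SMALL LETTER U WITH DIAERESIS
decomposed = "fu\u0308nf"   # 'u' followed by U+0308 COMBINING DIAERESIS


def code_points(s):
    """Return the code points of s in U+XXXX notation."""
    return ["U+%04X" % ord(c) for c in s]


# As raw code point sequences the two strings differ...
raw_equal = precomposed == decomposed

# ...but they are canonically equivalent: NFC composes both to the same form.
nfc_equal = (unicodedata.normalize("NFC", precomposed)
             == unicodedata.normalize("NFC", decomposed))

# A digit plus a combining diaeresis has no precomposed code point,
# so NFC has nothing to compose it into and leaves both code points.
seven = "7\u0308"
seven_nfc = unicodedata.normalize("NFC", seven)

print(code_points(precomposed), code_points(decomposed))
print("raw equal:", raw_equal, "| equal after NFC:", nfc_equal)
print("7 + diaeresis after NFC:", code_points(seven_nfc))
```

Whether the result is drawn as a single glyph is a separate matter left
to the font and the rendering engine, which is the problem described
above.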

But one relevant advantage of decomposed forms is that when a font does
not know the character, it can still show at least the component marks,
here '7' & '¨', which is better than nothing for a user who knows the
writing system. If I try to display, for instance, a _precomposed_
syllable from a language my font does not know, I will get instead
either a little square with the code point written inside in tiny
digits, or a placeholder like an inverse-video "?".
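
As an illustration of a precomposed syllable and its components, the
sketch below decomposes a Hangul syllable with NFD. The choice of
Hangul, of this particular syllable, and of Python is an assumption made
for the example; the point is only that the decomposed form exposes
parts a renderer could fall back on, while the precomposed form is a
single opaque code point.

```python
import unicodedata

# One precomposed Hangul syllable (U+D55C), chosen only as an example.
syllable = "\ud55c"

# NFD splits it into its conjoining jamo: a leading consonant, a vowel
# and a trailing consonant.
parts = unicodedata.normalize("NFD", syllable)
part_code_points = ["U+%04X" % ord(c) for c in parts]
part_names = [unicodedata.name(c) for c in parts]

# NFC recomposes the jamo into the original single code point.
recomposed = unicodedata.normalize("NFC", parts)

print(part_code_points)
for name in part_names:
    print(name)
print("round trip ok:", recomposed == syllable)
```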


denis
_________________
vita es estrany
spir.wikidot.com


