Questions about Unicode, particularly Japanese

bearophile bearophileHUGS at lycos.com
Tue Jun 8 12:40:14 PDT 2010


Nick Sabalausky:

> 3. A text editor, for instance, is intended to treat something like (U+305D, 
> U+3099) as a single character, right?

Languages are a product of biology, and in biology it's usually hard to draw sharp boundaries between things; any definition has to be flexible and a little fuzzy if it is to capture enough of reality to be useful. So I think the answer to this question is yes.
When you iterate with D foreach over a string that contains that pair, what is the right way to split it into chars? Returning a single "char" 8 bytes long (that is, a string of two 32-bit chars) that contains both code points would not be wrong (though it is probably not what people expect) :-)
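
To make the question concrete, here is a small sketch of the three ways that pair can be counted. The helper names (countCodeUnits, countCodePoints, countGraphemes) are made up for illustration, and the last one assumes a Phobos release recent enough to provide std.uni.byGrapheme, so take it as one possible reading and not as what foreach itself does:

```d
import std.range : walkLength;
import std.stdio : writeln;
import std.uni : byGrapheme; // assumption: a Phobos version that ships byGrapheme

// foreach with char walks the UTF-8 code units of the string.
size_t countCodeUnits(string s)
{
    size_t n;
    foreach (char c; s)
        n++;
    return n;
}

// foreach with dchar decodes the string and walks its code points.
size_t countCodePoints(string s)
{
    size_t n;
    foreach (dchar c; s)
        n++;
    return n;
}

// byGrapheme groups a base character with its combining marks.
size_t countGraphemes(string s)
{
    return s.byGrapheme.walkLength;
}

void main()
{
    // U+305D (hiragana SO) followed by U+3099 (combining voiced sound mark)
    string s = "\u305D\u3099";

    writeln(countCodeUnits(s));  // 6: each code point takes 3 bytes in UTF-8
    writeln(countCodePoints(s)); // 2: the two dchars, 8 bytes as a dchar[2]
    writeln(countGraphemes(s));  // 1: what a text editor would show as one character
}
```

Only the last count matches the "single character" a text editor is expected to present; the foreach forms stop at code units or code points.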

Bye,
bearophile

