Today's programming challenge - How's your Range-Fu ?
via Digitalmars-d
digitalmars-d at puremagic.com
Sun Apr 19 02:54:58 PDT 2015
On Sunday, 19 April 2015 at 02:20:01 UTC, Shachar Shemesh wrote:
> U0065+U0301 rather than U00e9. Because of legacy systems, and
> because they would rather have the ISO-8859 code pages be 1:1
> mappings, rather than 1:n mappings, they introduced code points
> they really would rather do without.
That's probably right. Getting the world to adopt a new standard
wholesale is in fact a major feat, but there are also difficult
"semiotic" issues in encoding symbols, because different languages
view the same symbol differently (e.g. is "ä" a variant of "a", or
are they two distinct letters of the alphabet?).
Take "å": it can represent a unit (ångström, strictly the capital
"Å"), an "a" with a ring above it, or a distinct letter of the
alphabet. The letter "æ" can be seen either as a combination of
"ae" or as a distinct letter.
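To make the encoding side of this concrete, here is a minimal
sketch using Python's standard unicodedata module (the helper name
same_text is just made up for illustration). It only shows what
canonical normalization does with the code points mentioned above;
it says nothing about how a given language classifies the letters.

```python
import unicodedata

# U+0065 followed by U+0301 (combining acute) vs. precomposed U+00E9.
decomposed = "e\u0301"
precomposed = "\u00e9"

def same_text(a: str, b: str) -> bool:
    """Compare two strings after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# The raw code point sequences differ ...
raw_equal = decomposed == precomposed
# ... but they are canonically equivalent.
canonical_equal = same_text(decomposed, precomposed)

# U+212B ANGSTROM SIGN normalizes to the letter U+00C5, so the
# unit/letter distinction does not survive normalization.
angstrom_folded = unicodedata.normalize("NFC", "\u212b") == "\u00c5"

# U+00E6 ("æ") has no decomposition: even NFKD keeps it as a single
# code point rather than turning it into "ae".
ae_unchanged = unicodedata.normalize("NFKD", "\u00e6") == "\u00e6"
```

So normalization settles which code point sequences count as the
same text, but whether "æ" sorts as "ae" or as its own letter is
left to locale-specific collation rather than to the encoding.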
And we can expect languages, signs and practices to evolve over
time too. How can you normalize encodings without also normalizing
writing practice and the development of natural languages? That
would be beyond the mandate of a Unicode standards organization...