Turkish 'I's can't D either

Ali Cehreli acehreli at yahoo.com
Tue Aug 25 01:11:59 PDT 2009


Walter Bright Wrote:

> with details of the Turkish language because I 
> have no idea how it works.

It is a very interesting story. The Turkish 'i's have caused lots of trouble, to the point that at least the early Java libraries had hardcoded conditionals that checked whether the locale was Turkish.

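Even Unicode is in a strange position, because two code points each have two different case mappings depending on the language: 'I' (U+0049) lowercases to 'i' in most languages but to the dotless 'ı' (U+0131) in Turkish, and 'i' (U+0069) uppercases to 'I' in most languages but to the dotted 'İ' (U+0130) in Turkish. (I don't know whether there are other alphabets in such a situation.)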

> tolower really is only for ASCII. But the toUniLower should work right 
> with Turkish, though I don't know what right is for that case.

The current implementation of toUniLower() favors the ASCII lowercasing of 'I' over the Turkish one (and toUniUpper() does the same for 'i'):

dchar toUniLower(dchar c)
{
    if (c >= 'A' && c <= 'Z')
    {
        c += 32;
    }
    // ... (rest of the function omitted)
}

An application would need a separate set of toUniLower() and friends to be able to work in Turkish.

I don't think the issue is big enough for Phobos to tackle with a solution similar to CaseSensitive:

   toUniLower('I', (Alphabet).tr);

Instead, a wrapper around toUniLower() should be used...
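
A minimal sketch of what such a wrapper might look like. The names toLowerTr() and toUpperTr() are hypothetical and not part of Phobos, and the sketch assumes std.uni.toLower() and std.uni.toUpper(), which are the names current Phobos uses for toUniLower() and toUniUpper(). Only the four Turkish 'i' code points are special-cased; everything else is forwarded unchanged:

```d
import std.uni : toLower, toUpper;

// Hypothetical Turkish-aware wrapper (not a Phobos function).
// Handles the two Turkish special cases, then defers to std.uni.
dchar toLowerTr(dchar c)
{
    switch (c)
    {
        case 'I':      return '\u0131'; // 'I' -> dotless 'ı'
        case '\u0130': return 'i';      // dotted 'İ' -> 'i'
        default:       return toLower(c);
    }
}

// Hypothetical counterpart for uppercasing.
dchar toUpperTr(dchar c)
{
    switch (c)
    {
        case 'i':      return '\u0130'; // 'i' -> dotted 'İ'
        case '\u0131': return 'I';      // dotless 'ı' -> 'I'
        default:       return toUpper(c);
    }
}

void main()
{
    // The Turkish pairs round-trip through the wrappers.
    assert(toUpperTr(toLowerTr('I')) == 'I');
    assert(toLowerTr(toUpperTr('i')) == 'i');
    // Other letters behave as with the plain std.uni functions.
    assert(toLowerTr('A') == 'a');
}
```

The application, not the library, decides when to call the Turkish versions, so Phobos itself would not need an alphabet parameter.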

Ali



