Turkish 'I's can't D either

Daniel Keep daniel.keep.lists at gmail.com
Tue Aug 25 02:30:41 PDT 2009


Ali Cehreli wrote:
> Walter Bright Wrote:
> 
>> with details of the Turkish language because I 
>> have no idea how it works.
> 
> It is a very interesting story. The Turkish 'i's have caused a lot of trouble, including hardcoded conditionals in at least the early Java libraries that checked whether the locale was Turkish.
> 
> Even Unicode is in an awkward position, because 'I' and 'i' each have two possible case mappings: in Turkish, 'I' lowercases to the dotless 'ı' (U+0131) and 'i' uppercases to the dotted 'İ' (U+0130). (I don't know whether there are other alphabets in such a situation.)
> 
>> tolower really is only for ASCII. But the toUniLower should work right 
>> with Turkish, though I don't know what right is for that case.
> 
> The current implementation of toUniLower() favors the ASCII lowercasing of 'I' over the Turkish one (and similarly toUniUpper() for 'i'):
> 
> dchar toUniLower(dchar c)
> {
>     if (c >= 'A' && c <= 'Z')
>     {
>         c += 32;
>     }
> 
> An application would need a separate set of toUniLower() and friends to be able to work in Turkish.
> 
> I don't think the issue is big enough for Phobos to tackle with a solution similar to CaseSensitive:
> 
>    toUniLower('I', (Alphabet).tr);
> 
> Instead, a wrapper around toUniLower() should be used...
> 
> Ali

To me, it seems that the issue is that the library routines don't have
enough context to be able to correctly work out how to lowercase a string.

Having it locale-dependent seems like a bad idea; let's say I'm
processing some internal data that uses string names; case is irrelevant
so I lowercase them and look them up in a hash table.

If a name has an 'I' in it, but the hash table's keys are stored with
'i', the lookup will fail when the program runs in a Turkish locale,
because 'I' lowercases to the dotless 'ı' there.
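
To make that failure concrete, here is a minimal sketch. Note that
lowerFor is a hypothetical helper written for this post, not a Phobos
routine; it only handles ASCII plus the Turkish 'I', and the `turkish`
flag stands in for whatever a locale-dependent library would read from
the environment.

```d
// Dotless lowercase i, the Turkish lowercase of 'I'.
enum dchar dotlessLowerI = '\u0131';

// Hypothetical helper (not part of Phobos): lowercases ASCII letters, but
// maps 'I' to dotless 'ı' when `turkish` is true, mimicking what a
// locale-dependent routine would do under a Turkish locale.
dstring lowerFor(dstring s, bool turkish)
{
    dstring result;
    foreach (dchar c; s)
    {
        if (turkish && c == 'I')
            result ~= dotlessLowerI;
        else if (c >= 'A' && c <= 'Z')
            result ~= cast(dchar)(c + 32);
        else
            result ~= c;
    }
    return result;
}

void main()
{
    // Internal table whose keys were stored in lowercase ASCII.
    int[dstring] table = ["width"d: 1];

    // Non-Turkish behaviour: "WIDTH" becomes "width" and is found.
    assert((lowerFor("WIDTH"d, false) in table) !is null);

    // Turkish behaviour: "WIDTH" becomes "wıdth" and the lookup fails.
    assert((lowerFor("WIDTH"d, true) in table) is null);
}
```

The data and the table are identical in both cases; only the ambient
setting changes, which is why I'd rather not have it be implicit.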

One thing I think the type system should be used more for is attaching
more semantic information to data.  So maybe the solution is to
introduce something like a Text type that also stores the language of
the text.  Then the library methods WILL have the right context to know
how to act.
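
A rough sketch of what I mean follows. Everything in it (Language, Text,
and this toLower method) is made up for illustration and is not an
existing or proposed Phobos API; it only covers ASCII and the two
Turkish special cases, where a real version would need full Unicode
case tables.

```d
// All names below are hypothetical; nothing like this exists in Phobos.
enum Language { unspecified, english, turkish }

enum dchar dotlessLowerI = '\u0131'; // ı
enum dchar dottedUpperI  = '\u0130'; // İ
enum dchar plainLowerI   = 'i';

struct Text
{
    dstring data;
    Language lang;

    // Lowercase using the rules of the language stored with the text,
    // rather than whatever locale the process happens to run in.
    Text toLower() const
    {
        dstring result;
        foreach (dchar c; data)
        {
            if (lang == Language.turkish && c == 'I')
                result ~= dotlessLowerI;
            else if (lang == Language.turkish && c == dottedUpperI)
                result ~= plainLowerI;
            else if (c >= 'A' && c <= 'Z')
                result ~= cast(dchar)(c + 32);
            else
                result ~= c;
        }
        return Text(result, lang);
    }
}

void main()
{
    // An internal identifier: no language attached, so ASCII rules apply.
    auto key = Text("TITLE"d, Language.unspecified);
    assert(key.toLower().data == "title"d);

    // A Turkish word: the text itself says which rules to use.
    auto word = Text("ILIK"d, Language.turkish);
    assert(word.toLower().data == "\u0131l\u0131k"d); // "ılık"
}
```

The internal key lowercases the same way no matter where the program
runs, and the Turkish word still gets its dotless 'ı'.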

Just a thought.


