Turkish 'I's can't D either
Daniel Keep
daniel.keep.lists at gmail.com
Tue Aug 25 02:30:41 PDT 2009
Ali Cehreli wrote:
> Walter Bright Wrote:
>
>> with details of the Turkish language because I
>> have no idea how it works.
>
> It is a very interesting story. The Turkish 'i's have caused lots of trouble, even hardcoded conditionals in at least the early Java libraries that checked whether the locale was Turkish.
>
> Even Unicode is in a strange position here, because the code points for 'I' and 'i' each have two possible case mappings, depending on the language. (I don't know whether there are other alphabets in such a situation.)
>
>> tolower really is only for ASCII. But the toUniLower should work right
>> with Turkish, though I don't know what right is for that case.
>
> The current implementation of toUniLower() favors the ASCII lowercasing of 'I' over the Turkish one (similar with toUniUpper() for i):
>
> dchar toUniLower(dchar c)
> {
>     if (c >= 'A' && c <= 'Z')
>     {
>         c += 32;
>     }
>     // ...
>
> An application would need a separate set of toUniLower() and friends to be able to work in Turkish.
>
> I don't think the issue is big enough for Phobos to tackle with a solution similar to CaseSensitive:
>
> toUniLower('I', (Alphabet).tr);
>
> Instead, a wrapper around toUniLower() should be used...
>
> Ali
To me, it seems that the issue is that the library routines don't have
enough context to be able to correctly work out how to lowercase a string.
Making it locale-dependent seems like a bad idea. Say I'm processing
some internal data that uses string names; case is irrelevant, so I
lowercase them and look them up in a hash table.
If there's an 'I' in a name but the hash table key was stored with 'i',
the program will break when run in a Turkish locale.
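A minimal sketch of that failure, assuming a hypothetical locale-aware
lowercasing routine. lowerForLocale and lowerKey are made-up names for
illustration, not Phobos functions, and only the basic Latin letters are
handled:

```d
// Hypothetical locale-sensitive lowercasing (illustration only).
dchar lowerForLocale(dchar c, bool turkish)
{
    enum dchar dotlessI = '\u0131'; // LATIN SMALL LETTER DOTLESS I
    if (turkish && c == 'I')
        return dotlessI;
    if (c >= 'A' && c <= 'Z')
        return cast(dchar)(c + 32);
    return c;
}

// Lowercases a whole key with the helper above.
dstring lowerKey(dstring s, bool turkish)
{
    dstring result;
    foreach (dchar c; s)
        result ~= lowerForLocale(c, turkish);
    return result;
}

unittest
{
    // Internal table whose keys were stored in ASCII lowercase.
    int[dstring] table = ["id"d: 1];

    // Locale-independent lowercasing finds the key.
    assert((lowerKey("ID"d, false) in table) !is null);

    // Under the Turkish rule "ID" becomes "ıd", so the lookup misses.
    assert((lowerKey("ID"d, true) in table) is null);
}
```

The second lookup fails even though nothing about the data is Turkish,
which is the argument against tying internal keys to the user's locale.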
One thing I think the type system should be used more for is attaching
more semantic information to data. So maybe the solution is to
introduce something like a Text type that also stores the language of
the text. Then the library methods WILL have the right context to know
how to act.
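A rough sketch of what such a type might look like. Everything here
(Language, Text, its toLower method) is hypothetical and not an existing
Phobos API, and it only covers the basic Latin letters plus the two
Turkish capital 'I's:

```d
// Hypothetical language tag; a real design would need far more entries.
enum Language { neutral, turkish }

// Hypothetical text type that carries its language along with the data.
struct Text
{
    dstring content;
    Language language;

    // Lowercases according to the language stored in this value,
    // not according to the locale the program happens to run in.
    Text toLower() const
    {
        enum dchar dotlessI = '\u0131';   // LATIN SMALL LETTER DOTLESS I
        enum dchar dottedCapI = '\u0130'; // LATIN CAPITAL LETTER I WITH DOT ABOVE

        dstring result;
        foreach (dchar c; content)
        {
            dchar lowered = c;
            if (language == Language.turkish && c == 'I')
                lowered = dotlessI;
            else if (language == Language.turkish && c == dottedCapI)
                lowered = 'i';
            else if (c >= 'A' && c <= 'Z')
                lowered = cast(dchar)(c + 32);
            result ~= lowered;
        }
        return Text(result, language);
    }
}

unittest
{
    // Internal identifiers stay language-neutral, so "ID" lowercases to "id".
    assert(Text("ID"d, Language.neutral).toLower().content == "id"d);

    // Text tagged as Turkish gets the dotless i instead.
    assert(Text("ID"d, Language.turkish).toLower().content == "\u0131d"d);
}
```

With something like this, internal keys are tagged neutral and behave
the same on every machine, while user-facing Turkish text still gets
the right casing.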
Just a thought.