std.string.toUpper() for greek characters

Ali Çehreli acehreli at yahoo.com
Wed Oct 3 10:10:36 PDT 2012


On 10/03/2012 03:56 AM, Minas wrote:
> Currently, toUpper() (and probably toLower()) does not handle Greek
> characters correctly. I fixed toUpper() by adding a separate function for
> Greek characters:
>
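> // called if (c >= 0x387 && c <= 0x3CE)
> dchar toUpperGreek(dchar c)
> {
>     if (c >= 'α' && c <= 'ω')
>     {
>         // Letters in this range sit 32 code points above their
>         // uppercase forms; final sigma 'ς' is the one exception.
>         if (c == 'ς')
>             c = 'Σ';
>         else
>             c -= 32;
>     }
>     else
>     {
>         // Accented letters do not follow the fixed offset.
>         dchar[dchar] map;
>         map['ά'] = 'Ά';
>         map['έ'] = 'Έ';
>         map['ή'] = 'Ή';
>         map['ί'] = 'Ί';
>         map['ϊ'] = 'Ϊ';
>         map['ΐ'] = 'Ϊ';
>         map['ό'] = 'Ό';
>         map['ύ'] = 'Ύ';
>         map['ϋ'] = 'Ϋ';
>         map['ΰ'] = 'Ϋ';
>         map['ώ'] = 'Ώ';
>
>         // Leave anything without an entry (e.g. letters that are already
>         // uppercase) unchanged instead of failing on a missing key.
>         if (auto p = c in map)
>             c = *p;
>     }
>
>     return c;
> }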
>
> Then, in toUpper()
> {
> ....
>     if (c >= 0x387 && c <= 0x3CE)
>         c = toUpperGreek(c)...
> ///
> }
>
> Do you think it should stay like that, or should I copy-paste it into the
> body of toUpper()?
>
> I'm going to fix toLower() as well and make a pull request.

I don't want to detract from the usefulness of these functions, but
toupper and tolower have been two of the strangest functions in
computing history. It is amazing that they are still accepted, because
they are useful only in very limited situations, and those situations are
becoming rarer as more and more systems support Unicode.

Two quick examples:

1) How should this string be capitalized in a scientific article?

   "Anti-obesity effects of α-lipoic acid"

I don't think the α in there should be upper-cased.
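
To make the point concrete, here is a minimal sketch in D. It assumes a
Phobos release whose std.uni.toUpper covers Greek letters; shoutTitle is a
hypothetical helper used only for illustration. A context-free conversion
cannot tell that α is a symbol rather than part of a word, so it should
turn it into a capital alpha (U+0391), which looks like a Latin A.

```d
import std.stdio : writeln;
import std.uni : toUpper;

// Hypothetical helper for illustration: upper-cases a title with no
// knowledge of which characters are symbols rather than words.
string shoutTitle(string title)
{
    return title.toUpper;
}

void main()
{
    // The α is converted along with every other letter in the title.
    writeln(shoutTitle("Anti-obesity effects of α-lipoic acid"));
}
```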

2) How should this name be capitalized in a list of names?

   "Ali"

It depends entirely on the writing system of the string itself, not
even on the current locale. (I know of two uppercase forms that can be
considered correct: "ALI" and "ALİ".)
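
The two results can be sketched in D as follows. toUpperTurkish is a
hypothetical helper, not part of Phobos, and it covers only the single rule
needed here: in Turkish, the uppercase of 'i' is the dotted 'İ' (U+0130).
The caller has to know which rule applies to the string; nothing in the
string itself says so.

```d
import std.stdio : writeln;
import std.uni : toUpper;

// Hypothetical helper, not part of Phobos: upper-cases a string using
// the Turkish rule that 'i' becomes the dotted capital 'İ' (U+0130).
string toUpperTurkish(string s)
{
    enum dchar dottedCapitalI = '\u0130';
    string result;
    foreach (dchar c; s)
    {
        if (c == 'i')
            result ~= dottedCapitalI;
        else
            result ~= toUpper(c); // default Unicode mapping for the rest
    }
    return result;
}

void main()
{
    writeln("Ali".toUpper);         // language-independent mapping
    writeln(toUpperTurkish("Ali")); // Turkish mapping
}
```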

I agree that your toUpper() and toLower() will be useful in many 
contexts but will necessarily do the wrong thing in others.

Ali

