std.string.toUpper() for greek characters

Ali Çehreli acehreli at yahoo.com
Wed Oct 3 14:23:35 PDT 2012


On 10/03/2012 01:37 PM, Dmitry Olshansky wrote:
 > On 03-Oct-12 23:56, Ali Çehreli wrote:

 > If we are talking about the order then this is the way to go:
 > http://unicode.org/reports/tr10/

Thank you. I wasn't aware of that long read. :)

 >> struct Order
 >> {
 >> int base;
 >> int accent;
 >> int cased;
 >> }
 >>
 >> (Of course opCmp() cannot return that type. :( )
 >>
 >> The idea is that only the application knows what type of comparison
 >> makes sense.
 >
 > So instead the library does all of them? Ouch... I'm not sure I got the idea.

The idea was that there would be AlphabetChar and AlphabetString types
that know which writing system they belong to: AlphabetChar!en,
AlphabetChar!tr, etc.

For example, while ç is a distinct letter in the Turkish alphabet, it is
an accented form of c in most other Latin-based alphabets. That affects
the 'base' member above. On the other hand, â is an accented a in
Turkish as well as in the other Latin-based alphabets, so the 'base'
comparison of â and a would be equal in either case.
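
A rough sketch of how the 'base' member could differ per alphabet.
Everything here is made up for illustration: the Alphabet enum, the
Weights struct, the weights() table with its numbers, and compare() are
hypothetical names, not the experimental library's actual code. The
table also assumes precomposed code points for â and ç.

```d
// Hypothetical names throughout; toy data covering only a, â, c, C and ç.
enum Alphabet { en, tr }

struct Order
{
    int base;    // negative, zero or positive: result for the base letters
    int accent;  // same, for the accents
    int cased;   // same, for the letter case
}

// Per-letter weights within one alphabet (illustrative numbers only).
struct Weights
{
    int base;
    int accent;
    int cased;
}

Weights weights(Alphabet alphabet, dchar letter)
{
    immutable bool turkish = (alphabet == Alphabet.tr);

    switch (letter)
    {
        case 'a': return Weights(1, 0, 0);
        case 'â': return Weights(1, 1, 0);   // accented a in both alphabets
        case 'c': return Weights(3, 0, 0);
        case 'C': return Weights(3, 0, 1);
        // A letter of its own in Turkish, an accented c elsewhere.
        case 'ç': return turkish ? Weights(4, 0, 0) : Weights(3, 1, 0);
        default:  assert(false, "letter not in the toy table");
    }
}

int sign(int x)
{
    return (x > 0) - (x < 0);
}

// A free function, because opCmp() cannot return an Order.
Order compare(Alphabet alphabet, dchar lhs, dchar rhs)
{
    immutable l = weights(alphabet, lhs);
    immutable r = weights(alphabet, rhs);

    return Order(sign(l.base - r.base),
                 sign(l.accent - r.accent),
                 sign(l.cased - r.cased));
}

void main()
{
    // Same base letter in English; only the accent differs.
    assert(compare(Alphabet.en, 'c', 'ç') == Order(0, -1, 0));

    // Different base letters in Turkish.
    assert(compare(Alphabet.tr, 'c', 'ç') == Order(-1, 0, 0));

    // â is an accented a in both.
    assert(compare(Alphabet.en, 'a', 'â') == Order(0, -1, 0));
    assert(compare(Alphabet.tr, 'a', 'â') == Order(0, -1, 0));
}
```

The application can then look at only the members it cares about, e.g.
just 'base' for an accent-insensitive search.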

Collation takes the alphabet into account. Although AlphabetChar!en and
AlphabetChar!tr are not compatible types, a comparison between them can
be forced to use the collation information of any one alphabet.
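
Roughly like the following sketch, which uses strings because the effect
is easier to see there. The names (Alphabet, AlphabetString as written
here, primaryWeight(), compareAs()) and the weights are assumptions for
illustration, not the library's real interface, and only the base
letters are compared; ties are not broken by accent or case.

```d
import std.algorithm : cmp, map;

// Hypothetical names throughout; toy data covering only a, c, ç and z.
enum Alphabet { en, tr }

// The alphabet is part of the type, so the two instantiations below
// are distinct, incompatible types.
struct AlphabetString(Alphabet alphabet)
{
    dstring value;
}

static assert(!is(AlphabetString!(Alphabet.en) ==
                  AlphabetString!(Alphabet.tr)));

// Base-letter weight of a letter under the given collation.
int primaryWeight(Alphabet collation, dchar letter)
{
    switch (letter)
    {
        case 'a': return 1;
        case 'c': return 3;
        // Sorts with c in English, right after c in Turkish.
        case 'ç': return collation == Alphabet.tr ? 4 : 3;
        case 'z': return 100;
        default:  assert(false, "letter not in the toy table");
    }
}

// Compares strings of any two alphabets by the rules of 'collation',
// which the caller has to name explicitly.
int compareAs(Alphabet collation, Alphabet a, Alphabet b)
             (AlphabetString!a lhs, AlphabetString!b rhs)
{
    return cmp(lhs.value.map!(c => primaryWeight(collation, c)),
               rhs.value.map!(c => primaryWeight(collation, c)));
}

void main()
{
    auto english = AlphabetString!(Alphabet.en)("ça"d);
    auto turkish = AlphabetString!(Alphabet.tr)("cz"d);

    // English rules: ç counts as c, so a < z decides.
    assert(compareAs!(Alphabet.en)(english, turkish) < 0);

    // Turkish rules: ç comes after c, so the first letter decides.
    assert(compareAs!(Alphabet.tr)(english, turkish) > 0);
}
```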

So, that experimental library provides a number of alphabets, each with
its own collation order. I see now that it should have followed the
Unicode Collation Algorithm described in the document that you linked
above. I will do some reading. :)

Ali
