std.string.toUpper() for greek characters
Ali Çehreli
acehreli at yahoo.com
Wed Oct 3 14:23:35 PDT 2012
On 10/03/2012 01:37 PM, Dmitry Olshansky wrote:
> On 03-Oct-12 23:56, Ali Çehreli wrote:
> If we are talking about the order then this is the way to go:
> http://unicode.org/reports/tr10/
Thank you. I wasn't aware of that long read. :)
>> struct Order
>> {
>>     int base;
>>     int accent;
>>     int cased;
>> }
>>
>> (Of course opCmp() cannot return that type. :( )
>>
>> The idea is that only the application knows what type of comparison
>> makes sense.
>
> So instead the library does all of them? Ouch... I'm not sure I got the idea.
The idea was that there would be AlphabetChar and AlphabetString types
that knew which writing system they belonged to: AlphabetChar!en,
AlphabetChar!tr, etc.
For example, while the letter ç is a distinct letter in the Turkish
alphabet, it is an accented form of c in most other Latin-based
alphabets. That affects the 'base' member above. On the other hand, â is
an accented 'a' in both the Turkish and the other Latin-based alphabets,
so the 'base' comparison for â and a would report no difference in either.
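Here is a minimal sketch of that distinction, not the actual library code. Alphabet, Weights, weightsOf and compare are hypothetical names made up for this example, and the weight table is a toy one that covers only a handful of letters:

```d
// Hypothetical writing systems, mirroring AlphabetChar!en and AlphabetChar!tr.
enum Alphabet { en, tr }

// The three-part result from earlier in the thread.
struct Order
{
    int base;    // difference between the base letters
    int accent;  // difference between the accents
    int cased;   // difference between upper and lower case
}

// Hypothetical per-letter weights; a real table would cover the whole alphabet.
struct Weights
{
    int base;
    int accent;
    int cased;
}

// Toy lookup: ç gets its own base position only in the Turkish alphabet.
Weights weightsOf(Alphabet alphabet)(dchar c)
{
    switch (c)
    {
        case 'a': return Weights(0, 0, 0);
        case 'A': return Weights(0, 0, 1);
        case 'â': return Weights(0, 1, 0);
        case 'c': return Weights(1, 0, 0);
        case 'ç': return alphabet == Alphabet.tr ? Weights(2, 0, 0)
                                                 : Weights(1, 1, 0);
        default:  assert(false, "letter not in this toy table");
    }
}

struct AlphabetChar(Alphabet alphabet)
{
    dchar value;

    // Reports all three differences and leaves the choice to the application.
    Order compare(AlphabetChar other) const
    {
        auto a = weightsOf!alphabet(value);
        auto b = weightsOf!alphabet(other.value);
        return Order(a.base - b.base, a.accent - b.accent, a.cased - b.cased);
    }
}

void main()
{
    alias En = AlphabetChar!(Alphabet.en);
    alias Tr = AlphabetChar!(Alphabet.tr);

    // ç against c: only an accent difference in en, a base difference in tr.
    assert(En('ç').compare(En('c')) == Order(0, 1, 0));
    assert(Tr('ç').compare(Tr('c')) == Order(1, 0, 0));

    // â against a: the same base letter in both alphabets.
    assert(En('â').compare(En('a')).base == 0);
    assert(Tr('â').compare(Tr('a')).base == 0);
}
```

An application that ignores accents would look only at 'base', while a dictionary sort would consult 'accent' and 'cased' as tie-breakers.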
Collation takes the alphabet into account. Although AlphabetChar!en is
not compatible with AlphabetChar!tr, values of the two types can still be
compared explicitly under the collation rules of any chosen alphabet.
So, that experimental library provides a number of alphabets, each with
its own collation order. I see now that the library should have followed
the Unicode report that you linked above. I will do some reading. :)
Ali