Sorting with non-ASCII characters

Ali Çehreli acehreli at yahoo.com
Thu Sep 19 09:44:37 PDT 2013


On 09/19/2013 08:18 AM, Chris wrote:
> Short question in case anyone knows the answer straight away:
>
> How do I sort text so that non-ascii characters like "á" are treated in
> the same way as "a"?
>
> Now I'm getting this:
>
> [wow, ara, ába, marca]
>
> ===> sort(listAbove);
>
> [ara, marca, wow, ába]
>
> I'd like to get:
>
> [ ába, ara, marca, wow]
>
> Thanks.
>
>
>

I have a project that tries to do exactly that:

   https://code.google.com/p/trileri/source/browse/trunk/tr/dizgi.d#823

However, it is in Turkish and in need of a rewrite. :/

For the whole thing to work, every character must belong to a known 
alphabet. Here is the English alphabet:

   https://code.google.com/p/trileri/source/browse/trunk/tr/alfabe.d#747

Here is how I define e.g. á to be an accented version of a:

   https://code.google.com/p/trileri/source/browse/trunk/tr/harfler.d#23

However, some characters stand individually as they are not accents but 
proper letters themselves (e.g. ç of the Turkish alphabet):

   https://code.google.com/p/trileri/source/browse/trunk/tr/harfler.d#44
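
To illustrate the "proper letter" case, here is a minimal sketch (not the 
code from the project linked above). It assumes the ordering can be 
expressed as a plain list of lowercase letters; turkishLower, rankKey and 
the sample words are made up for the example. Each letter is ranked by its 
position in the alphabet, so ç lands between c and d instead of after z as 
it would with a plain code point comparison:

```d
import std.algorithm : sort;

// Lowercase Turkish alphabet in dictionary order (29 letters).
immutable dstring turkishLower = "abcçdefgğhıijklmnoöprsştuüvyz"d;

// Hypothetical helper: turn a word into a list of alphabet positions.
// Characters that are not in the alphabet get the largest rank and sort last.
size_t[] rankKey(string word, size_t[dchar] rank)
{
    size_t[] key;
    foreach (dchar c; word)
        key ~= rank.get(c, size_t.max);
    return key;
}

void main()
{
    // Build the letter -> position table once.
    size_t[dchar] rank;
    foreach (i, dchar c; turkishLower)
        rank[c] = i;

    auto words = ["çay", "dal", "cam"];

    // Arrays compare lexicographically, so comparing the keys
    // compares the words letter by letter in alphabet order.
    words.sort!((a, b) => rankKey(a, rank) < rankKey(b, rank));

    assert(words == ["cam", "çay", "dal"]);
}
```

The key is recomputed on every comparison, which is fine for a short list; 
for anything bigger, caching the keys would be the obvious next step.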

Well... I hope to get back to it at some point, taking advantage of the 
new std.uni as well.
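
For the original question, a std.uni-based version might look roughly like 
the sketch below. It assumes that decomposing to NFD and dropping the 
combining marks is good enough, which holds for "á" but would also fold 
letters such as ç into c. baseLetters is a made-up helper name, not part of 
Phobos or of my project:

```d
import std.algorithm : filter, sort;
import std.array : array;
import std.conv : to;
import std.uni : combiningClass, normalize, NFD;

// Hypothetical helper: decompose the string (NFD) and drop every code point
// with a non-zero canonical combining class, so "á" becomes "a".
string baseLetters(string s)
{
    return s.normalize!NFD
            .filter!(c => combiningClass(c) == 0)
            .array
            .to!string;
}

void main()
{
    auto words = ["wow", "ara", "ába", "marca"];

    // Compare the accent-free forms instead of the raw strings.
    words.sort!((a, b) => baseLetters(a) < baseLetters(b));

    assert(words == ["ába", "ara", "marca", "wow"]);
}
```

Words that differ only in their accents compare as equal here, so their 
relative order is not defined; a tie-breaker on the original strings would 
be needed if that matters.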

Ali


