Sorting with non-ASCII characters
Ali Çehreli
acehreli at yahoo.com
Thu Sep 19 09:44:37 PDT 2013
On 09/19/2013 08:18 AM, Chris wrote:
> Short question in case anyone knows the answer straight away:
>
> How do I sort text so that non-ascii characters like "á" are treated in
> the same way as "a"?
>
> Now I'm getting this:
>
> [wow, ara, ába, marca]
>
> ===> sort(listAbove);
>
> [ara, marca, wow, ába]
>
> I'd like to get:
>
> [ ába, ara, marca, wow]
>
> Thanks.
I have a project that tries to do exactly that:
https://code.google.com/p/trileri/source/browse/trunk/tr/dizgi.d#823
However, the code is in Turkish and needs a rewrite. :/
For the whole thing to work, every character must belong to a known
alphabet. Here is the English alphabet:
https://code.google.com/p/trileri/source/browse/trunk/tr/alfabe.d#747
Here is how I define e.g. á to be an accented version of a:
https://code.google.com/p/trileri/source/browse/trunk/tr/harfler.d#23
However, some characters stand on their own because they are not accented
variants but proper letters (e.g. ç in the Turkish alphabet):
https://code.google.com/p/trileri/source/browse/trunk/tr/harfler.d#44
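To make the idea concrete, here is a minimal sketch. It is not the code from the project linked above. The names alphabet, baseOf, rank and alphabetLess are made up for this example, and the accent table covers only a few letters. Each character is first mapped to its base letter if it is an accented variant, then compared by its position in the alphabet, so that ç sorts as its own letter right after c:

```d
import std.algorithm : cmp, countUntil, map, sort;
import std.stdio : writeln;

// Assumed alphabet order for this sketch: the Turkish alphabet,
// where ç is a letter of its own that follows c.
dstring alphabet = "abcçdefgğhıijklmnoöprsştuüvyz"d;

// Hypothetical accent table: maps an accented variant to its base letter.
// ç is deliberately absent because it is a proper letter, not an accent.
dchar baseOf(dchar c)
{
    switch (c)
    {
        case 'á', 'à', 'â': return 'a';
        case 'é', 'è', 'ê': return 'e';
        default:            return c;
    }
}

// Position of a letter in the alphabet; -1 for anything outside it,
// so such characters sort before all letters.
ptrdiff_t rank(dchar c)
{
    return alphabet.countUntil(baseOf(c));
}

// Compares two words letter by letter using alphabet positions.
bool alphabetLess(string a, string b)
{
    return cmp(a.map!rank, b.map!rank) < 0;
}

void main()
{
    auto words = ["marca", "çay", "ara", "dal", "ába", "cam"];
    sort!alphabetLess(words);
    writeln(words);
}
```

With this table the words come out in the order ába, ara, cam, çay, dal, marca. Accented and base letters compare as equal here, so words that differ only by an accent may end up in either order.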
Well... I hope to get back to it at some point, taking advantage of the
new std.uni as well.
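As a rough illustration of what std.uni could offer here, the sketch below decomposes each word with normalize!NFD and drops the combining marks before comparing. The helper name stripAccents is made up for this example, and the approach is only an assumption about how one might use the module, not something the project does:

```d
import std.algorithm : filter, sort;
import std.array : array;
import std.stdio : writeln;
import std.uni : NFD, combiningClass, normalize;

// Hypothetical helper: decompose to NFD, then keep only code points
// whose canonical combining class is zero (i.e. drop the accents).
dchar[] stripAccents(string s)
{
    return s.normalize!NFD
            .filter!(c => combiningClass(c) == 0)
            .array;
}

void main()
{
    auto words = ["wow", "ara", "ába", "marca"];
    // Compare words by their accent-free form.
    sort!((a, b) => stripAccents(a) < stripAccents(b))(words);
    writeln(words);
}
```

This puts the words in the order ába, ara, marca, wow. Note that it also reduces ç to c, because ç decomposes into c plus a combining cedilla, so it does not respect letters that stand on their own in an alphabet such as Turkish.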
Ali