Sorting with non-ASCII characters

Jos van Uden usenet at fwend.com
Tue Sep 24 03:35:49 PDT 2013


On 24-9-2013 11:26, Chris wrote:
> On Thursday, 19 September 2013 at 18:44:54 UTC, Jos van Uden wrote:
>> On 19-9-2013 17:18, Chris wrote:
>>> Short question in case anyone knows the answer straight away:
>>>
>>> How do I sort text so that non-ascii characters like "á" are treated in the same way as "a"?
>>>
>>> Now I'm getting this:
>>>
>>> [wow, ara, ába, marca]
>>>
>>> ===> sort(listAbove);
>>>
>>> [ara, marca, wow, ába]
>>>
>>> I'd like to get:
>>>
>>> [ ába, ara, marca, wow]
>>
>> If you only need to process extended ascii, then you could perhaps
>> make do with a transliterated sort, something like:
>>
>> import std.stdio, std.string, std.algorithm, std.uni;
>>
>> void main() {
>>     auto sa = ["wow", "ara", "ába", "Marca"];
>>     writeln(sa);
>>     trSort(sa);
>>     writeln(sa);
>> }
>>
>> void trSort(C, alias less = "a < b")(C[] arr) {
>>     static dstring c1 = "àáâãäåçèéêëìíîïñòóôõöøùúûüýÿ";
>>     static dstring c2 = "aaaaaaceeeeiiiinoooooouuuuyy";
>>     schwartzSort!(a => tr(toLower(a), c1, c2), less)(arr);
>> }
>
> Thanks a million, Jos! This does the trick for me.

Great.

Be aware that the above code does a case-insensitive sort. If you need
a case-sensitive sort, you can use something like:


import std.stdio, std.string, std.algorithm, std.uni;

void main() {
    auto sa = ["wow", "ara", "ába", "Marca"];
    writeln(sa);
    trSort(sa, CaseSensitive.no);   // ignore case
    writeln(sa);

    writeln();

    sa = ["wow", "ara", "ába", "Marca"];
    writeln(sa);
    trSort(sa, CaseSensitive.yes);  // respect case
    writeln(sa);
}

// Sorts arr in place, comparing keys in which each accented letter in c1
// has been replaced by the plain letter at the same position in c2.
void trSort(C, alias less = "a < b")(C[] arr,
                                     CaseSensitive cs = CaseSensitive.yes) {
    static c1 = "àáâãäåçèéêëìíîïñòóôõöøùúûüýÿÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝŸ"d;
    static c2 = "aaaaaaceeeeiiiinoooooouuuuyyAAAAAACEEEEIIIINOOOOOOUUUUYY"d;

    if (cs == CaseSensitive.no)
        arr.schwartzSort!(a => a.toLower.tr(c1, c2), less);
    else
        arr.schwartzSort!(a => a.tr(c1, c2), less);
}
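
To see why the two modes order the list differently, it can help to print the
key that each string is compared by. The following is only a rough sketch:
sortKey is a hypothetical helper name, not part of the code above, and it
simply repeats the same tr/toLower calls that trSort passes to schwartzSort.

```d
import std.stdio, std.string, std.uni;

// Hypothetical helper (an assumption for illustration): returns the key
// that trSort would compare for a single string.
string sortKey(string s, CaseSensitive cs = CaseSensitive.yes) {
    // Same transliteration tables as in trSort.
    static c1 = "àáâãäåçèéêëìíîïñòóôõöøùúûüýÿÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝŸ"d;
    static c2 = "aaaaaaceeeeiiiinoooooouuuuyyAAAAAACEEEEIIIINOOOOOOUUUUYY"d;

    if (cs == CaseSensitive.no)
        return s.toLower.tr(c1, c2);  // fold case, then strip accents
    else
        return s.tr(c1, c2);          // strip accents only
}

void main() {
    // Show both keys for every string in the example list.
    foreach (s; ["wow", "ara", "ába", "Marca"])
        writeln(s, " -> ", sortKey(s, CaseSensitive.no),
                   " / ", sortKey(s, CaseSensitive.yes));
}
```

With the default "a < b" comparison the keys are compared by code point, and
'M' comes before every lower-case letter. The case-sensitive key for "Marca"
stays "Marca", so that string sorts first, while the case-insensitive key
"marca" places it between "ara" and "wow".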


More information about the Digitalmars-d-learn mailing list