Timsort vs some others

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Tue Dec 18 07:55:17 PST 2012


On 12/18/12 5:50 AM, Peter Alexander wrote:
> On Tuesday, 18 December 2012 at 06:52:27 UTC, Xinok wrote:
>> On another note, I highly doubt that std::sort uses a "median of
>> medians" algorithm, which would add much overhead and essentially
>> double the number of comparisons required with little to no benefit.
>> More likely, it simply chooses the pivot from a median of three.
>
> Different implementations use different strategies.
>
> libstdc++ seems to use median of 3.
>
> The Dinkumware standard library (which ships with MSVC) uses median of 9.

We should use a deferred pivot algorithm that I thought of a long time 
ago but never got around to test.

One thing about current pivot selection approaches is that they decide 
on a strategy (middle, median of 3, median of 9, random etc) _before_ 
ever looking at the data.

A different approach would be to take a look at a bit of data and _then_ 
decide what the pivot is.

Consider the following approach:

size_t partition(T[] d)
{
     assert(a.length);
     size_t a = 0, z = arr.length - 1;
     auto maxa = a, minz = z;

     for (; a < z && mina <= maxz;)
     {
         if (d[a] > d[z])
         {
             swap(d[a], d[z]);
         }
         if (d[maxa] < d[++a]) maxa = a;
         if (d[minz] > d[--z]) minz = z;
     }
     --a;
     ++z;

/* Here data is already partitioned wrt d[a] or d[z]. If enough progress 
has been made (i.e. a is large enough compared to d.length), choose one 
of these as the pivot and only partition d[a .. z + 1]. Otherwise, use a 
classic pivot choice criterion.
*/
     ...
}

Another approach I wanted to try was to choose the median of K with K 
increasing with the size of the array. This is because a good pivot is 
more important for large arrays than for small arrays. As such, a 
possibility would be to simply sort a stride of the input 
(std.range.stride is random access and can be sorted right away without 
any particular implementation effort) and then choose the middle of the 
stride as the pivot.

If anyone has the time and inclination, have at it!


Andrei


More information about the Digitalmars-d mailing list