Worst-case performance of quickSort / getPivot

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Sat Nov 16 16:56:46 PST 2013


On 11/16/13 4:18 PM, Chris Cain wrote:
> I think it's more complicated than that. Let's assume for a moment that
> you've proven that such an unstable sort must exist that is faster (I'm
> not convinced that it necessarily must take extra work to maintain
> stability).

Very simple. Any sequence with a large number of equivalent elements 
will cause timsort to work more than an unstable algorithm, because it 
would need to shift around those elements. (Consider e.g. 1, 0, 0, 0, 
..., 0.) The discussion then shifts to how many equivalent elements are 
in the typical range etc.

> You have not shown how much faster it might be (it could be
> only 1% faster) nor how much work it would take to discover (even an
> ideal pivot choice for quicksort actually cannot be as fast as Timsort
> on an already sorted sequence, and quicksort is not an appropriate
> algorithm for accepting presorted subsequences).

There are well known tweaks to quicksort that make it operate well on 
sorted or almost sorted sequences.

> If it takes 2 years to
> come up with an unstable sort faster than Timsort, then it seems like a
> long time to be using something that isn't the best that we currently
> have. Especially if D is to be used in benchmarks against other languages.
>
> Timsort is implemented now and has no real cost to use as default. If,
> at some point, an unstable sort is discovered that is faster on average
> than Timsort AND (extremely important) it has no potential for an
> attacker to be able to choose a sequence to sort that causes poor
> performance, then unstable could be switched back as the default at some
> point.

We could do that, but that would be masking a bug instead of fixing it. 
We should instead focus on fixing unstable sort. One additional issue 
with making stable sort the default for a while is that people will grow 
to assume sort() is stable, and will have broken code when we switch back.

Let's fix unstable sort. If it's slower on any other data than Timsort's 
happy case, we have a bug.


Andrei



More information about the Digitalmars-d mailing list