fastSort benchmarking (was Re: Work done so far)

Matti Niemenmaa see_signature at for.real.address
Mon Jan 21 00:42:45 PST 2008


bearophile wrote:
> Walter>If you have a faster qsort, and want to contribute it, please do so!
> Same goes for other faster routines you've written.<
> 
> Sean Kelly>I'd be interested in seeing that.  I've been able to beat the D
> sort for some data sets and match it in others, but not beat it across the
> board.<
> 
> In those d libs ("func" module) there is the fastSort() too I was talking
> about, plus lot of other things [note: that sort of mine is really simple,
> and it looks much similar to the sort already present in Phobos, but it run
> faster to me. In the code you can find some speed tests too. It's a quicksort
> plus a Simple Insertion Sort. But if you need a faster sort I suggest to
> adapt something much better than the naive 60-lines long sort I have written,
> like the Timsort used by Python. I have tested the speed of my code only in
> few situations, not long and lot].

This is still pretty impressive. Results from a sort benchmark (and the 
benchmark itself, which requires Tango) are attached. Your code is quite short 
but still knocks the socks off the built-in sort.

It appears to do a lot of comparing, though---and thus, when made to use a 
custom comparing function instead of the built-in "<" operator it quickly slows 
down. It's still much better than the built-in sort (especially since the 
built-in sort can't even take a custom comparator), but for a more general sort 
it's not that great. Unless I completely messed things up when converting it to 
use a comparing function, that is. Which is entirely possible.

One other change I had to make is to add "j &&" to this line in _aqs, because 
when cmp was a function "return a > b" instead of "return a < b" it would result 
in array bounds errors:
do j--; while (j && cmp(a, v[j]));

I see that Sean has changed the Tango sort to use introspective sort as well, so 
the results for that don't apply any more 
(http://www.dsource.org/projects/tango/ticket/571) but they're included anyway.

Key for understanding the numbers:

For each algorithm, the first column set is for random, the second for already 
sorted, the third for sorted and then reversed, and the fourth for "median-of-3 
killer" data, specially crafted to bring worst-case behaviour in median-of-3 
quicksorts. The Tango sort used to hit a stack overflow and thus isn't tested 
with them.

In each column set, the first column is the average, the second the minimum, and 
the third the maximum time taken.

-- 
E-mail address: matti.niemenmaa+news, domain is iki (DOT) fi
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: sort.d
URL: <http://lists.puremagic.com/pipermail/digitalmars-d/attachments/20080121/c3fcada9/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: sort_results.txt
URL: <http://lists.puremagic.com/pipermail/digitalmars-d/attachments/20080121/c3fcada9/attachment.txt>


More information about the Digitalmars-d mailing list