Timsort vs some others

Xinok xinok at live.com
Fri Dec 21 10:12:42 PST 2012


On Wednesday, 19 December 2012 at 05:48:04 UTC, Andrei 
Alexandrescu wrote:
> On 12/18/12 11:37 PM, Xinok wrote:
>> On Wednesday, 19 December 2012 at 02:00:05 UTC, Andrei 
>> Alexandrescu wrote:
>>> You don't need to choose a median - just sort the data 
>>> (thereby making
>>> progress toward the end goal) and choose the middle element.
>>
>> I don't think it would work well in practice, but I'll put 
>> something
>> together to see if the idea does have merit.
>
> I mostly fear cache touching issues.
>
> Andrei

I based my little experiment on my 'unstablesort' module, located 
here:
https://github.com/Xinok/XSort/blob/master/unstablesort.d

The results (sorting a random array of 1024*1024 uints):

Median of Five:
142ms     21627203 comps

Median of log n:
152ms     20778577 comps

The code:

size_t choosePivot(R range)
{
	import std.math;
	// Reserve slice of range for choosing pivot
	R sub = range[0 .. cast(uint)log2(range.length) | 1];
	// Pull in equally distributed elements
	swap(sub[sub.length - 1], range[range.length - 1]);
	foreach(i; 1 .. sub.length - 1) swap(sub[i], range[range.length 
/ sub.length * i]);
	// Sort sublist to choose pivot
	insertionSort(sub);
	// Move partitionable elements
	sub = sub[piv + 1 .. sub.length];
	foreach(i; 0 .. sub.length) swap(sub[i], range[range.length - 
sub.length + i]);
	// Return index of pivot
	return sub.length / 2;
}

My thoughts, while the idea does have merit, I think the median 
of five does a good job as it is. If you're interested in 
reducing the number of comparisons, replacing 
"optimisticInsertionSort" in std.algorithm with a binary 
insertion sort will do much more to achieve that goal. And if 
you're interested in O(n log n) running time, then add heap sort 
as a fall-back algorithm, as I did in my module (I actually plan 
to do this myself ... eventually).


More information about the Digitalmars-d mailing list