Regarding implementing a stable sort for Phobos
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Tue Mar 13 07:31:59 PDT 2012
On 3/13/12 1:31 AM, Xinok wrote:
> I've been playing with sorting algorithms a lot in recent months, so I
> want to implement a *working* stable sort for Phobos which is broken at
> the moment. I have a working library and I'm still adding to it. It's
> much more complex than a simple merge sort, being over 300 lines of code
> at the moment.
Working is better than broken.
> - It's a natural merge sort, which is faster on partially sorted lists,
> and adds little overhead for mostly random lists.
> - It uses O(log n log n) additional space for merging.
That's 1024 when n is 4 billion. I think you can safely approximate it
with alloca or a fixed-size stack-allocated array.
> - I wrote it to sort random-access ranges *without* slicing, but I think
> the exclusion of slicing makes it slower. I'm writing a separate
> implementation which uses slicing and I'll keep it if it's much faster.
Having random access implies having slicing.
> - To avoid multiple allocations, the user can allocate their own
> temporary memory and pass it to the sort function.
If you need different allocation strategies, I suggest you make it a
policy (like stability is).
> - I decided against using pointers. While I could use them to write
> highly optimized code for arrays, pointers can't be used in safe code
> and don't work very well in CTFE at the moment.
Perhaps it's best to have two distinct implementations guarded by if
(__ctfe). The runtime implementation can be @trusted.
> Is it worth keeping the implementation *without* slicing? Many functions
> in Phobos do require slicing, including the unstable sort, and I think
> most random-access ranges do or could support slicing.
No need.
> What would you prefer in terms of memory usage vs performance? O(n/2)
> space is optimal for performance, O(1) (in-place) requires zero
> allocations but is slower, and O(log n log n) provides a good balance.
The latter rounded up to a constant sounds best.
> Should I implement concurrency? Merge sort is very easy to parallelize,
> and being in-place or natural doesn't change this fact.
Let's save that for std.parallel_algorithm.
> Should we take a different path and go for a basic merge sort or even
> Timsort? I've considered writing a Timsort though that seems like
> daunting task to me, so here's an implementation written in C -
> https://github.com/swenson/sort
I don't know how your function's performance profile compares with
timsort's.
Andrei
More information about the Digitalmars-d
mailing list