Randomness in built-in .sort

dsimcha dsimcha at yahoo.com
Sun Jan 4 19:05:40 PST 2009


== Quote from Bill Baxter (wbaxter at gmail.com)'s article
> It seems to me that the built-in .sort uses a randomized algorithm
> that leads to results that differ from run-to-run when there are
> elements with identical keys.
> This seems like a bad idea to me for the built-in algorithm to have
> non-deterministic behavior because it means you can have bugs that
> aren't consistently repeatable.
> I think the built-in sort should be some kind of stable sort.   Also
> the stability or lack thereof is not mentioned in the spec, and it
> probably should be because stability is critical for some uses of
> sorting.  One shouldn't have to guess whether a given implementation
> of D will use a stable sort or not.
> --bb

IDK why sorting is builtin anyhow.  I did some benchmarks a while back, and the
builtin sort is slower than std.algorithm.  IIRC the builtin sort uses function
pointers for comparison, while the std.algorithm sort uses template parameters
that can be inlined at compile time.  I know this was mentioned before, but it
didn't really get fully discussed.  Why is sorting builtin?  With property syntax,
there's not even any syntactic sugar advantage.  Furthermore, the builtin sort is
substantially less flexible than a library sort.

On another note, it would be nice if std.algorithm implemented a stable O(N log N)
sort, like a merge sort.  Right now, IIRC it uses an interesting stable swapping
algorithm on top of a quick sort for O(N log^2 N) performance.  This is good when
space is tight, but I think in most cases a merge sort is better as a stable sort.

On the other hand, if builtin sort is kept, then I agree with you:  It should
follow the D convention of doing the safe thing by default (stable sort with no
O(N^2) corner cases) and allowing less safe behavior (unstable but possibly faster
sort that may or may not have some O(N^2) corner cases) in libraries.



More information about the Digitalmars-d mailing list