Randomness in built-in .sort

Bill Baxter wbaxter at gmail.com
Mon Jan 5 12:58:46 PST 2009


On Mon, Jan 5, 2009 at 8:40 PM, Stewart Gordon <smjg_1998 at yahoo.com> wrote:
> Bill Baxter wrote:
> <snip>
>>
>> I think the built-in sort should be some kind of stable sort.   Also
>> the stability or lack thereof is not mentioned in the spec, and it
>> probably should be because stability is critical for some uses of
>> sorting.  One shouldn't have to guess whether a given implementation
>> of D will use a stable sort or not.
>>
>> --bb

> A stable sort can be slower than an unstable sort.  There are many cases
> where the order of equally-ranking keys does not matter or it has no effect
> whatsoever (e.g. sorting an array of integers).  Then why take the extra
> overhead of making the sort stable?

> We could have an extra property .stableSort, for those cases where you need
> stability.  Of course, it would be valid to implement .sort and .stableSort
> with the same algorithm, though to do so would be naive unless someone comes
> up with a stable sorting algorithm with O(n log n) or better time complexity
> and O(log n) or better extra memory requirement.  But in the absence of such
> an algorithm, the compiler could still replace .stableSort with .sort in
> those cases where it can guarantee it will make no difference to the end
> result.

Actually yeh, I don't really care so much if .sort is stable or not
(as long as it's clear in the spec what to expect).  But I do care
that it is deterministic.  The current implementation seems to use
some randomness (random partitions?) so if I run my program 5 times I
get 5 different sort orders when there are duplicated keys.  This
makes it really hard to debug things that only happen with certain
orders.

> Or you could argue that requiring sort to be stable is sufficiently uncommon
> that it might as well be left to a library function.

There is no one sort algorithm that is fastest for all inputs.  So it
would make more sense to me to argue that if you want the fastest
possible sort for your problem then you should go with a library
solution.  Built-ins should be efficient, yes, but to argue that
efficiency is more important than generality in a built-in seems wrong
to me.  A least for .sort.   There's no performance benefit to having
a built-in sort, so it's really only there for convenience.   Using a
stable sort would make .sort applicable to more problems, thus
increasing the integral of convenience.

--bb



More information about the Digitalmars-d mailing list