Go rant
Charles Hixson
charleshixsn at earthlink.net
Wed Jan 13 13:55:11 PST 2010
Don wrote:
> dsimcha wrote:
>> == Quote from Don (nospam at nospam.com)'s article
>>> "the bubble sort seems to have nothing to recommend it, except a
catchy
>>> name " - Knuth.
>>
>> Well, the bubble sort distance is a pretty good metric of how
similarly
>> ordered
>> two lists are. It's useful, for example, in statistics such as
Kendall's
>> Tau. One of the easiest ways to calculate it is...with a bubble
sort.
>
> I don't think that's a coincidence. It seems to be defined in terms
of
> bubble sort!
>
> The whole bubble sort phenonemon seems to be self-perpetuating.
There
> were loads of really crap algorithms in the 1950's. Why was this one
not
> forgotten with the rest?
Because it's SIMPLE. And because for the small sample sizes used in
most classes (still?) scaling with n^2 isn't all that bad. Someone
said that the insertion sort is better even for small sample sizes,
but if you don't have an array type that implements insertion, then
that means writing an extra loop to move things down. I might have
suggested that the exchange sort was a good choice. But with small
sample sizes and limited memory (no longer a significant problem for
cases where this is even vaguely reasonable) the bubble sort isn't
that bad. In fact it's difficult to measure the difference as n tends
towards 3. For n == 2 there's no other even vaguely reasonable
choice.
The problem with bubble sorts is the factor of n^2. My teacher might
well have been wrong when he put n as high as 25 for where the bubble
sort was optimal, but it's clearly optimal when n is less than 4.
It's just that those cases don't matter in the real world. (And
actually, they'd probably be handled with a case statement, or a
nested if.) But if you have code that needs to handle sorting array
sizes between 3 and 25 (varying in the calls, and tending towards the
low end), then there's no reason to go for anything more complicated.
Of course, a better choice yet is often to use a library function.
Even if the low end calls have excess overhead, it won't matter much,
and you don't need to worry about coding errors.
(Yeah, I believe that Knuth said there was no reason for the bubble
sort existing. And without knowing the exact context in which he said
it, I'm not going to call him wrong. The times to use it are not only
extremely limited, there's also a tendency to use it at the wrong
times, just because you have the code handy, and it's easy.)
P.S.: If you have an Array class, and the array class has an insert
method, then the insertion sort is clearly better for building a
sorted list. It's less clear if you're sorting a list that you
already have, and you're trying to avoid allocating new memory.
Especially if it's not stored a container class with an insert method.
N.B.: If the bubble sort weren't taught, it would be continually re-
invented. And as it's so rarely the proper choice, it needs to be
warned against. Preferably by having someone write something that,
sorts things interactively, and then having them start adding things
to be sorted. But processors may be fast enough to defeat this
experiential learning unless you add things in a way that doubles the
collection size each time you make an addition.
More information about the Digitalmars-d
mailing list