simultaneous multiple key sorting algorithm
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Fri Jan 27 18:22:26 PST 2012
On 1/27/12 8:33 PM, Peter Alexander wrote:
> On Friday, 27 January 2012 at 19:36:49 UTC, Andrei Alexandrescu wrote:
>> The brute force approach would essentially compute the two ranks and
>> then sort by their sum. That's three sorts and at least one temporary
>> array. But I think things can be done considerably better. Any ideas?
>
> Can you define "better"?
>
> Obviously you can't beat the O(n lg n) for comparison based ordering.
> The O(n) space may be avoidable, but it doesn't seem like the end of the
> world.
>
> Does this problem occur frequently enough in practice to justify having
> an optimised version in the standard library? Three sorts + O(n) space
> sounds fine to me.
I wasn't thinking necessarily of the standard library, but it is
something that is often needed even if not perceived as such. For
example, the stock screeners offered by financial sites are crappy -
they allow you to specify a range of e.g. P/E, P/book value, dividends
etc. etc. and sort by one of those. This is crappy - generally what one
wants is a selection of "best of breed" stocks that are among the top
ones in a variety of categories. (Relative importance could be assigned
to each category.)
A colleague at Facebook mentioned that you can use O(n) bucket sort for
the last-leg sorting.
The best approach I'm thinking of is to allocate an extra array of
size_t. First sort by k1 and then fill the array with 0, 1, ..., n-1.
Then sort simultaneously the two arrays by k2, and then add 0 to the
first slot, 1 to the second slot, ..., n-1 to the last slot. Continue
this by any number of keys, and finally sort using the secondary array
as a key. This works for any number of keys and only requires one extra
array of size_t.
The more interesting problem is yielding the answer lazily. I'm unclear
what the best algorithm for such would be.
Andrei
More information about the Digitalmars-d
mailing list