[Issue 15583] [REG] topN without uniform can show quadratic performance

via Digitalmars-d-bugs digitalmars-d-bugs at puremagic.com
Mon Jan 18 12:10:16 PST 2016


https://issues.dlang.org/show_bug.cgi?id=15583

--- Comment #2 from Ivan Kazmenko <gassa at mail.ru> ---
(copied from forum post:
http://forum.dlang.org/post/gwscmfifeoefrsudbgqa@forum.dlang.org
- as it is more relevant here)

Perhaps I should include a textual summary as well.

The code on DPaste starts by constructing an array of Elements of size MAX_N;
in the code, MAX_N is 50_000.  After that, we run the victim function on our
array.  Here, the victim is topN (array, MAX_N / 2); it could be sort (array)
or something else.

An Element contains, or rather, pretends to contain, an int value.  Here is how
Element is special: the result of comparison for two Elements is decided
on-the-fly.  An Element can be either UNDECIDED or have a fixed value. 
Initially, all elements are UNDECIDED.  When we compare two Elements and at
least one of them has a fixed value, the comparison is resolved naturally, and
UNDECIDED element is greater than any fixed element.  When we compare two
UNDECIDED elements, the one which participated more in the comparisons so far
gets a fixed value: greater than any other value fixed so far, but still less
than UNDECIDED.  This way, the results of old comparisons are consistent with
the new fixed value.

Now, what do we achieve by running the victim function?  Turns out that the
algorithms using the idea of QuickSort or QuickSelect tend to make most
comparisons against their current pivot value.  Our Element responds to that by
fixing the pivot to one of the lowest available values.  After that, a
partition using such pivot will have only few, O(1), elements before the pivot,
and the rest after the pivot.  In total, this will lead to quadratic
performance.

After running the victim function on our array of Elements (which - careful
here - already takes quadratic time), we reorder them in their original order
(to do that, each Element also stores its original index).

Now, we can re-run the algorithm on the array obtained so far.  If the victim
function is (strongly) pure, it will inevitably make the same comparisons in
the same order.  The only difference is that their result will already be
decided.

Alternatively, we can make an int array of the current values in our array of
Elements (also in their original order).  Running the victim function on the
int array must also make the same (quadratic number of) comparisons in the same
order.

The inspiration is the paper "A Killer Adversary for Quicksort":
http://www.cs.dartmouth.edu/~doug/mdmspe.pdf

--


More information about the Digitalmars-d-bugs mailing list