Challenge: fair partition function
Xinok via Digitalmars-d
digitalmars-d at puremagic.com
Mon Feb 8 18:27:29 PST 2016
On Monday, 8 February 2016 at 23:25:00 UTC, Andrei Alexandrescu
wrote:
> Consider defining a function that partitions a range around a
> given index like this:
>
> size_t pivotPartition(alias less = (a, b) => a < b, Range)
> (Range r, size_t pivot);
>
> Returns x, one of the the indexes that r[pivot] would have if r
> were sorted. Also, r[0 .. x] contains stuff no greater than
> r[x] and r[x + 1 .. $] contains stuff no less than r[x].
>
> The challenge is to implement such a function with fairness: if
> several elements are equal to r[pivot], return the index
> closest to r.length / 2.
>
> The function should be efficient, minimize calls to less and
> swap, etc. A variant that is efficient but does not implement
> fairness is below.
>
> ...
Ultimately, I think you're going to have to perform two
comparisons on *some* elements to check for equality. I thought
of a few tricks which may or may not help.
(1) Keep your code as is and add a final pass to count the number
of elements. However, you only need to scan the larger partition
since it contains the index (r.length / 2) and you're trying to
move closer to that point. The elements in the smaller partition
can simply be ignored.
(2) You may have code which looks something like this:
if(less(e, pivot)){ /+ less than +/ }
else if(less(pivot, e)){ /+ greater than +/ }
else{ /+ equal to +/ }}
This is a big win if most of the elements are less than the
pivot, but also a big loss if most of the elements are greater
than the pivot. A simple trick is to repeat the code twice but
swap the order of comparisons so you get:
if(less(pivot, e)){ /+ greater than +/ }
else if(less(e, pivot)){ /+ less than +/ }
else{ /+ equal to +/ }}
This will place us closer to an average 1.5 comparisons per
element which is better than (1).
(3) Similar to (2) except you only compare each element once so
you're not checking for equality. You're simply alternating
between checking if (e < pivot) or if (pivot < e). So the code
may look something like this:
if(less(pivot, e)){ less }
else{ greater or equal }
if(less(e, pivot)){ greater }
else{ less or equal }
From this, you can group the elements into four partitions:
[LE L G GE]
So you have elements which are definitely less than or greater
than the pivot but also some elements which *may* be equal to the
pivot. Combine this with the first trick (1) and you only have to
scan LE or GE for equality but not both. This is putting us
closer to an average 1.25 comparisons per element but a 4-way
partition would also require much more work.
More information about the Digitalmars-d
mailing list