sortUniq
H. S. Teoh via Digitalmars-d
digitalmars-d at puremagic.com
Thu Jan 22 15:33:14 PST 2015
On Thu, Jan 22, 2015 at 02:51:38PM -0800, H. S. Teoh via Digitalmars-d wrote:
> On Thu, Jan 22, 2015 at 02:29:58PM -0800, Andrei Alexandrescu via Digitalmars-d wrote:
> [...]
> > Thanks. I'm thinking of something based on general comparisons.
> > Something like a modified quicksort with fat pivot:
> >
> > A. Partition into three subranges: R1<pivot, R2=pivot, R3>pivot
> >
> > B. Recurse on R1 obtaining R11, a subrange of R1 containing only
> > unique elements.
> >
> > C. Sort-move-unique R2 into the portion after R11, obtaining the
> > result.
> >
> > A and B seem fine, I'm unclear about C. It would be an algorithm that
> > sorts and simultaneously moves data in a range that overlaps the
> > input.
> [...]
>
> Wait, if R1 and R3, after the recursion, are sorted and contain only
> unique elements, doesn't that mean that we can just collapse R2 into a
> single element along with the endpoints of R1 and R3? So if R1[$-1] <
> pivot, then we're done, otherwise coalesce it with the pivot, and
> likewise with R3[0].
[...]
Working proof of concept:
import std.algorithm;
import std.array;
import std.stdio;
// Returns: final index of pivot
size_t partition(R)(R r)
{
auto pivot = r[0];
swap(r[0], r[$-1]); // move pivot to end
auto leftIdx = 0;
foreach (i; 0 .. r.length)
{
if (r[i] < pivot)
swap(r[i], r[leftIdx++]);
}
swap(r[leftIdx], r[$-1]);
return leftIdx;
}
R uniqueSort(R)(R r)
{
if (r.empty) return r;
// Partition
auto pivotIdx = partition(r);
auto pivot = r[pivotIdx];
// Recurse
auto left = uniqueSort(r[0 .. pivotIdx]);
auto right = uniqueSort(r[pivotIdx + 1 .. $]);
// Coalesce left ~ pivot ~ right.
if (!left.empty && left[$-1] == pivot)
left = left[0 .. $-1];
if (!right.empty && right[0] == pivot)
right = right[1 .. $];
r[left.length] = pivot;
auto vacant = copy(right, r[left.length + 1 .. $]);
return r[0 .. $ - vacant.length];
}
unittest
{
auto data = [4, 0, 9, 7, 2, 5, 1, 1, 3, 4, 2, 0, 9, 5, 3, 6, 1, 7, 4, 8];
auto sorted = data.uniqueSort();
assert(sorted == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]);
}
I'm not sure about the performance of this algorithm, though, because
the call to copy() might become prohibitive once you add up all the
recursive calls to uniqueSort().
Also, I only got one test case working; no idea if there are subtle bugs
that are lurking in there somewhere. Please test with more test cases.
:-)
T
--
"No, John. I want formats that are actually useful, rather than over-featured megaliths that address all questions by piling on ridiculous internal links in forms which are hideously over-complex." -- Simon St. Laurent on xml-dev
More information about the Digitalmars-d
mailing list