sortUniq

Andrei Alexandrescu via Digitalmars-d digitalmars-d at puremagic.com
Thu Jan 22 14:29:58 PST 2015


On 1/22/15 2:06 PM, Justin Whear wrote:
> On Thu, 22 Jan 2015 13:40:56 -0800, Andrei Alexandrescu wrote:
>
>> There's this classic patter on Unix: |sort|uniq, i.e. sort some data and
>> only display the unique elements.
>>
>> What would be a better integrated version - one that does sorting and
>> uniq in one shot? I suspect the combination could be quite a bit better
>> than doing the two in sequence.
>>
>> A few google searches didn't yield much. Ideas?
>>
>>
>> Thanks,
>>
>> Andrei
>
> Efficiency improvement-wise, perhaps a generalization of a counting sort
> (http://en.wikipedia.org/wiki/Counting_sort), see "Variant algorithms."

Thanks. I'm thinking of something based on general comparisons. 
Something like a modified quicksort with fat pivot:

A. Partition into three subranges: R1<pivot, R2=pivot, R3>pivot

B. Recurse on R1 obtaining R11, a subrange of R1 containing only unique 
elements.

C. Sort-move-unique R2 into the portion after R11, obtaining the result.

A and B seem fine, I'm unclear about C. It would be an algorithm that 
sorts and simultaneously moves data in a range that overlaps the input.


Andrei



More information about the Digitalmars-d mailing list