how to get top N distinct elements from range?

jerro a at a.com
Fri Mar 8 06:43:27 PST 2013


On Friday, 8 March 2013 at 13:33:24 UTC, Andrea Fontana wrote:
> I wonder if exists a way to get top n distinct elements from a 
> range (infinite too!)

It's impossible to do that for infinite ranges

> A (not efficient) way to to this is 
> range.array.sort.uniq.take(n) but it's a bit overkill, it sorts 
> elements, and of course doesn't work with infinite ranges. Am i 
> missing any function?

You could do something like (untested code):

for(size_t m = 0, mnext = n; ;
     n = mnext, mnext = min(2 * mnext, range.length))
{
     partialSort!condition(range[m .. $], mnext - m);
     if(range[0 .. mnext].uniq.count >= n)
         break;

     assert(mnext < range.length);
}

auto topN = range.uniq.take(n);

You will need a random access range with slicing for that, and it 
will modify it. If I am not mistaken, this has time complexity 
O(log(ndup / n) * range.length), where ndup is the number of all 
elements (including duplicates) with values among the top n.


More information about the Digitalmars-d-learn mailing list