[OT] Algorithm question

MysticZach via Digitalmars-d digitalmars-d at puremagic.com
Mon May 1 09:56:58 PDT 2017


On Monday, 1 May 2017 at 04:15:35 UTC, H. S. Teoh wrote:
> Given a set A of n elements (let's say it's a random-access 
> range of
> size n, where n is relatively large), and a predicate P(x) that
> specifies some subset of A of elements that we're interested 
> in, what's
> the best algorithm (in terms of big-O time complexity) for 
> selecting a
> random element x satisfying P(x), such that elements that 
> satisfy P(x)
> have equal probability of being chosen? (Elements that do not 
> satisfy
> P(x) are disregarded.)

Here's how I would do it:

// returns a random index of array satisfying P(x), -1 if not 
found
int randomlySatisfy(A[] array) {
    if (array.length == 0)
       return -1;
    int i = uniform(0, array.length);
    return randomlySatisfyImpl(array, i);
}

// recursive function
private int randomlySatisfyImpl(A[] da, int i) {
    if (P(da[i]))
       return i;
    if (da.length == 1)
       return -1;

    // choose a random partition proportionally
    auto j = uniform(da.length - 1);
    if (j < i) {
       // the lower partition
       int a = randomlySatisfyImpl(da[0..i], j);
       if (a != -1) return a;
       else return randomlySatisfyImpl(da[i+1 .. da.length], j - 
(i + 1));
    }
    else {
       // higher partition, investigate in reverse order
       int a = randomlySatisfyImpl(da[i+1 .. da.length], j - (i + 
1));
       if (a != -1) return i +1 + a;
       else return i + 1 + randomlySatisfyImpl(da[0..i], j);
    }
}

The goal is to have the first hit be the one you return. The 
method: if a random pick doesn't satisfy, randomly choose the 
partition greater than or less than based on 
uniform(0..array.length-1), and do the same procedure on that 
partition, reusing the random index to avoid having to call 
uniform twice on each recursion (one to choose a partition and 
one to choose an index within that partition). If the probability 
of choosing a partition is precisely proportional to the number 
of elements in that partition, it seems to me that the result 
will be truly random, but I'm not sure.



More information about the Digitalmars-d mailing list