[OT] Algorithm question
John Colvin via Digitalmars-d
digitalmars-d at puremagic.com
Mon May 1 01:42:05 PDT 2017
On Monday, 1 May 2017 at 04:15:35 UTC, H. S. Teoh wrote:
> I'm been thinking about the following problem, and thought I'd
> pick the brains of the bright people around these parts...
>
> Given a set A of n elements (let's say it's a random-access
> range of
> size n, where n is relatively large), and a predicate P(x) that
> specifies some subset of A of elements that we're interested
> in, what's
> the best algorithm (in terms of big-O time complexity) for
> selecting a
> random element x satisfying P(x), such that elements that
> satisfy P(x)
> have equal probability of being chosen? (Elements that do not
> satisfy
> P(x) are disregarded.)
>
> Which elements of A satisfy P(x) is not known ahead of time,
> nor is the
> relative proportion of elements that satisfy P(x) or not.
>
> Note that A is considered to be immutable (swapping elements or
> otherwise changing A is not allowed).
>
> So far, I have come up with the following algorithms:
>
> 1) "Random shooting":
>
> auto r = ... /* elements of A */
> for (;;) {
> auto i = uniform(0, r.length);
> if (P(r[i])) return r[i];
> }
>
> Advantages: If a large percentage of elements in A satisfy
> P(x), then
> the loop will terminate within a small number of iterations;
> if the
> majority of elements satisfy P(x), this will terminate well
> below n
> iterations.
>
> Disadvantages: If only a small percentage of elements
> satisfy P(x),
> then this loop could take arbitrarily long to terminate, and
> it will
> not terminate at all if no elements satisfy P(x).
You can recover O(n) calls to P by caching the results. You can
recover O(n) for both P and rng by progressively removing
elements from r or any other method that results in an easily
samplable set of all elements of r not yet visited (and therefore
possible candidates).
I like this one (warning, just thought of, untested):
auto r = ... /* elements of A */
auto nRemaining = r.length;
while (nRemaining) {
auto i = uniform(0, nRemaining);
if (P(r[i])) return r[i];
else swap(r[i], r[--nRemaining]);
}
> 3) Permutation walk:
>
> auto r = ... /* elements of A */
> foreach (i; iota(0 .. r.length).randomPermutation) {
> if (P(r[i])) return r[i];
> }
> /* no elements satisfy P(x) */
>
> Advantages: if an element that satisfies P(x) is found
> early, the
> loop will terminate before n iterations. This seems like the
> best of
> both worlds of (1) and (2), except:
>
> Disadvantages: AFAIK, generating a random permutation of
> indices from
> 0 .. n requires at least O(n) time, so any advantage we may
> have had
> seems to be negated.
>
> Is there an algorithm for *incrementally* generating a random
> permutation of indices? If there is, we could use that in (3)
> and thus achieve the best of both worlds between early
> termination if an element satisfying P(x) is found, and
> guaranteeing termination after n iterations if no elements
> satisfying P(x) exists.
>
>
> T
Yes. As a matter of fact it would be the logical extension of my
algorithm above for when you can't or don't want to change r
(again, only just thought of this, untested...):
auto r = ... /* elements of A */
auto indices = iota(r.length).array;
auto nRemaining = r.length;
while (nRemaining) {
auto i = uniform(0, nRemaining);
if (P(r[indices[i]])) return r[indices[i]];
else swap(indices[i], indices[--nRemaining]);
}
You could also do the same thing but with an array of pointers to
elements of r instead of indices.
More information about the Digitalmars-d
mailing list