random cover of a range

dsimcha dsimcha at yahoo.com
Thu Feb 12 16:15:24 PST 2009


== Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s article
> Ok, following Leonardo's and Denis' ideas on covering an immutable array
> in random order, I thought about it and came up with one algorithm, in
> addition to the obvious one that stores one integer per element (see
> Denis' code).
> Here's the algorithm. It stores one bit per element and takes O(n log n)
> time to cover a range of length n.
> 1. Given a random-access range r of length n, create a bitmap with one
> bit per element in r. The range is initially all zeros.
> 2. At each step, select a random index in the _original_ array, i.e. a
> random number in 0 .. r.length. Then walk upwards (with wrapping at the
> end of the range) until you find a bit = 0 in the bitmap. Mark that bit
> = 1 and return the found element.
> 3. The cover ends when all bits in the bitmap are 1.
> I did a ballpark complexity estimate by approximating the average number
> of steps in (2) with n / (n - k), where k is the number of 1s in the
> bitmap. IIRC the average number of steps to hit a 1 in a binomial
> distribution with p = (n - k) / n is 1/p. Right? In that case, the total
> number of steps taken by the algo above is:
> N = n / 1 + n / 2 + ... + n / (n - 1) = n * (1 + 1/2 + 1/3 + ... + 1/(n
> - 1)) < O (n log n)
> Makes sense?
> If the range does not offer random access, I think this algorithm will
> have quadratic complexity because it needs to make a number of steps
> proportional to n at each iteration.
> Andrei

The concern I have with this scheme is that iterating over an array with this
cover would not produce an unbiased distribution of permutations.  A good
shuffling algorithm will produce a perfectly uniform distribution over all
possible permutations of the original range, assuming the underlying random number
generator is unbiased.  In other words, given an array of length N, every
permutation should have an equal probability 1/N!.

To get this unbiasedness, after selecting M values from a larger set N, the
probabilities of selecting any of the remaining N - M elements as element M + 1
must be uniform.  In this case, it is clearly not, since elements just after
elements that have already been selected are "bigger targets".  Similar problems
exist for naive implementations of the Fisher-Yates algorithm
(http://en.wikipedia.org/wiki/Fisher-Yates_shuffle).  In other words, unless one
is very careful, one will not get a truly unbiased, random permutation.

That said, for certain applications, what you propose may be "random enough."
However, if you do include it, please at least document that it is biased and
should not be used for things like monte carlo simulations or gambling, where
people might care.



More information about the Digitalmars-d mailing list