random cover of a range

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Thu Feb 12 16:22:15 PST 2009


dsimcha wrote:
> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s article
>> Ok, following Leonardo's and Denis' ideas on covering an immutable array
>> in random order, I thought about it and came up with one algorithm, in
>> addition to the obvious one that stores one integer per element (see
>> Denis' code).
>> Here's the algorithm. It stores one bit per element and takes O(n log n)
>> time to cover a range of length n.
>> 1. Given a random-access range r of length n, create a bitmap with one
>> bit per element in r. The range is initially all zeros.
>> 2. At each step, select a random index in the _original_ array, i.e. a
>> random number in 0 .. r.length. Then walk upwards (with wrapping at the
>> end of the range) until you find a bit = 0 in the bitmap. Mark that bit
>> = 1 and return the found element.
>> 3. The cover ends when all bits in the bitmap are 1.
>> I did a ballpark complexity estimate by approximating the average number
>> of steps in (2) with n / (n - k), where k is the number of 1s in the
>> bitmap. IIRC the average number of steps to hit a 1 in a binomial
>> distribution with p = (n - k) / n is 1/p. Right? In that case, the total
>> number of steps taken by the algo above is:
>> N = n / 1 + n / 2 + ... + n / (n - 1) = n * (1 + 1/2 + 1/3 + ... + 1/(n
>> - 1)) < O (n log n)
>> Makes sense?
>> If the range does not offer random access, I think this algorithm will
>> have quadratic complexity because it needs to make a number of steps
>> proportional to n at each iteration.
>> Andrei
> 
> The concern I have with this scheme is that iterating over an array with this
> cover would not produce an unbiased distribution of permutations.  A good
> shuffling algorithm will produce a perfectly uniform distribution over all
> possible permutations of the original range, assuming the underlying random number
> generator is unbiased.  In other words, given an array of length N, every
> permutation should have an equal probability 1/N!.
> 
> To get this unbiasedness, after selecting M values from a larger set N, the
> probabilities of selecting any of the remaining N - M elements as element M + 1
> must be uniform.  In this case, it is clearly not, since elements just after
> elements that have already been selected are "bigger targets".  Similar problems
> exist for naive implementations of the Fisher-Yates algorithm
> (http://en.wikipedia.org/wiki/Fisher-Yates_shuffle).  In other words, unless one
> is very careful, one will not get a truly unbiased, random permutation.
> 
> That said, for certain applications, what you propose may be "random enough."
> However, if you do include it, please at least document that it is biased and
> should not be used for things like monte carlo simulations or gambling, where
> people might care.

I can't believe I fell for this. I knew Fisher-Yates and used it in 
randomShuffle!

I won't put a subtly skewed distribution in the standard library. The 
quest for finding a random cover of an array with as little extra memory 
consumed and with good complexity is on! I'd appreciate any ideas.


Andrei



More information about the Digitalmars-d mailing list