randomSample
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Sat May 23 18:51:46 PDT 2009
dsimcha wrote:
> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s article
>> I'm writing a range to select a random sampling of a forward range. The
>> number of elements in the forward range is already known.
>> The strategy is: at any moment, I know I need to select k random samples
>> out of n. To implement popFront, I generate a random number in [0, n -
>> k]. That's the probability that the next item in the input will be part
>> of the selection. If the random number is 0, then the element got
>> selected. Otherwise, I must ignore that guy so I popFront the input
>> range, decrease n by one, and repeat.
>> This seems reasonable but somehow the selection comes skewed: the
>> elements towards the beginning of the range are less represented than
>> the ones towards the end. Where am I wrong?
>> Andrei
>
> I'm confused. As far as I can tell, RandomCover is correct in 2.030. I both read
> the code and did a monte carlo test for some short sequences to see if the
> distribution was uniform on the space of possible permutations. Isn't this just a
> case of using RandomCover, but stopping before iterating through the entire range?
> In other words, wouldn't the following work:
>
> auto selectK(R)(R someRange, uint K, Random gen) {
> assert(K <= someRange.length);
> return take(K, randomCover(someRange, gen));
> }
You are right. I'm just less than happy with randomCover because it
needs additional memory, so I eliminated it as a possibility from the
get-go.
Andrei
More information about the Digitalmars-d
mailing list