Random sampling in Phobos
Lars T. Kyllingstad
public at kyllingen.net
Tue Apr 17 22:50:09 PDT 2012
On Wednesday, 18 April 2012 at 05:45:11 UTC, Lars T. Kyllingstad
wrote:
> On Wednesday, 18 April 2012 at 01:17:18 UTC, Joseph Rushton
> Wakeling wrote:
>> On 17/04/12 18:25, Joseph Rushton Wakeling wrote:
>>> On 17/04/12 17:31, Andrei Alexandrescu wrote:
>>>> Actually that's not correct. RandomSample works fine with an
>>>> input range and
>>>> does not keep it in memory.
>>>
>>> Ahh, OK. I should have anticipated this as the output also
>>> works as a range.
>>
>> A query on this point, as it looks like this can have some
>> unfortunate side-effects.
>>
>> If I write e.g.
>>
>> auto s = randomSample(iota(0,100), 5);
>>
>> foreach(uint i; s)
>> writeln(i);
>>
>> writeln();
>>
>> foreach(uint i; s)
>> writeln(i);
>>
>> ... then it's noticeable that the _first_ element of the two
>> sample lists is identical, while the others change. If I put
>> in place a specified source of randomness, e.g.
>>
>> auto s = randomSample(iota(0,100), 5, rndGen);
>>
>> ... then the numbers stay the same for both lists.
>>
>> I'm presuming that the first element remains identical because
>> it's defined in the constructor rather than elsewhere, but
>> it's not clear why passing a defined RNG produces identical
>> output on both occasions.
>>
>> The effect can be worse with Vitter's Algorithm D because
>> often, having defined the first element, the second may be
>> derived deterministically -- meaning the first 2 elements of
>> the sample are identical.
>>
>> I'm sure that the above use of the RandomSample struct is not
>> the advised use, but it's still disconcerting to see this.
>
> I've run into this trap more than once. :) You have to pass
> the random number generator by ref, otherwise you are just
> generating two identical sequences of random numbers. Just
> change the randomSample signature to:
>
> auto randomSampleVitter(R, Random)(R r, size_t n, ref
> Random gen)
>
> -Lars
Hmmm... I see now that randomSample() is defined without ref in
Phobos as well. Would you care to check whether my suggestion
fixes your problem? If so, it is a bug in Phobos that should be
fixed.
-Lars
More information about the Digitalmars-d
mailing list