Random sampling in Phobos
Joseph Rushton Wakeling
joseph.wakeling at webdrake.net
Tue Apr 17 15:36:55 PDT 2012
On 17/04/12 18:25, Joseph Rushton Wakeling wrote:
> On 17/04/12 17:31, Andrei Alexandrescu wrote:
>> Actually that's not correct. RandomSample works fine with an input range and
>> does not keep it in memory.
>
> Ahh, OK. I should have anticipated this as the output also works as a range.
A query on this point, as it looks like this can have some unfortunate side-effects.
If I write e.g.
auto s = randomSample(iota(0,100), 5);
foreach(uint i; s)
writeln(i);
writeln();
foreach(uint i; s)
writeln(i);
... then it's noticeable that the _first_ element of the two sample lists is
identical, while the others change. If I put in place a specified source of
randomness, e.g.
auto s = randomSample(iota(0,100), 5, rndGen);
... then the numbers stay the same for both lists.
I'm presuming that the first element remains identical because it's defined in
the constructor rather than elsewhere, but it's not clear why passing a defined
RNG produces identical output on both occasions.
The effect can be worse with Vitter's Algorithm D because often, having defined
the first element, the second may be derived deterministically -- meaning the
first 2 elements of the sample are identical.
I'm sure that the above use of the RandomSample struct is not the advised use,
but it's still disconcerting to see this.
More information about the Digitalmars-d
mailing list