Shuffle
Walter Bright
newshound1 at digitalmars.com
Fri Jan 25 12:05:07 PST 2008
Frits van Bommel wrote:
> As mentioned by Oskar, this won't generate uniformly random lists.
> However, both your algorithm and his version have an additional flaw:
> both shuffle the entire list when you only need a random prefix (i.e.
> you only need to select as many random files as will fit onto your
> second disk).
>
> Try this small modification to the original algorithm instead:
> -----
>
> /* The loop will normally quit via an exception when the target
> * device is full. But if there are not enough files in the source
> * to fill up the target, the loop ensures it will still eventually
> * quit.
> */
> while (files.length > 0)
> {
> auto j = std.random.rand() % files.length;
> auto fromfile = files[j];
> auto tofile = std.path.join(todir, basename(fromfile));
> writefln("%s => %s", fromfile, tofile);
> std.file.copy(fromfile, tofile);
>
> // Remove copied file from the list
> files[j] = files[$-1];
> files = files[0 .. $-1];
> }
> -----
You're right, but I prefer having the shuffle bit be separate so people
will see how to do a shuffle.
> Though technically, this will still prefer the files at the front of the
> list during each iteration because (rand() % files.length) is more
> likely to be a number <= (RAND_MAX/files.length) than a larger number.
> (In fact, for RAND_MAX < files.length the higher-ups will *never* be
> chosen during that iteration)
> However, this shouldn't be a significant problem as long as your number
> of files is much less than RAND_MAX; and RAND_MAX appears to be uint.max
> for std.random, so for any reasonable music collection it should be
> close enough.
You're right again. (This algorithm is obviously harder than it looks!)
> To fix that, ignore the result of rand() if it's > ((RAND_MAX /
> files.length) * files.length), for example by replacing the first line
> of the loop with:
> ---
> uint r = std.random.max;
> // assumes RAND_MAX == uint.max
> auto max = (uint.max / files.length) * files.length;
> while (r > max)
> r = std.random.rand();
> auto j = r % files.length;
> ---
> (also, add "assert(files.length <= RAND_MAX);" before the loop if
> RAND_MAX < uint.max)
I think it's (r >= max).
More information about the Digitalmars-d-announce
mailing list