Random string samples & unicode

bearophile bearophileHUGS at lycos.com
Fri Sep 10 15:10:56 PDT 2010


The need to take a random sample without replacement is very common. For example, this is how in Python 2.x I create a fixed-size random string, sampled without replacement, from an input string of chars:

from random import sample
d = "0123456789"
print "".join(sample(d, 2))


This seems to be similar D2 code:

import std.stdio, std.random, std.array, std.range;
void main() {
    dchar[] d = "0123456789"d.dup;
    dchar[] res = array(take(randomCover(d, rndGen), 2));
    writeln(res);
}


Here randomCover() doesn't work with a string, a dstring, or a char[]. If you later need to process that res dchar[] with std.string, you will have trouble.


But randomShuffle() is able to shuffle a char[] in place:

import std.stdio, std.random;
void main() {
    char[] d = "0123456789".dup;
    randomShuffle(d);
    writeln(d);
}


If randomCover() receives a char[], I think in theory it has to yield its shuffled chars. And if it receives a string, it has to yield its shuffled dchars (converted from the chars). A string may contain UTF-8 characters that are longer than 1 byte, but a char[] is not a string, and if you want its items in random order, it has to act like randomShuffle().
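
To make the two behaviours concrete, here is a rough Python 3 sketch of the difference between sampling by code point (what I'd expect for a string) and sampling by UTF-8 code unit (what shuffling a char[] amounts to). The helper names sample_code_points, sample_code_units and is_valid_utf8 are made up for this illustration and are not part of any library; the fixed seed is only there so runs are repeatable.

```python
import random

def sample_code_points(text, k, rng):
    """Pick k distinct positions by code point; the result is always valid text."""
    return "".join(rng.sample(list(text), k))

def sample_code_units(text, k, rng):
    """Pick k distinct positions by UTF-8 byte; this can split a multi-byte character."""
    data = text.encode("utf-8")
    return bytes(rng.sample(list(data), k))

def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

rng = random.Random(0)  # hypothetical fixed seed, for repeatability only
text = "a\u00e9"        # "a" plus e-acute: 2 code points, 3 UTF-8 bytes (0x61, 0xC3, 0xA9)

by_point = sample_code_points(text, 2, rng)  # some ordering of the 2 characters
by_unit = sample_code_units(text, 3, rng)    # some ordering of the 3 bytes

print(repr(by_point), is_valid_utf8(by_point.encode("utf-8")))
print(repr(by_unit), is_valid_utf8(by_unit))
```

The code point version always gives back well-formed text. The code unit version is well-formed only if the two bytes of the accented character happen to stay adjacent and in order, and a sample that takes just one of them (for example the lone byte 0xC3) is never valid UTF-8.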

My head hurts, and I don't know what the right thing to do is.

Maybe I have to work with ubyte[] instead of char[], and add casts:

import std.stdio, std.random, std.array, std.range;
void main() {
    char[] d = "0123456789".dup;
    char[] res = cast(char[])array(take(randomCover(cast(ubyte[])d, rndGen), 2));
    writeln(res);
}
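
The cast trick treats the text as raw bytes, so I think it is only safe when every character is a single byte (plain ASCII, like the digits above). A Python 3 sketch of the same idea with that assumption checked explicitly; sample_ascii is a hypothetical helper name, not something from a library:

```python
import random

def sample_ascii(text, k, rng=random):
    """Sample k distinct positions from text at the byte level.

    Refuses non-ASCII input, where one character spans several bytes
    and a byte-level sample could split it.
    """
    data = text.encode("utf-8")
    if len(data) != len(text):
        # At least one character needed more than one byte.
        raise ValueError("non-ASCII input: byte-level sampling could split a character")
    return bytes(rng.sample(list(data), k)).decode("ascii")

res = sample_ascii("0123456789", 2)
print(res)
```

With an input such as "a\u00e9" the function raises ValueError instead of returning possibly broken text, which is the check the bare casts in the D version do not perform.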


Ideas welcome.

Bye,
bearophile

