random k-sample of a file

BCS ao at pathlink.com
Thu Oct 9 14:13:03 PDT 2008


Reply to Andrei,

> I just ran across a nice little problem that I wanted to share as an
> exercise for the interested in futile pastimes.
> 
> The challenge, should you accept it, is to write a program that given
> a number k and a file, outputs k lines picked uniformly at random from
> the file. (If the file has less than k lines, it should all be
> output.) The file's size is unbounded so loading it in memory is not
> an option. How'd you go about it?
> 
> Andrei
> 

struct S
{
 char[] line;
 real v;
 int opCmp(S s)
 {
  if(s.v < this.v) return -1;
  else if(s.v > this.v) return +1;
  return 0;
 }
}

int k;
S[] buf = new S[k+1];

foreach(s;buf) s.v = real.max;

foreach(char[] line; stream)
{
    delete buf[$-1].line;
    buf[$-1].line = line.dup;;
    buf[$-1].v = reaqlRand();  // returns v > real.min and v < real.max
    buf.sort;
}

foreach(s;buf) if(s.v != real.max; writef("%s\n", s.line);





More information about the Digitalmars-d mailing list