random k-sample of a file

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Thu Oct 9 14:07:59 PDT 2008


bearophile wrote:
> Andrei Alexandrescu:
>> But where's the D code you guys?
> 
> My third converted to D, using my libs:
> 
> import std.conv, d.all;
> 
> void main(string[] args) {
>     assert(args.length == 3);
>     string filename = args[1];
>     int k = toInt(args[2]);
>     assert (k > 0);
>
>     string[] chosen_lines;
>     foreach (i, line; xfile(filename))
>         if (i < k)
>             chosen_lines ~= line;
>         else if (fastRandom() < (1.0 / (i+1)))
>             chosen_lines[randInt(k-1)] = line;
>
>     putr(chosen_lines);
> }
> 
> 
>> D is better than Python at scripting. Ahem.
> 
> I know many languages, and so far I have never found something better
> than Python to create working prototypes.

True men only write D. Your solution looks eerily similar to mine.

#!/usr/bin/rdmd

import std.stdio, std.contracts, std.conv, std.random;

void main(string[] args) {
     invariant k = parse!(uint)(args[1]);
     enforce(k > 0, "Must pass a strictly positive selection size");
     string[] selection;
     auto gen = Random(unpredictableSeed);

     foreach (ulong tally, char[] line; lines(stdin)) {
         if (selection.length < k) {
             // Selection not full; add to it
             selection ~= line.idup;
         } else {
             // uniform's default interval is [a, b), so t is in [0, tally]
             auto t = uniform(gen, 0, tally + 1);
             if (t >= k) continue; // no luck: keep with probability k / (tally + 1)
             // Replace a random element in the selection
             selection[uniform(gen, 0, k)] = line.idup;
         }
     }

     // Print selection
     foreach (s; selection) write(s);
}


