Vitter's algorithm for random sampling

Jon Degenhardt jond at
Sat Sep 14 00:45:37 UTC 2019

On Friday, 13 September 2019 at 21:49:45 UTC, Andrei Alexandrescu 
> Interesting. I think it has a place in std.random, and also it 
> might help improve some of the existing stuff in there.

tsv-sample in eBay's tsv utilities contains a number of sampling 
implementations, including several classic algorithms. They use 
std.random facilities wherever possible. It does not contain an 
implementation of Vitter's algorithm D, because algorithm D 
requires knowing the record set size ahead of time. However one 
of the algorithms, bernoulli skip sampling, does use the "skip" 
mechanism listed in section 2, para 2 of Vitter's paper.

The code documentation contains references for the different 
algorithms used. Like all the tsv-utils stuff, they are fast. No 
published benchmarks though.

* User documentation:
* Source code:
* Code documentation (via Adam's doc generator):


More information about the Digitalmars-d mailing list