Sampling algorithms for D
bearophile
bearophileHUGS at lycos.com
Thu Apr 12 12:54:01 PDT 2012
Joseph Rushton Wakeling:
> https://github.com/WebDrake/SampleD
Some comments on the details of your code:
> import std.c.time;
In D there there is also the syntax:
> import std.c.time: foo, bar, baz;
That tells the person that reads the code what names are used.
----------------------
> interface Sampler(UniformRandomNumberGenerator) {
The "UniformRNG" name induces shorter lines in your module and I think it's easy enough to understand.
----------------------
> double uniform_open(UniformRandomNumberGenerator)
> (double a, double b, ref UniformRandomNumberGenerator urng)
> {
In D function/method names are pascalCased. I also suggest to indent the second line, and unless you plan to modifying a and b inside that, to always mark them as "in" (or const, or even immutable):
double uniformOpen(UniformRNG)
(double a, double b, ref UniformRNG urng)
{
----------------------
> do {
> x = uniform(a,b,urng);
> } while (x==a);
I suggest to put a space after a comma and before and after an operator:
do {
x = uniform(a, b, urng);
} while (x == a);
----------------------
> class Skipper(UniformRandomNumberGenerator) : Sampler!(UniformRandomNumberGenerator) {
When there is only one template argument you are allowed to omit the (), and generally in English there is no need for a space before a ":":
class Skipper(UniformRNG): Sampler!UniformRNG {
----------------------
> size_t S = skip(urng);
>
> sampleRemaining--;
> recordsRemaining -= (S+1);
> currentRecord += S;
Unless you plan to modify a variable, then on default define it as immutable, and where that's not possible, as const, this avoids mistakes later, makes the code simpler to understand, and helps the compiler too:
immutable size_t S = skip(urng);
sampleRemaining--;
recordsRemaining -= S + 1;
currentRecord += S;
----------------------
> final size_t records_remaining() { return recordsRemaining; }
> final size_t records_total() { return recordsTotal; }
Generally where possible add "const", "pure" and "nothrow" attributes to functions/methods every where you can (and in some cases @safe too, if you want):
final size_t records_remaining() const pure nothrow { return recordsRemaining; }
final size_t records_total() const pure nothrow { return recordsTotal; }
----------------------
> protected:
> size_t currentRecord = 0;
> size_t recordsRemaining;
All D variables are initialized (unless you ask them to not be initialized, using "= void"), so the first "= 0" is not needed (but initializing D floating point values to zero is often necessary):
protected:
size_t currentRecord;
size_t recordsRemaining;
----------------------
> y1 = pow( (uniform_open(0.0,1.0, urng) * (cast(double) recordsRemaining) / qu1),
> (1.0/(sampleRemaining - 1)) );
In D there is also the ^^ operator.
----------------------
> for( t=recordsRemaining-1; t>=limit; --t)
> y2 *= top--/bottom--;
Generally packing mutation of variables inside expressions is quite bad style. It makes code less easy to understand and translate, and currently it's not defined, just as in C (it will be defined, but it will keep being bad style).
for (t = recordsRemaining - 1; t >= limit; t--) {
y2 *= top / bottom;
top--;
bottom--;
}
Also in D there is (I hope it will not be deprecated) foreach_reverse(), maybe to replace this for().
----------------------
> writeln("Hello! I'm a very simple and stupid implementation of some clever");
> writeln("sampling algorithms.");
> writeln();
> writeln("What I'm going to do first is just take a sample of 5 records out of");
> writeln("100 using each of the algorithms.");
> writeln();
D string literals are multi-line too:
writeln(
"Hello! I'm a very simple and stupid implementation of some clever
sampling algorithms.
What I'm going to do first is just take a sample of 5 records out of
100 using each of the algorithms.
");
----------------------
> sampling_test_simple!(VitterA!Random,Random)(100,5,urng);
Currently your code doesn't work if you want to use a Xorshift generator.
----------------------
> sampling_test_aggregate!(VitterA!Random,Random)(100,5,urng,10000000,true);
> writeln();
> sampling_test_aggregate!(VitterD!Random,Random)(100,5,urng,10000000,true);
> writeln();
I suggest to define one enum inside the main(), like this, with underscores too, to improve readability, and use it in all the function calls:
enum int N = 10_000_000;
samplingTestAggregate!(VitterA!Random, Random)(100, 5, urng, N, true);
Bye,
bearophile
More information about the Digitalmars-d-learn
mailing list