Normal/Gaussian random number generation for D

Wed Oct 24 14:23:50 PDT 2012

> It looks like it should be readily possible to integrate the 
> core Ziggurat functionality and to convert the normal() 
> function in your code to be a NormalRandomNumberEngine struct

I already have a NormalDist struct at 
https://github.com/jerro/phobos/blob/new-api/std/random.d#L2363 - 
converting that should be trivial.

>
> For the other distributions, my feeling is that in some cases 
> there's a value in also having this "engine" approach, e.g. for 
> exponentially-distributed numbers one could use Ziggurat or one 
> could use the approach
>
>     T u = uniform!("[)", T, T, UniformRandomNumberGenerator)(0.0, 1.0, urng);
>     return -log(1 - u)/lambda;
>
> ... which is not as fast but has a much lower memory footprint.

I agree that it's a good thing to design the API so that we can 
use different engines.
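
To make that concrete, here is a rough sketch of how the inversion 
approach quoted above could sit behind an interchangeable engine. 
The names InversionExponentialEngine and exponential are made up for 
illustration and are not from either of our branches; the only 
assumption is that every engine exposes the same opCall shape, so a 
Ziggurat-based engine could be passed in its place.

```d
import std.math : log;
import std.random : Random, uniform;

// Hypothetical engine: inverse-transform sampling, no lookup tables.
struct InversionExponentialEngine
{
    T opCall(T, Rng)(T lambda, ref Rng rng)
    {
        // u is in [0, 1), so 1 - u is in (0, 1] and log never sees zero.
        T u = uniform!"[)"(cast(T) 0, cast(T) 1, rng);
        return -log(1 - u) / lambda;
    }
}

// Hypothetical front end: works with any engine that has the same
// opCall signature as the one above.
T exponential(T, Engine, Rng)(T lambda, ref Engine engine, ref Rng rng)
{
    return engine(lambda, rng);
}

void main()
{
    auto rng = Random(42);
    InversionExponentialEngine engine;
    enum n = 100_000;
    double sum = 0;
    foreach (i; 0 .. n)
        sum += exponential(2.0, engine, rng);
    // The sample mean should be close to 1 / lambda = 0.5.
    assert(sum / n > 0.45 && sum / n < 0.55);
}
```

The engine here carries no state, which is where the low memory 
footprint comes from; a table-based engine would keep its tables as 
members or static data instead.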

> Can you expand on this, and maybe provide a reference?  I don't 
> doubt your code's effectiveness but I think where RNGs are 
> concerned we really need to be able to justify our algorithmic 
> choices.  There's too much literature out there showing how 
> commonly-used algorithms actually carry statistical flaws.

So far I have only tested the distribution of the output samples, 
which seems to be correct. I agree that we should try to avoid 
statistical flaws as much as possible. Do you know of any good 
tests for nonuniform random number generators?

> Bigger picture on my approach to non-uniform random number 
> distributions.  The goal is to have the following:
>
>     * Where useful, it should be possible to define and use multiple
>       different internal "engines" for generating random numbers from
>       the given distribution
>
>     * For each distribution, there should be a function interface and
>       a struct interface.
>
>     * The struct implementation should store the distribution
>       parameters and an instance of the internal engine (if any).
>
>     * The function implementation should have 2 versions: one which
>       allows the user to pass an engine of choice as input, one which
>       contains a static instance of the specified engine (hence,
>       thread-safe, distinguished according to both engine type and
>       underlying uniform RNG type).

>     * ... unless there's no call for distinct underlying engines, in
>       which case the function version just takes parameters and
>       uniform RNG :-)
>
>     * The struct version should be useful to couple with an RNG
>       instance to create an arbitrary random-number range à la
>       Boost's variate_generator class.

Seems OK to me. Maybe the function that uses a static engine 
should have one version that takes Rng as a parameter and one 
that uses rndGen, similar to how uniform() works now?
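
Roughly what I have in mind, as a sketch only. BoxMullerEngine is a 
stand-in I made up so the example is self-contained (it is plain 
Box-Muller, not the Ziggurat code), and the name normal and its 
parameter order are assumptions, not an agreed API.

```d
import std.math : cos, log, sqrt, PI;
import std.random : Random, rndGen, uniform;

// Stand-in engine (Box-Muller), used here only for illustration.
struct BoxMullerEngine
{
    T opCall(T, Rng)(T mean, T sigma, ref Rng rng)
    {
        // 1 - u lies in (0, 1], so log(u1) stays finite.
        T u1 = 1 - uniform!"[)"(cast(T) 0, cast(T) 1, rng);
        T u2 = uniform!"[)"(cast(T) 0, cast(T) 1, rng);
        return mean + sigma * sqrt(-2 * log(u1)) * cos(2 * PI * u2);
    }
}

// Version that takes the uniform RNG explicitly.
T normal(T, Rng)(T mean, T sigma, ref Rng rng)
{
    static BoxMullerEngine engine; // static data is thread-local in D
    return engine(mean, sigma, rng);
}

// Convenience version that falls back to rndGen, like uniform() does.
T normal(T)(T mean, T sigma)
{
    return normal(mean, sigma, rndGen);
}

void main()
{
    auto rng = Random(1);
    enum n = 100_000;
    double sum = 0, sumSq = 0;
    foreach (i; 0 .. n)
    {
        immutable x = normal(0.0, 1.0, rng);
        sum += x;
        sumSq += x * x;
    }
    immutable mean = sum / n;
    immutable variance = sumSq / n - mean * mean;
    assert(mean > -0.05 && mean < 0.05);
    assert(variance > 0.9 && variance < 1.1);

    immutable y = normal(0.0, 1.0); // draws from rndGen
    assert(y == y);                 // i.e. not NaN
}
```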

How would the static engines be initialized? They could be 
initialized on first use, but this would slow sample generation a 
bit. Another option is to initialize them in a static 
constructor. I think it would be best to make the engine a static 
field of a struct template, and initialize it in the struct's 
static constructor (similar to what my ZigguratTable struct 
does). That way you don't need to check whether the engine is 
initialized every time you generate a sample, and the engine isn't 
initialized unless the function template that uses it is 
instantiated.
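
A minimal sketch of that pattern follows. TableEngine, StaticEngine 
and lookup are hypothetical names, and the table contents are a dummy 
placeholder rather than real Ziggurat tables; the point is only where 
the initialization happens.

```d
// Hypothetical engine with setup work to do before first use.
struct TableEngine(T)
{
    T[] table;

    void initialize()
    {
        table = new T[](128);
        foreach (i, ref x; table)
            x = cast(T) i / table.length; // dummy contents
    }
}

// Holds one engine per instantiation. The static constructor runs
// once per thread at startup, and only for instantiations that are
// actually used somewhere in the program.
struct StaticEngine(Engine)
{
    static Engine instance; // thread-local by default

    static this()
    {
        instance.initialize();
    }
}

// No "is it initialized yet?" check is needed on this path.
T lookup(T)(size_t i)
{
    return StaticEngine!(TableEngine!T).instance.table[i];
}

void main()
{
    assert(StaticEngine!(TableEngine!double).instance.table.length == 128);
    assert(lookup!double(0) == 0.0);
    assert(lookup!double(64) == 0.5);
}
```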

> So, a nice way to expand on our respective approaches might be 
> to incorporate your Ziggurat, adapt it to my Normal engine API, 
> and for me to write a similar setup for 
> exponentially-distributed random numbers that uses the simple 
> approach above and can also use your Ziggurat implementation.
>
> What do you think?

I think we should do that. Do you already have some code 
somewhere that I can look at? I need to know what API the 
NormalRandomNumberEngine struct should have.