std.random2

Wed Jan 9 13:15:04 PST 2013

I wrote what became the tango random* modules.
In it I did use a struct based approach for the core thing, but the
actual generator is then a class.

This because that way I was able to efficiently combine several
generators, and even thread safety, while keeping the actual object used
a class.
In my opinion reference semantic is the correct semantic for random
number generator
because normally you want random numbers, and repeating the sequence
(because you unwittingly copied the rng) is an error.
To repeat the sequence one should have functions to store the state and
reset with a stored state.

I have implemented gaussian distribution and a couple of others, and
also other generators, because I find the twister one very inflexible,
yes one would like a large internal state to be able to generate more
random sequences, and avoid collisions, but sometime one doesn't
want a very large state if he needs many different generators, or needs
to store and restart generators often.
 From this discussion you might imagine that the default generator is not
twister based.

I did use a final class for performance reasons, one could create
a non templatized superclass to let non perfrmance sensitive users
avoid templatization.

I wrote everything for D1, and Tango-D2 has the D2 port, but most likely
some decisions should be revised for a real D2 rewrite.
I am willing to give my code with the phobos license, so feel free to
borrow.

I wrote the code splitting it up in several modules with a main "exporter"
module for the normal user. I like this approach (and often also splitting
tests outside the main module so they aren't parsed in the normal case),
but that is obviously agianst the phobos philosophy.

ciao
Fawzi
On 01/08/13 18:52, Joseph Rushton Wakeling wrote:
> Hello all,
>
> Following discussion on the pull request adding normal random number 
> generation to std.random [ 
> https://github.com/D-Programming-Language/phobos/pull/1029 ], some 
> issues were raised which are best discussed with the whole community.
>
> The heart of this is the design of pseudo-random number generators 
> (PRNGs) in Phobos.  Currently these are implemented as value types, 
> which has a number of unpleasant effects:
>
>    -- They're expensive to pass around (some PRNGs have a size of a MB 
> or more)
>
>    -- Passing by value is statistically unsafe, as it can result in 
> identical
>       random sequences being generated in different parts of the 
> code.  This
>       already affects at least one part of std.random itself: see,
> http://d.puremagic.com/issues/show_bug.cgi?id=8247
> http://forum.dlang.org/thread/oiczxzkzxketxitncghl@forum.dlang.org
>
>    -- Simply passing by reference isn't an adequate solution, as there 
> will be
>       cases (as with RandomSample, detailed in bug 8247) where you 
> have to
>       store the RNG.  Storing e.g. a pointer or reference would be 
> unsafe; the
>       only adequate solution is that PRNGs be (safe) reference types.
>
> monarch_dodra did some work on this which was set aside in the short 
> term because it would (unavoidably) be a breaking change [ 
> https://github.com/D-Programming-Language/phobos/pull/893 ].  To avoid 
> this, the proposed solution is to create a std.random2.
>
> However, these issues seem to me to have broader implications for the 
> design of random-number functionality in Phobos, so if we're going to 
> do a std.random2, it's worth mapping them out so as to get them right.
>
> The most obvious (to me) is that these issues which apply to PRNGs 
> apply equally well to random number distributions.  For example the 
> Ziggurat algorithm requires storing several hundred constants, so 
> passing by value is expensive; and several different algorithms 
> generate and store multiple random variates at a time, so 
> copying/passing by value will result in unintended correlations in 
> sequences of variates.
>
> This has further implications if for example we want to create a 
> VariateGenerator (or perhaps in D-ish terms, a VariateRange) which 
> couples a random distribution with a PRNG -- this is unlikely to work 
> unless both the random distribution and the PRNG are reference types.
>
> Finally, there are more general issues about how new functionality 
> should be implemented.  C++11 is given as a model in the std.random 
> documentation, but this is clearly a guide rather than something to 
> copy blindly -- existing functions and structs already deviate from it 
> in ways that reflect D's preference for ranges and its superior 
> generics.  We need a clear picture of how to do this for functionality 
> that has not yet been implemented.
>
> For example: in the case of random distributions, the current example 
> of uniform() offers only a function interface and therefore little 
> guidance about how to create struct implementations for more complex 
> algorithms which require persistent storage (e.g. Ziggurat or 
> Box-Muller).  Should they follow C++11/Boost.Random in returning 
> variates via opCall, or should they be coupled with a PRNG at 
> construction time (as with RandomSample) and implement a range of 
> variates?
>
> My inclination here is to take some time to map out the different 
> interface/design options and to present the choices to the community 
> for review as a precursor to creating a std.random2.  It seems the 
> only really sensible choice to ensure that we get a good future-proof 
> design.
>
> What does everyone think?
>
> Best wishes,
>
>     -- Joe