Normal/Gaussian random number generation for D
jerro
a at a.com
Tue Nov 6 10:14:56 PST 2012
>> I don't see any downside to this.
>
> Which "this" do you mean? My current approach, or the adding
> of an extra separate function? :-)
Your current approach.
>> I have only been thinking about the Ziggurat algorithm, but
>> you are right, it
>> does depend on the details of the technique. For Box-Muller
>> (and other engines
>> that cache samples) it only makes sense to compute the first
>> samples in the
>> opCall. But for the Ziggurat algorithm, tables that must be
>> computed before you
>> can start sampling aren't changed during sampling and
>> computing the tables
>> doesn't require any additional arguments. So it makes the most
>> sense for those
>> tables to be initialized in the struct's constructor in the
>> struct based API.
> So we should assume by default then that the struct's
> constructor should take an RNG as input, to enable it to
> calculate these first values if it needs to?
No, I think that if the engine defines initialize() function
(with no parameters), it should be called in the constructor of
Normal. I don't think the constructor of Normal should take an
RNG as input. I think what you currently do in normal.d is fine,
I would just add something like this:
static if(is(typeof(_engine.initialize())))
_engine.initialize();
to Normal's constructor. Then ZigguratEngine can define
initialize() and do its initialization there (it doesn't need a
uniform RNG for initialization) and NormalBoxMullerEngine can
remain unchanged.
>> In my previous post I was talking about initializing static
>> instances of the
>> engine used in the normal() function. The advantage of
>> initializing in a static
>> constructor is that you don't need an additional check every
>> time the normal()
>> function is called. But because we will also have a struct
>> based API, that will
>> not require such checks (at least not for all engines), this
>> isn't really that
>> important. So we can also initialize the global engine
>> instance in a call to
>> normal(), if this simplifies things.
> I guess my feeling here is that the values generated by an RNG
> should depend on when it is called, and not at all on when it
> is instantiated.
>
> i.e. if I do something like
>
> auto nrng = Normal!()(0, 1);
> writeln( uniform(0.0, 1.0) );
> writeln( uniform(0.0, 1.0) );
> writeln( nrng() );
> writeln( nrng() );
>
> I should get the same output as if I do,
>
> writeln( uniform(0.0, 1.0) );
> writeln( uniform(0.0, 1.0) );
> auto nrng = Normal!()(0, 1);
> writeln( nrng() );
> writeln( nrng() );
>
> You can also think that if I change from e.g.
>
> auto nrng = Normal!(real, Engine1)(0, 1);
> writeln( uniform(0.0, 1.0) );
> writeln( uniform(0.0, 1.0) );
> writeln( nrng() );
> writeln( nrng() );
>
> to
>
> auto nrng = Normal!(real, Engine2)(0, 1);
> writeln( uniform(0.0, 1.0) );
> writeln( uniform(0.0, 1.0) );
> writeln( nrng() );
> writeln( nrng() );
>
> ... then I would expect to see different results from the
> normal RNG but identical results from uniform(). If the
> constructor of the normal engine calls the RNG, the uniform()
> results will change, no?
I was only talking about the part of initialization that doesn't
use a RNG. I agree that everything that uses a RNG should be done
in opCall (or inside a normal() function in the function
interface). For Box-Muller, I think the approach you currently
use in NormalBoxMullerEngine is the most reasonable one.
But a Ziggurat engines needs to compute some tables before it can
start generating samples. It doesn't need a RNG to do that and
the tables do not change after initialization.
I think it's obvious that that initialization that doesn't need a
RNG should be done in Normal's constructor for the struct
interface. What is not so obvious is where the initialization of
static data that doesn't require a RNG should be done for
function interface. That initialization can be done in normal()
function, or it can be done in a static constructor. If you do it
in normal(), you need to do an extra check on each call to
normal(). This isn't really a problem as long as the struct's
opCall and the version of normal() that takes the engine as a
parameter don't do such redundant checks. Then the users that
care about the difference in performance can just use one of
these interfaces.
> Yes, a random.d test suite probably should be another project.
>
> Regardless of tests, let's focus for now on getting the API
> right for this case of
> non-uniform-random-number-generator-with-internal-engine, with
> normal and exponential as the initial cases.
In general I like the API in file normal.d attached to your
original post. I think the engines should have an option to do
some initialization in Normal's constructor, though. We could
achieve that by calling _engine.initialize in Normal's
constructors, if such method exists. This method would also need
to be called on the static instance of normal engine used in the
normal() function. We could add something like this to the first
version of normal:
static if(is(typeof(engine.initialize())))
{
static bool isInitialized;
if(!isInitialized)
engine.initialize();
}
Another option would be to do this:
struct GlobalEngine(Engine)
{
Engine instance;
static this()
{
instance.initialize();
}
}
And then inside the version of normal that doesn't take an engine
as a parameter:
alias GlobalEngine!NormalRandomNumberEngine E;
return normal(mean, sigma, E.instance, urng);
The users would need to construct their own instance of engine to
use the function that takes engine as a parameter. So it would
make sense to add helper functions for creating engine instances.
There's one change that I think would make the API more
convenient. Normal struct and the engine don't store an instance
of a RNG , so they don't need to take it as a template parameter.
We could make opCall methods templates instead. That way the
users would never need to explicitly specify the type of the RNG.
More information about the Digitalmars-d
mailing list