Faster uniform() in [0.0 - 1.0(

tn no at email.invalid
Tue Nov 23 04:12:00 PST 2010


Fawzi Mohamed Wrote:

> 
> On 23-nov-10, at 10:20, tn wrote:
> 
> > bearophile Wrote:
> >
> >> Don:
> >>
> >>> Since the probability of actually generating a
> >>> zero is 1e-4000, it shouldn't affect the speed at all <g>.
> >>
> >> If bits in double have the same probability then I think there is a  
> >> much higher probability to hit a zero, about 1 in 2^^63, and I'm  
> >> not counting NaNs (but it's low enough to not change the substance  
> >> of what you have said).
> >
> > For uniform distribution different bit combinations should have  
> > different probabilities because floating point numbers have more  
> > representable values close to zero. So for doubles the probability  
> > should be about 1e-300 and for reals about 1e-4900.
> >
> > But because uniform by default seems to use a 32 bit integer random  
> > number generator, the probability is actually 2^^-32. And that is  
> > actually verified: I generated 10 * 2^^32 samples of  
> > uniform!"[]"(0.0, 1.0) and got 16 zeros which is close enough to  
> > expected 10.
> >
> > Of course 2^^-32 is still small enough to have no performance  
> > penalty in practice.
> >
> > -- tn
> 
> that is the reason I used a better generation algorithm in blip (and  
> tango) that guarantees the correct distribution, at the cost of being  
> slightly more costly, but then the basic generator is cheaper, and if  
> one needs maximum speed one can even use a cheaper source (from the  
> CMWC family) that still seems to pass all statistical tests.

A similar method would probably also be nice in Phobos if the speed is almost the same.

> The way I use to generate uniform numbers was shown to be better (and  
> detectably so) in the case of floats, when looking at the tails of  
> normal and other distributions generated from uniform numbers.
> This is very relevant in some cases (for example if you are interested  
> in the probability of catastrophic events).
> 
> Fawzi

Just using 64-bit integers as the source would be enough for almost(?) all cases. At current speeds it would take thousands of years for one modern computer to generate so many random numbers that a finer resolution would be justified. (And if one wants to measure the probability of sufficiently rare events, one should use more advanced methods such as importance sampling.)
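
To make the 64-bit idea concrete, here is a minimal sketch, not what Phobos or blip actually does. The helper name uniform01From64 is made up for illustration. It assumes the usual trick of keeping the top 53 bits of a 64-bit integer (a double has a 53-bit significand) and scaling by 2^^-53:

```d
import std.random : Random, uniform, unpredictableSeed;
import std.stdio : writefln;

/// Hypothetical helper, not part of Phobos: maps 64 random bits to a
/// double in [0.0, 1.0) by keeping the top 53 bits.
double uniform01From64(ulong bits) pure nothrow @nogc @safe
{
    // A double has a 53-bit significand, so (bits >> 11) converts exactly.
    // 0x1p-53 is the hex-float literal for 2^^-53.
    return (bits >> 11) * 0x1p-53;
}

void main()
{
    auto rng = Random(unpredictableSeed);
    foreach (i; 0 .. 5)
    {
        // uniform!ulong draws a full 64-bit value from the generator.
        immutable ulong bits = uniform!ulong(rng);
        writefln("%.17f", uniform01From64(bits));
    }
}
```

With this sketch every result is a multiple of 2^^-53, the largest being 1 - 2^^-53, so an exact zero should turn up with probability 2^^-53 rather than the 2^^-32 seen with the 32-bit source. It does not try to give the denser coverage near zero discussed above.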

-- tn


More information about the Digitalmars-d mailing list