Faster uniform() in [0.0 - 1.0(

Tue Nov 23 05:04:38 PST 2010

On 23-nov-10, at 13:12, tn wrote:

> Fawzi Mohamed Wrote:
>
>>
>> On 23-nov-10, at 10:20, tn wrote:
>>
>>> bearophile Wrote:
>>>
>>>> Don:
>>>>
>>>>> Since the probability of actually generating a
>>>>> zero is 1e-4000, it shouldn't affect the speed at all <g>.
>>>>
>>>> If bits in double have the same probability then I think there is a
>>>> much higher probability to hit a zero, about 1 in 2^^63, and I'm
>>>> not counting NaNs (but it's low enough to not change the substance
>>>> of what you have said).
>>>
>>> For uniform distribution different bit combinations should have
>>> different probabilities because floating point numbers have more
>>> representable values close to zero. So for doubles the probability
>>> should be about 1e-300 and for reals about 1e-4900.
>>>
>>> But because uniform by default seems to use a 32 bit integer random
>>> number generator, the probability is actually 2^^-32. And that is
>>> actually verified: I generated 10 * 2^^32 samples of
>>> uniform!"[]"(0.0, 1.0) and got 16 zeros which is close enough to
>>> expected 10.
>>>
>>> Of course 2^^-32 is still small enough to have no performance
>>> penalty in practise.
>>>
>>> -- tn
>>
>> that is the reason I used a better generation algorithm in blip (and
>> tango) that guarantees the correct distribution, at the cost of being
>> slightly more costly, but then the basic generator is cheaper, and if
>> one needs maximum speed one can even use a cheaper source (from the
>> CMWC family) that still seems to pass all statistical tests.
>
> A similar method would probably also be nice in Phobos if the speed is
> almost the same.

Yes, I was thinking of porting my code to D2, but if someone else
wants to do it...
Please note that for double the speed will *not* be the same, because
my algorithm always tries to guarantee that all bits of the mantissa are
random, and with 52 bits (double) or 63 bits (real) this cannot be done
with a single 32-bit random number.
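
To illustrate why a single 32-bit draw cannot fill a double, here is a
minimal Python sketch of the common fixed-grid construction that combines
two 32-bit words into 53 random bits. It is not the blip/tango algorithm
(which additionally keeps the mantissa fully random for small values); the
names uniform53 and next_u32 are hypothetical and chosen only for this
example.

```python
import random


def uniform53(next_u32):
    """Return a double in [0.0, 1.0) built from two 32-bit draws.

    next_u32 is a hypothetical callable returning an int in [0, 2**32).
    """
    hi = next_u32() >> 5   # keep the top 27 bits of the first draw
    lo = next_u32() >> 6   # keep the top 26 bits of the second draw
    # 27 + 26 = 53 bits: one draw alone supplies at most 32 of them.
    # (hi * 2**26 + lo) is an integer in [0, 2**53); dividing by 2**53
    # is exact, so the result lies on a grid with spacing 2**-53.
    return (hi * 67108864 + lo) / 9007199254740992.0


if __name__ == "__main__":
    # Python's own generator stands in for the 32-bit source here.
    print(uniform53(lambda: random.getrandbits(32)))
```

With this scheme an exact 0.0 needs all 53 kept bits to be zero, so it is
far rarer than with a single 32-bit draw, at the cost of a second call to
the source.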

>> The way I use to generate uniform numbers was shown to be better (and
>> detectably so) in the case of floats, when looking at the tails of
>> normal and other distributions generated from uniform numbers.
>> This is very relevant in some cases (for example if you are interested
>> in the probability of catastrophic events).
>>
>> Fawzi
>
> Just using 64-bit integers as the source would be enough for almost(?)
> all cases. At the current speed it would take thousands of years for
> one modern computer to generate so many random numbers that better
> resolution would be justified. (And if one wants to measure the
> probability of rare enough events, one should use more advanced methods
> like importance sampling.)

I thought about using a 64-bit source directly, but the generators I
know were written to produce 32 bits at a time.
One could probably modify CMWC to work natively with 64 bits, but it
would have to be done carefully.
So I simply decided to stick with 32 bits and generate two numbers when
needed.
Note that my default sources are faster than the Mersenne Twister (the
one used in Phobos). I especially like CMWC (but the default combines it
with KISS for extra safety).
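
For readers who have not seen a CMWC (complementary multiply-with-carry)
generator, here is a minimal Python sketch of one 32-bit step. It assumes
the CMWC4096 parameters published by Marsaglia (lag 4096, multiplier 18782,
default carry 362436); the blip/tango sources may well use different
parameters, and the combination with KISS is not shown. The class name and
the seeding through Python's random module are arbitrary choices for
illustration.

```python
import random

MASK32 = 0xFFFFFFFF


class Cmwc4096:
    """Sketch of a CMWC generator producing 32 bits per step."""

    def __init__(self, seed_words, carry=362436):
        if len(seed_words) != 4096:
            raise ValueError("expected 4096 seed words")
        self.q = [w & MASK32 for w in seed_words]  # lag table
        self.c = carry                             # current carry
        self.i = 4095                              # index of last output

    def next_u32(self):
        self.i = (self.i + 1) & 4095
        t = 18782 * self.q[self.i] + self.c   # multiply-with-carry step
        self.c = t >> 32                      # new carry: high part of t
        x = (t + self.c) & MASK32             # low 32 bits of t plus carry
        if x < self.c:                        # the 32-bit addition wrapped
            x = (x + 1) & MASK32
            self.c += 1
        # "Complementary": store and return (2**32 - 2) - x.
        self.q[self.i] = (0xFFFFFFFE - x) & MASK32
        return self.q[self.i]


if __name__ == "__main__":
    rng = random.Random(12345)  # arbitrary seeding, for illustration only
    gen = Cmwc4096([rng.getrandbits(32) for _ in range(4096)])
    print(gen.next_u32(), gen.next_u32())
```

Each step costs one multiplication, a shift and a few additions, which is
why such a source is cheap; two consecutive calls give the 64 bits needed
for a double.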

> -- tn