[Bench!][Mir] +54%..+185% performance boost for Mersenne Twister.

Tue Dec 13 15:18:26 PST 2016

On Tuesday, 13 December 2016 at 18:15:25 UTC, Joseph Rushton 
Wakeling wrote:
> On Saturday, 26 November 2016 at 20:13:36 UTC, Andrei 
> Alexandrescu wrote:
>> Congrats! Also thanks for using the Boost license which would 
>> allow backporting the improvements to Phobos. Who'd be up for 
>> it?
>
> I've finally found a moment to look into this (I'm at home 
> recovering from a seasonal virus, which ironically puts some 
> time and mental space in my hands).
>
> It makes for an interesting comparison.  The traditional 
> Mersenne Twister algorithm generates N random words in one go, 
> and then iterates over them, before generating another N random 
> words, and so on.  (Word == unsigned integer type.)
>
> By contrast Ilya's implementation just updates the single next 
> entry in the static array that stores the state of the engine.  
> This would seem to work out cheaper on average: perhaps because 
> the update-a-variate code stays hotter, perhaps because it 
> involves less branching and looping.  It certainly makes for a 
> simpler, easier to follow update.
>
> However, it might be a good idea to benchmark this in a context 
> where there are a lot of random variates being generated, but 
> where other things are also happening that might render the 
> update code less "hot" than a straightforward `foreach` loop 
> benchmark.  (Maybe try it out with a Monte Carlo simulation 
> ...?)
>  The tradeoff between many cheap updates and one expensive one, 
> versus all updates being a little more expensive but less on 
> average, might not be viable in some practical use-cases.
>
> The Phobos implementation also includes a check on whether the 
> generator is properly seeded, which is performed in every 
> `popFront()`, which has a reasonably significant impact.  
> Ilya's implementation can get away with not performing that 
> check because it uses `@disable this();` to remove the 
> possibility of initializing the generator without it being 
> properly seeded.  The impact of that check can possibly be 
> lessened by making it a boolean comparison rather than the 
> existing `size_t` equality comparison.
>
>   -- action item here: could we deprecate `this()` in Phobos as 
> a precursor to removing
>      the opportunity to instantiate a generator without 
> properly initializing it?
>
> I'm going to try to put together a range-based version to see 
> if this also makes any difference.  I'll post some benchmarks 
> of my own once that's done, and if all looks good I'll try to 
> put a Phobos PR together.
>
> A few questions -- not blockers, but nice to understand:
>
>   * @Ilya, is this implementation your own design, or do you 
> have a reference
>     for the rationale behind this revised implementation of MT?

My own.

>   * Is there a particular reason why the `index` variable is a 
> `Uint` (i.e.
>     the word type of the generator) rather than a `size_t`?  I 
> presume there's
>     a motivation, since the casting back and forth in `opCall` 
> would otherwise
>     seem inconvenient.

If I remember correctly it is used with Using, so I use the same 
type.

>   * Is there a motivation for the reverse iteration through the 
> RNG state array?

Yes, it composes add/sub with cmp operations.

Ilya