[Bench!][Mir] +54%..+185% performance boost for Mersenne Twister.
Joseph Rushton Wakeling via Digitalmars-d
digitalmars-d at puremagic.com
Wed Dec 14 00:23:06 PST 2016
On Saturday, 26 November 2016 at 16:31:40 UTC, Ilya Yaroshenko
wrote:
> 1. Improve RNG generation performance by making code more
> friendly for CPU pipelining. Tempering (finalization)
> operations was mixed with internal payload update operations.
A note on this. The `opCall` (or, in the range version,
`popFront`) of Ilya's implementation mixes together two
superficially independent actions:
(1) calculating the current random variate from the current
index of the internal state array;
(2) updating the current index of the internal state array, and
moving to the next entry.
It's straightforward to split these two procedures into two
separate methods (or at least two clearly separated sequences
within the `opCall`), but doing so results in a notable
performance hit (on my machine, on the order of 1 GB/s fewer
random bits generated).
Intertwining these steps in this way is therefore a very smart
optimization (although TBH it feels a little worrying that it's
necessary).
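
To make the distinction concrete, here is a minimal sketch in D of the
two arrangements. It is not the Mir source: the struct name
`Mt19937Sketch`, the method names `nextSplit` and `nextFused`, and the
forward-indexed layout are all assumptions made for illustration. The
only thing it relies on from the discussion above is the idea that the
word to be tempered is cached from the previous call, so that tempering
it and updating the next state word have no data dependency on each
other. Both methods compute exactly the same sequence; they differ only
in how the statements are ordered in the source.

```d
import std.random : Mt19937;

// Hypothetical illustration only; names and layout are not taken from Mir.
struct Mt19937Sketch
{
    enum size_t n = 624;
    enum size_t m = 397;
    enum uint a = 0x9908b0df;
    enum uint upperMask = 0x80000000;
    enum uint lowerMask = 0x7fffffff;
    enum uint b = 0x9d2c5680;
    enum uint c = 0xefc60000;

    uint[n] data;   // internal state array
    size_t index;   // next entry of `data` to be updated
    uint z;         // most recently updated state word, not yet tempered

    this(uint seed)
    {
        data[0] = seed;
        foreach (i; 1 .. n)
            data[i] = 1812433253u * (data[i - 1] ^ (data[i - 1] >> 30)) + cast(uint) i;
        index = 0;
        z = update(); // prime the cache so the first call has a word to temper
    }

    // Step (2): update data[index] and move to the next entry.
    private uint update()
    {
        immutable next = (index + 1 == n) ? 0 : index + 1;
        immutable conj = (index + m >= n) ? index + m - n : index + m;
        immutable y = (data[index] & upperMask) | (data[next] & lowerMask);
        uint x = y >> 1;
        if (y & 1) x ^= a;
        immutable e = data[conj] ^ x;
        data[index] = e;
        index = next;
        return e;
    }

    // Step (1): tempering, turning a state word into the output variate.
    private static uint temper(uint v)
    {
        v ^= v >> 11;
        v ^= (v << 7) & b;
        v ^= (v << 15) & c;
        v ^= v >> 18;
        return v;
    }

    // Split form: the two steps as clearly separated sequences.
    uint nextSplit()
    {
        immutable result = temper(z);
        z = update();
        return result;
    }

    // Fused form: the same statements, interleaved by hand.
    uint nextFused()
    {
        uint v = z;
        immutable next = (index + 1 == n) ? 0 : index + 1;
        immutable conj = (index + m >= n) ? index + m - n : index + m;
        v ^= v >> 11;
        immutable q = data[index] & upperMask;
        immutable p = data[next] & lowerMask;
        v ^= (v << 7) & b;
        immutable y = q | p;
        uint x = y >> 1;
        v ^= (v << 15) & c;
        if (y & 1) x ^= a;
        immutable e = data[conj] ^ x;
        v ^= v >> 18;
        z = data[index] = e;
        index = next;
        return v;
    }
}

void main()
{
    auto split = Mt19937Sketch(5489);
    auto fused = Mt19937Sketch(5489);
    auto reference = Mt19937(5489);
    foreach (_; 0 .. 10_000)
    {
        immutable expected = reference.front;
        reference.popFront();
        // Both orderings must agree with Phobos' Mt19937.
        assert(split.nextSplit() == expected);
        assert(fused.nextFused() == expected);
    }
}
```

The sketch only demonstrates that the two orderings are equivalent in
output. Whether they differ in speed, and by how much, depends on the
compiler, its flags and the target CPU, so any timing comparison would
need to be measured rather than assumed.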