Parallel Rogue-like benchmark

Sat Nov 9 20:31:19 PST 2013

I imagine (although I haven't checked) that std.random.Xorshift32 
uses the algorithm:

         seed ^= seed << 13;
         seed ^= seed >> 17;
         seed ^= seed << 5;
         return seed;

while the levgen benchmarks use the algorithm:

         seed += seed;
         seed ^= (seed > int.max) ? 0x88888eee : 1;
         return seed;

The former produces better random numbers, but it may also be 
slower.
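
For anyone who wants to compare the two side by side, here is a minimal sketch of both update steps in Go. It assumes a 32-bit unsigned seed and takes the shift constants straight from the snippets above; as I said, I haven't checked them against the actual std.random source. The function names xorshift32 and levgenStep are mine, not from either code base.

```go
package main

import (
	"fmt"
	"math"
)

// xorshift32 applies the 13/17/5 shift triple quoted above.
// Assumption: 32-bit unsigned state. A zero seed stays zero forever.
func xorshift32(seed uint32) uint32 {
	seed ^= seed << 13
	seed ^= seed >> 17
	seed ^= seed << 5
	return seed
}

// levgenStep mirrors the levgen snippet quoted above.
func levgenStep(seed uint32) uint32 {
	seed += seed // doubling wraps modulo 2^32
	if seed > math.MaxInt32 {
		seed ^= 0x88888eee
	} else {
		seed ^= 1
	}
	return seed
}

func main() {
	a, b := uint32(1), uint32(1)
	for i := 0; i < 3; i++ {
		a, b = xorshift32(a), levgenStep(b)
		fmt.Println(a, b)
	}
}
```

The xorshift step is three shifts and three xors, while the levgen step is an add, a compare and one xor, so any speed difference per call should be small next to the cost of a call that isn't inlined.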

Lack of inlining would definitely make a huge difference. I wrote 
an assembly function for Go that was an exact copy of the 
assembly LLVM generates at -O3, and it was no faster than the 
native Go function, even though the assembly itself was much 
better (assembly functions aren't inlined in Go). Changing the 
assembly function to generate and return two random numbers per 
call, however, increased the overall program speed by around 10%, 
which highlights the cost of function calls that can't be inlined.
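
To show the shape of the "two numbers per call" change, here is a rough sketch in plain Go rather than assembly. It is not the code I benchmarked: the rng type and the step, next and nextTwo names are made up for illustration, and the update step is just the xorshift one quoted earlier in this post. In plain Go the compiler will most likely inline small functions like these anyway, so the batching only pays off when the call genuinely can't be inlined, as with an assembly function.

```go
package main

import "fmt"

// rng is a hypothetical generator holding a 32-bit state.
type rng struct{ seed uint32 }

// step is the xorshift update quoted earlier in this post.
func step(s uint32) uint32 {
	s ^= s << 13
	s ^= s >> 17
	s ^= s << 5
	return s
}

// next advances the state once: one call per random number.
func (r *rng) next() uint32 {
	r.seed = step(r.seed)
	return r.seed
}

// nextTwo advances the state twice and returns both values,
// so the call overhead is paid once per two random numbers.
func (r *rng) nextTwo() (uint32, uint32) {
	a := step(r.seed)
	b := step(a)
	r.seed = b
	return a, b
}

func main() {
	r := rng{seed: 1}
	a, b := r.nextTwo()
	fmt.Println(a, b)
}
```

Both methods produce the same sequence; the only difference is how many calls it takes to get it.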

On Saturday, 9 November 2013 at 12:23:25 UTC, bearophile wrote:
> Joseph Rushton Wakeling:
>
>> How does the speed of that code change if instead of the 
>> Random struct, you use std.random.Xorshift32 ... ?
>
> That change of yours was well studied in the first blog post 
> (the serial one) and the performance loss of using Xorshift32 
> was significant, even with LDC2. I don't know why.