Scalability in std.parallelism

safety0ff safety0ff.dev at gmail.com
Mon Feb 24 02:17:11 PST 2014


On Saturday, 22 February 2014 at 16:21:21 UTC, Nordlöw wrote:
> In the following test code given below of std.parallelism I get 
> some interesting results:

Don't forget that "n.iota.map" returns a lazily evaluated 
range.
std.parallelism might have to convert the lazy range to a 
random-access range (i.e., an array) before it can schedule the 
work.

If I add ".array" after the map call (e.g. auto nums = 
n.iota.map!piTerm.array;), I get numbers closer to the ideal for 
test2.
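A minimal sketch of the idea (piTerm and the problem size are my 
assumptions, reconstructed from the thread; the original test code may 
differ):

```d
import std.algorithm : map;
import std.array : array;
import std.parallelism : taskPool;
import std.range : iota;

// Hypothetical term function: the k-th term of the Leibniz series
// for pi, assumed to resemble the one in the original benchmark.
double piTerm(int k)
{
    immutable i = 2 * k + 1;
    return (k & 1) ? -4.0 / i : 4.0 / i;
}

void main()
{
    enum n = 1_000_000;

    // Lazy: iota.map allocates no storage; each element is computed
    // on demand when the range is consumed.
    auto lazyNums = n.iota.map!piTerm;

    // Eager: .array materializes a double[], a plain random-access
    // range that taskPool.reduce can split into contiguous chunks.
    auto nums = n.iota.map!piTerm.array;

    auto pi = taskPool.reduce!"a + b"(nums);
    assert(pi > 3.1405 && pi < 3.1425);
}
```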

Now compare test1 and test2: test1 reduces doubles while test2 
reduces ints.

I believe the difference in speed-up comes from you having 
hyperthreads rather than truly independent cores. Hyperthreads 
contend for resources shared within a physical core (e.g., the 
cache and the FPU).

On my computer, forcing nums to be a range of doubles in test2 
makes its speed-up drop to approximately the same as test1's.
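One way to force that, sketched under the same assumptions as above 
(the variable names and problem size are mine, not the original 
test code):

```d
import std.algorithm : map;
import std.array : array;
import std.parallelism : taskPool;
import std.range : iota;

void main()
{
    enum n = 1_000_000;

    // test2 as described: an integer reduction. A long seed (0L)
    // avoids int overflow when summing 0 .. n-1.
    auto ints = n.iota.array;                          // int[]
    auto sumInts = taskPool.reduce!"a + b"(0L, ints);
    assert(sumInts == cast(long) n * (n - 1) / 2);

    // Same data pushed through the FPU: map each int to double.
    // If hyperthreads share one FPU per core, this reduction should
    // scale about as poorly as the double-based test1.
    auto doubles = n.iota.map!(x => cast(double) x).array;  // double[]
    auto sumDoubles = taskPool.reduce!"a + b"(0.0, doubles);
    assert(sumDoubles == cast(double)(cast(long) n * (n - 1) / 2));
}
```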

Regards.
