std.parallelism curious results

Sativa via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Sun Oct 5 15:47:45 PDT 2014


On Sunday, 5 October 2014 at 21:53:23 UTC, Ali Çehreli wrote:
> On 10/05/2014 02:40 PM, Sativa wrote:
>
> >      foreach(i; thds) { ulong s = 0; for(ulong k = 0; k < iter/numThreads; k++)
>
> The for loop condition is executed at every iteration and
> division is an expensive operation. Apparently, the compiler
> does some optimization when the divisor is known at compile
> time.
>
> Being 4, it is just a shift of 2 bits. Try something like 5, it 
> is slow even for enum.
>
> This solves the problem:
>
>         const end = iter/numThreads;
>
>         for(ulong k = 0; k < end; k++) {
>
> Ali

Yes, it is a common problem when the loop bound is computed in
the for loop condition. Most of the time the bound is constant for
the whole loop, but the compiler still computes it every iteration.
When the loop body is cheap, such as a simple sum, the division
becomes expensive because its cost is comparable to the work done
inside the loop.
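
To make the difference concrete, here is a minimal sketch of the two
loop shapes. The function names sumInline and sumHoisted are made up
for illustration and are not from the original program; whether the
first form really costs a division per iteration depends on the
compiler and its optimization flags.

```d
import std.stdio : writeln;

// Bound recomputed in the loop condition: with a runtime divisor this
// may cost one division per iteration unless the optimizer hoists it.
ulong sumInline(ulong iter, ulong numThreads)
{
    ulong s = 0;
    for (ulong k = 0; k < iter / numThreads; k++)
        s += k;
    return s;
}

// Bound computed once before the loop, as Ali suggested.
ulong sumHoisted(ulong iter, ulong numThreads)
{
    const end = iter / numThreads;
    ulong s = 0;
    for (ulong k = 0; k < end; k++)
        s += k;
    return s;
}

void main()
{
    // Both variants compute the same sum; only the loop overhead differs.
    assert(sumInline(1_000, 4) == sumHoisted(1_000, 4));
    writeln(sumHoisted(1_000, 4));
}
```

Timing both with a divisor such as 5 rather than 4 should show the
gap most clearly, since 4 can be reduced to a shift.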

It's surprising just how slow it makes it, though. One can't
really make numThreads a compile-time constant in the real world,
as that wouldn't be optimal (unless one had a version for each
possible number of threads).
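
For what it's worth, "a version for each number of threads" could
look roughly like the sketch below. chunkSum and dispatch are
hypothetical names, and the set of specialized thread counts is an
arbitrary assumption. It only illustrates the idea; hoisting the
bound is far simpler.

```d
// Hypothetical helper: N is a compile-time divisor, so the compiler is
// free to replace iter / N with cheaper operations.
ulong chunkSum(ulong N)(ulong iter)
{
    ulong s = 0;
    for (ulong k = 0; k < iter / N; k++)
        s += k;
    return s;
}

// Picks a specialized instance at run time, one per supported count.
ulong dispatch(ulong iter, ulong numThreads)
{
    switch (numThreads)
    {
        case 1: return chunkSum!1(iter);
        case 2: return chunkSum!2(iter);
        case 4: return chunkSum!4(iter);
        case 8: return chunkSum!8(iter);
        default:
            // Fallback for counts without a specialization:
            // compute the bound once with a runtime division.
            const end = iter / numThreads;
            ulong s = 0;
            for (ulong k = 0; k < end; k++)
                s += k;
            return s;
    }
}

void main()
{
    // A specialized count and a fallback count behave the same way.
    assert(dispatch(1_000, 4) == chunkSum!4(1_000));
    assert(dispatch(1_000, 5) == chunkSum!5(1_000));
}
```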

Obviously one can just move the computation outside the loop. I 
would expect better results if the loops actually did some real 
work.



