std.parallelism curious results
Sativa via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Sun Oct 5 15:47:45 PDT 2014
On Sunday, 5 October 2014 at 21:53:23 UTC, Ali Çehreli wrote:
> On 10/05/2014 02:40 PM, Sativa wrote:
>
> > foreach(i; thds) { ulong s = 0; for(ulong k = 0; k <
> > iter/numThreads; k++)
>
> The for loop condition is executed at every iteration and
> division is an expensive operation. Apparently, the compiler
> does some optimization when the divisor is known at compile
> time.
>
> Being 4, it is just a shift of 2 bits. Try something like 5, it
> is slow even for enum.
>
> This solves the problem:
>
> const end = iter/numThreads;
>
> for(ulong k = 0; k < end; k++) {
>
> Ali
Yes, computing a bound inside a for loop's condition is a common
problem. Most of the time the bound is constant for the loop, but
the compiler recomputes it on every iteration. For a simple sum
(when the loop body does little work), that becomes expensive,
since the division costs about as much as the loop body itself.
It's surprising just how slow it makes it, though. One can't
really make numThreads a compile-time constant in the real world,
as that wouldn't be optimal (unless one had a version for each
possible number of threads).
Obviously one can just move the computation outside the loop. I
would expect better results if the loops actually did some real
work.
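For reference, here is a minimal sketch of the hoisted version with numThreads known only at run time. The helper name partialSum and the iteration count are made up for illustration, and totalCPUs from std.parallelism stands in for whatever thread count the real program would use:

```d
import std.parallelism : totalCPUs;
import std.stdio : writeln;

// Hypothetical helper: sums the integers 0 .. iter/numThreads - 1,
// mirroring the per-thread loop from the original post.
ulong partialSum(ulong iter, ulong numThreads)
{
    // The division happens once here instead of in every loop test.
    const end = iter / numThreads;

    ulong s = 0;
    for (ulong k = 0; k < end; k++)
        s += k;
    return s;
}

void main()
{
    // A run-time value, so the compiler cannot turn the division into a shift.
    immutable ulong numThreads = totalCPUs;
    writeln(partialSum(1_000_000, numThreads));
}
```

With the bound stored in end, it should no longer matter whether numThreads is an enum or a run-time value, since the division is off the hot path either way.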