WTF! Parallel foreach more slower that normal foreach in multicore CPU ?
Ali Çehreli
acehreli at yahoo.com
Fri Jun 24 00:40:27 PDT 2011
On Thu, 23 Jun 2011 23:18:36 +0000, Zardoz wrote:
> Code :
> auto logs = new double[200];
> const num = 2;
> clock_t clk;
> double norm;
> double par;
> writeln("CPUs : ",totalCPUs );
> clk = clock();
> foreach(i, ref elem; logs) {
>     elem = log(i + 1.0);
> }
> norm = clock() - clk;
> clk = clock();
> foreach(i, ref elem; taskPool.parallel(logs, 100)) {
>     elem = log(i + 1.0);
> }
>
> I get the same problem. Parallel foreach is slower than the normal
> foreach. And it's the same code as the library example, which claims that
> parallel foreach does it in approx. half the time on an Athlon X2
I was able to reproduce your results. I think there is a problem with
clock(). Try StopWatch:
import std.parallelism;
import std.stdio;
import std.math;
import std.datetime;

void main()
{
    auto logs = new double[200_000_000];
    writeln("CPUs : ",totalCPUs );

    {
        StopWatch stopWatch;
        stopWatch.start();
        foreach(i, ref elem; logs) {
            elem = log(i + 1.0);
        }
        writeln(stopWatch.peek().msecs);
    }

    {
        StopWatch stopWatch;
        stopWatch.start();
        foreach(i, ref elem; parallel(logs)) {
            elem = log(i + 1.0);
        }
        writeln(stopWatch.peek().msecs);
    }
}
Here is my output:
CPUs : 4
8061
2686
I get similar results whether I pass 100_000 to parallel() or not.
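
A likely explanation for the clock() numbers: C's clock() reports processor time used by the program, and on POSIX systems that is typically summed over all threads, so work spread across several cores can look no faster (or even slower) than the serial loop although it finishes sooner. The sketch below times both loops with both timers so the two figures can be compared side by side. It assumes a recent Phobos, where StopWatch lives in std.datetime.stopwatch and peek() returns a Duration. The helper names fillSerial, fillParallel, and measure are hypothetical, and the array is smaller (20_000_000 elements) only to keep the run short.

```d
import core.stdc.time : clock, CLOCKS_PER_SEC;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.math : log;
import std.parallelism : parallel, totalCPUs;
import std.stdio : writefln, writeln;

// Hypothetical helper: fills the array serially, logs[i] = log(i + 1).
void fillSerial(double[] logs)
{
    foreach (i, ref elem; logs)
        elem = log(i + 1.0);
}

// Hypothetical helper: same work, split across the default task pool.
void fillParallel(double[] logs)
{
    foreach (i, ref elem; parallel(logs))
        elem = log(i + 1.0);
}

// Hypothetical helper: runs `work` once and prints both timers in ms.
void measure(string label, void delegate() work)
{
    auto sw = StopWatch(AutoStart.yes);   // wall-clock time
    immutable cpuStart = clock();         // processor time from the C library
    work();
    immutable cpuTicks = clock() - cpuStart;
    sw.stop();

    writefln("%-8s wall: %6d ms   clock(): %6d ms",
             label,
             sw.peek.total!"msecs",
             cpuTicks * 1000 / CLOCKS_PER_SEC);
}

void main()
{
    auto logs = new double[20_000_000];
    writeln("CPUs : ", totalCPUs);

    measure("serial", { fillSerial(logs); });
    measure("parallel", { fillParallel(logs); });
}
```

What clock() measures is platform-dependent, so the exact relationship between the two columns may differ from one system to another. Where it counts processor time across all threads, the wall figure for the parallel loop should drop while the clock() figure stays near or above the serial one.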
Ali