WTF! Parallel foreach slower than normal foreach on a multicore CPU?
Robert Clipsham
robert at octarineparrot.com
Thu Jun 23 04:13:04 PDT 2011
On 23/06/2011 11:05, Zardoz wrote:
> I'm trying out std.parallelism, and I wrote this code (based on the parallel foreach example):
> import std.stdio;
> import std.parallelism;
> import std.math;
> import std.c.time;
>
> void main () {
>     auto logs = new double[20_000_000];
>     const num = 10;
>
>     clock_t clk;
>     double norm;
>     double par;
>
>     writeln("CPUs : ", totalCPUs);
>
>     clk = clock();
>     foreach (t; 0..num) {
>         foreach (i, ref elem; logs) {
>             elem = log(i + 1.0);
>         }
>     }
>     norm = clock() - clk;
>
>     clk = clock();
>     foreach (t; 0..num) {
>         foreach (i, ref elem; taskPool.parallel(logs, 100)) {
>             elem = log(i + 1.0);
>         }
>     }
>     par = clock() - clk;
>
>     norm = norm / num;
>     par = par / num;
>
>     writeln("Normal : ", norm / CLOCKS_PER_SEC, " Parallel : ", par / CLOCKS_PER_SEC);
> }
>
> I get this result:
>
> CPUs : 2
> Normal : 1.325 Parallel : 1.646
>
> The result changes every time I run it, by around ±100 ms (I think it depends on how busy the CPUs are at that moment).
>
> I tried changing workUnitSize from 1 to 10000000 without any appreciable change.
> My computer is an AMD Athlon 64 X2 Dual Core Processor 6000+ running Kubuntu 11.04 64-bit with 2 GiB of RAM. I compiled it with dmd 2.053.
> htop shows that while the test program runs the parallel foreach, both cores are at ~98% load, and with the normal foreach only one core reaches ~99% load.
The reason for this is that your workload is very small: it's likely
that the overhead from context switching and spawning threads outweighs
the gain in performance from running in parallel. Using parallel() in
foreach will only be faster if each iteration does something more
expensive.
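
One way to check whether overhead dominates is to time both loops by
wall clock with a larger work unit. The following is only a sketch, not
a definitive benchmark. It assumes a recent compiler where StopWatch
lives in std.datetime.stopwatch; the helper names (fillSerial,
fillParallel, wallMsecs) and the work unit size of 100_000 are made up
for illustration. Note that C's clock() reports processor time, which on
Linux is accumulated across all threads of the process, so it tends to
make a loop that keeps two cores busy look more expensive than it is in
elapsed time.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.math : log;
import std.parallelism : parallel;
import std.stdio : writefln;

// Hypothetical helper: fill the array serially, same loop body as the post.
void fillSerial(double[] logs)
{
    foreach (i, ref elem; logs)
        elem = log(i + 1.0);
}

// Hypothetical helper: fill the array through the default task pool.
// workUnitSize is the number of elements handed to a worker at a time.
void fillParallel(double[] logs, size_t workUnitSize)
{
    foreach (i, ref elem; parallel(logs, workUnitSize))
        elem = log(i + 1.0);
}

// Hypothetical helper: elapsed wall-clock milliseconds for one call of fn.
long wallMsecs(scope void delegate() fn)
{
    auto sw = StopWatch(AutoStart.yes);
    fn();
    return sw.peek.total!"msecs";
}

void main()
{
    auto logs = new double[](20_000_000);

    immutable serial = wallMsecs({ fillSerial(logs); });
    // 100_000 is an assumed value; try several sizes on your machine.
    immutable par = wallMsecs({ fillParallel(logs, 100_000); });

    writefln("Serial: %s ms, parallel: %s ms (wall clock)", serial, par);
}
```

If the parallel figure is still no better by wall clock, that points at
the per-iteration work being too cheap to pay for the dispatch.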
Also note that parallel() is a convenience function that forwards to
taskPool.parallel(), saving you a few characters :)
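
For illustration, the two spellings side by side. This is a minimal
sketch; the function names fillViaTaskPool and fillViaShorthand are
hypothetical, and the loop body is the one from your code.

```d
import std.math : log;
import std.parallelism : parallel, taskPool;

// Explicit form, as written in the original post.
void fillViaTaskPool(double[] logs)
{
    foreach (i, ref elem; taskPool.parallel(logs, 100))
        elem = log(i + 1.0);
}

// Shorthand: the free function parallel() uses the default task pool.
void fillViaShorthand(double[] logs)
{
    foreach (i, ref elem; parallel(logs, 100))
        elem = log(i + 1.0);
}

void main()
{
    auto a = new double[](1_000);
    auto b = new double[](1_000);
    fillViaTaskPool(a);
    fillViaShorthand(b);
    assert(a == b); // both forms compute the same values
}
```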
--
Robert
http://octarineparrot.com/