How ParallelForEach works internally

jerro a at a.com
Sat Dec 1 08:42:50 PST 2012


On Saturday, 1 December 2012 at 12:51:27 UTC, thedeemon wrote:
> On Saturday, 1 December 2012 at 11:36:16 UTC, Zardoz wrote:
>
>> The previous code should work better if I set "total" to be 
>> shared and hope that D shared vars now have the internal 
>> barriers working, or do I need to manually use semaphores?
>
> Probably core.atomic is the way to go. A semaphore is overkill.

The easiest and fastest way is probably using taskPool.reduce, 
like this:

import std.algorithm, std.math, std.parallelism, std.range, std.stdio;

void main()
{
    auto total = taskPool.reduce!"a+b"(
        iota(10_000_000).map!(a => log(a + 1.0)));
    writeln(total);
}

Functions in core.atomic use instructions with the lock prefix, 
which, according to http://www.agner.org/optimize/instruction_tables.pdf, 
"typically costs more than a hundred clock cycles", so 
calling them for every element will probably slow things down 
significantly. It's best to avoid accessing the same memory from 
multiple threads wherever possible.
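
To make the contrast concrete, here is a rough sketch of the two 
imperative alternatives to taskPool.reduce. It assumes a reasonably 
recent compiler and Phobos; the names atomicTotal, partial and 
localTotal are made up for illustration, and n is reduced to keep the 
atomic variant short. The first loop touches one shared variable on 
every iteration, the second gives each worker thread its own 
accumulator via taskPool.workerLocalStorage and combines them once at 
the end.

```d
import core.atomic : atomicLoad, atomicOp;
import std.math : log;
import std.parallelism : parallel, taskPool;
import std.range : iota;
import std.stdio : writeln;

void main()
{
    enum n = 1_000_000; // smaller than in the post, for illustration only

    // Variant 1: a single shared accumulator, updated atomically for
    // every element. All threads contend for the same memory location.
    shared double atomicTotal = 0.0;
    foreach (i; parallel(iota(n)))
        atomicOp!"+="(atomicTotal, log(i + 1.0));

    // Variant 2: one accumulator per worker thread, so the loop body
    // performs no shared writes at all.
    auto partial = taskPool.workerLocalStorage(0.0);
    foreach (i; parallel(iota(n)))
        partial.get += log(i + 1.0);

    // Combine the per-thread partial sums serially after the loop.
    double localTotal = 0.0;
    foreach (p; partial.toRange)
        localTotal += p;

    // The two sums may differ slightly in the last digits because
    // floating-point addition order is not fixed across threads.
    writeln(atomicLoad(atomicTotal), " ", localTotal);
}
```

I have not benchmarked this; how much slower the atomic variant is 
will depend on the machine and the number of threads.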

