foreach (i; taskPool.parallel(0..2_000_000)

Mon Apr 3 22:24:18 UTC 2023

On 4/3/23 6:02 PM, Paul wrote:
> On Sunday, 2 April 2023 at 15:32:05 UTC, Steven Schveighoffer wrote:
> 
>>
>> It's important to note that parallel doesn't iterate the range in 
>> parallel, it just runs the body in parallel limited by your CPU count.
> **?!?**

So for example, if you have:

```d
foreach(i; iota(0, 2_000_000).parallel)
{
    runExpensiveTask(i);
}
```

The `foreach` itself runs on the main thread: it gets a `0` and hands 
`runExpensiveTask(0)` off to a task thread, then gets a `1` and hands 
`runExpensiveTask(1)` off to a task thread, etc. The iteration is not 
expensive, and is not done in parallel.
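
If you want to see the loop body being spread over several threads, here is a rough sketch. `busyWork` and `threadsUsed` are hypothetical names I made up for illustration (`busyWork` stands in for `runExpensiveTask`); `taskPool.workerIndex` and `taskPool.size` are from `std.parallelism`. I'm not claiming any particular output, since it depends on your machine and on how the work gets scheduled.

```d
import std.algorithm : count;
import std.parallelism : parallel, taskPool;
import std.range : iota;
import std.stdio : writefln;

// Hypothetical stand-in for runExpensiveTask: pure arithmetic, no locks.
ulong busyWork(size_t i)
{
    ulong acc = i;
    foreach (k; 0 .. 10_000)
        acc = acc * 6364136223846793005UL + 1442695040888963407UL;
    return acc;
}

// Returns how many distinct threads executed at least one iteration.
size_t threadsUsed(int n)
{
    // Slot 0 is any thread outside the pool (e.g. main); 1 .. size are workers.
    auto seen = new bool[](taskPool.size + 1);
    auto sink = new ulong[](n); // keeps the results so the work is not discarded

    foreach (i; iota(n).parallel)
    {
        sink[i] = busyWork(i);
        seen[taskPool.workerIndex] = true; // each thread writes only its own slot
    }
    return seen.count(true);
}

void main()
{
    writefln("%s thread(s) ran the loop body", threadsUsed(10_000));
}
```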

On the other hand, what you *shouldn't* do is:

```d
foreach(i; iota(0, 2_000_000).map!(x => runExpensiveTask(x)).parallel)
{
}
```

as this makes the expensive task part of the iteration itself, so it 
runs *before* anything is handed off to a task thread.

> 
>> If your `foreach` body takes a global lock (like `writeln(i);`), then 
>> it's not going to run any faster (probably slower actually).
> **Ok I did have some debug writelns I commented out.**

And did it help? Another thing that takes a global lock is memory 
allocation.
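
As a minimal sketch of keeping locks and allocation out of the loop body (assuming your results fit in an array that you can allocate up front): `computeAll` is a hypothetical helper name, and `runExpensiveTask` here is just a made-up arithmetic placeholder, not your real work.

```d
import std.parallelism : parallel;
import std.range : iota;

// Hypothetical placeholder for the real work: no writeln, no allocation.
ulong runExpensiveTask(size_t i)
{
    ulong acc = i;
    foreach (k; 0 .. 1_000)
        acc = acc * 6364136223846793005UL + 1442695040888963407UL;
    return acc;
}

ulong[] computeAll(int n)
{
    auto results = new ulong[](n); // allocate once, before the parallel loop

    foreach (i; iota(n).parallel)
    {
        // Each iteration writes only its own slot, so no lock is needed.
        results[i] = runExpensiveTask(i);
    }
    return results;
}

void main()
{
    auto results = computeAll(100_000);
    assert(results.length == 100_000);
}
```

Any printing can then happen after the loop, from a single thread.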

>> Also make sure you have more than one logical CPU.
> **I have 8.**

It depends on the work being done, but you should see roughly an 8x 
speedup as long as the overhead of distributing tasks is not significant 
compared to the work itself.
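
A quick way to check this on your machine is to time the same loop serially and in parallel. This is only a sketch: `work` is a hypothetical placeholder for your loop body, and the numbers you get will depend on your hardware and on how heavy the real work is.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.parallelism : parallel, totalCPUs;
import std.range : iota;
import std.stdio : writefln;

// Hypothetical placeholder workload: pure arithmetic, no locks, no allocation.
ulong work(size_t i)
{
    ulong acc = i;
    foreach (k; 0 .. 5_000)
        acc = acc * 6364136223846793005UL + 1442695040888963407UL;
    return acc;
}

void main()
{
    enum n = 200_000;
    auto results = new ulong[](n);

    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. n)
        results[i] = work(i);           // serial baseline
    auto serial = sw.peek;

    sw.reset();                         // keeps running, restarts from zero
    foreach (i; iota(n).parallel)
        results[i] = work(i);           // same body, run on the task pool
    auto par = sw.peek;

    writefln("logical CPUs: %s", totalCPUs);
    writefln("serial: %s, parallel: %s", serial, par);
}
```

If the two times come out close, the per-iteration work is probably too small relative to the hand-off overhead, or something in the body is taking a lock.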

-Steve