parallel is slower than serial

Yura yuriy.min at gmail.com
Tue Oct 18 16:58:51 UTC 2022


Thank you, folks, for your hints and suggestions!

Indeed, I rewrote the code, and it is now substantially faster and 
parallelizes well.

Instead of parallelizing only the inner loop, I parallelized both 
loops at once. To do that I had to convert the 2d index into a 1d 
index and then back to 2d. Essentially, I calculate each element Aij 
of the matrix and store everything in a 1d array.
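For anyone reading this later, here is a minimal sketch of that flattening. It is not my actual code: `element` and `fillFlat` are hypothetical stand-ins for the real per-element arithmetic, and the row-major layout `k = i * cols + j` is just one possible choice.

```D
import std.parallelism : parallel;
import std.range : iota;

// Hypothetical placeholder for the real computation of Aij.
double element(size_t i, size_t j)
{
    return cast(double)(i * 10 + j);
}

// Fills a rows x cols matrix stored as a flat 1d array.
double[] fillFlat(size_t rows, size_t cols)
{
    // Preallocate once, so nothing is appended inside the loop.
    auto A = new double[rows * cols];

    // A single parallel loop over the flattened index k.
    foreach (k; parallel(iota(A.length)))
    {
        immutable i = k / cols; // back to the 2d row index
        immutable j = k % cols; // back to the 2d column index
        A[k] = element(i, j);   // every k writes its own slot
    }
    return A;
}

void main()
{
    auto A = fillFlat(4, 3);
    assert(A.length == 12);
    assert(A[1 * 3 + 2] == element(1, 2));
}
```

Because each iteration writes to a distinct index of a preallocated array, the iterations do not depend on each other and no locking is needed.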

And yes, A = A ~ Aij was very slow; to avoid it I used the 2d 
-> 1d mapping. I will check your solution as well, as I like it 
too.
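To compare the appending variants side by side, here is a small sketch. The function names are made up for illustration, and the `Appender` variant is an extra option from `std.array` that was not part of the suggestion quoted below. All three produce the same array; only the amount of reallocation differs.

```D
import std.array : appender;

// Slow: `A = A ~ x` builds a brand-new array on every iteration.
double[] concatSlow(double[] src)
{
    double[] A;
    foreach (x; src)
        A = A ~ x;
    return A;
}

// Faster: `~=` appends in place and reallocates only when needed.
// Appending the whole slice replaces the loop entirely.
double[] concatFast(double[] src)
{
    double[] A;
    A ~= src;
    return A;
}

// Alternative: Appender with the capacity reserved up front.
double[] concatAppender(double[] src)
{
    auto app = appender!(double[])();
    app.reserve(src.length);
    app.put(src);
    return app.data;
}

void main()
{
    double[] Ai = [1.0, 2.0, 3.0];
    assert(concatSlow(Ai) == concatFast(Ai));
    assert(concatAppender(Ai) == concatFast(Ai));
}
```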

The more I use the D Language, the more I like it.

On Tuesday, 18 October 2022 at 16:07:22 UTC, Siarhei Siamashka 
wrote:
> On Tuesday, 18 October 2022 at 11:56:30 UTC, Yura wrote:
>> ```D
>> // Then for each Sphere, i.e. dot[i]
>> // I need to do some arithmetics with itself and other dots
>> // I have only parallelized the inner loop, i is fixed.
>> ```
>
> It's usually a much better idea to parallelize the outer loop. 
> Even OpenMP tutorials explain this: 
> https://ppc.cs.aalto.fi/ch3/nested/ (check the "collapse it 
> into one loop" suggestion from it).
>
>> ```D
>> for (auto j=0;j<Ai.length;j++) {
>>   A = A ~ Ai[j];
>> }
>> ```
>
> This way of appending to an array is very slow and `A ~= 
> Ai[j];` is much faster. And even better would be `A ~= Ai;` 
> instead of the whole loop.



