parallel is slower than serial
Yura
yuriy.min at gmail.com
Tue Oct 18 16:58:51 UTC 2022
Thank you, folks, for your hints and suggestions!
Indeed, I rewrote the code, and it is now substantially faster and
parallelizes well.
Instead of parallelizing only the inner loop, I parallelized both
loops at once. For that I had to convert the 2d index into a 1d
index, and then back to 2d. Essentially, I calculate each element
Aij of the matrix and put everything into a 1d array.
And yes, A = A ~ Aij was very slow; to avoid it I had to use the 2d
-> 1d mapping. I will check your solution as well, as I like it
too.
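
A minimal sketch of the 2d -> 1d mapping described above, not the actual
code from this thread. It assumes a row-major layout `k = i * m + j`, and
`computeElement` and `buildMatrix` are hypothetical names standing in for
the real per-element arithmetic. The array is allocated once up front, so
nothing is appended inside the loop, and each iteration writes only to
its own slot.

```D
import std.parallelism : parallel;
import std.range : iota;

// Hypothetical stand-in for the real arithmetic between dots i and j.
double computeElement(size_t i, size_t j)
{
    return cast(double)(i * 10 + j);
}

// Fill an n x m matrix stored as one flat, preallocated array.
double[] buildMatrix(size_t n, size_t m)
{
    auto A = new double[](n * m); // allocated once; no `~` inside the loop

    // A single parallel loop over the collapsed index k = i * m + j.
    foreach (k; parallel(iota(n * m)))
    {
        immutable i = k / m; // recover the 2d row index
        immutable j = k % m; // recover the 2d column index
        A[k] = computeElement(i, j); // each k writes only its own element
    }
    return A;
}

void main()
{
    import std.stdio : writeln;

    auto A = buildMatrix(3, 4);
    writeln(A[1 * 4 + 2]); // element (i = 1, j = 2) of the 3 x 4 matrix
}
```

Collapsing both loops gives the task pool `n * m` independent work items
instead of `m` items per outer iteration, which tends to balance the load
better when the inner loop is short.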
The more I use the D Language, the more I like it.
On Tuesday, 18 October 2022 at 16:07:22 UTC, Siarhei Siamashka
wrote:
> On Tuesday, 18 October 2022 at 11:56:30 UTC, Yura wrote:
>> ```D
>> // Then for each Sphere, i.e. dot[i]
>> // I need to do some arithmetics with itself and other dots
>> // I have only parallelized the inner loop, i is fixed.
>> ```
>
> It's usually a much better idea to parallelize the outer loop.
> Even OpenMP tutorials explain this:
> https://ppc.cs.aalto.fi/ch3/nested/ (check the "collapse it
> into one loop" suggestion from it).
>
>> ```D
>> for (auto j=0;j<Ai.length;j++) {
>> A = A ~ Ai[j];
>> }
>> ```
>
> This way of appending to an array is very slow and `A ~=
> Ai[j];` is much faster. And even better would be `A ~= Ai;`
> instead of the whole loop.