Performance issue with @fastmath and vectorization

deXtoRious via digitalmars-d-ldc digitalmars-d-ldc at puremagic.com
Sat Nov 12 02:20:23 PST 2016


On Saturday, 12 November 2016 at 03:30:47 UTC, rikki cattermole 
wrote:
> Just a thought but try this:
>
> void compute_neq(float[] neq,
>                  const float[] ux,
>                  const float[] uy,
>                  const float[] rho,
>                  const float[] ex,
>                  const float[] ey,
>                  const float[] w,
>                  const size_t N) @fastmath {
>     foreach(idx; 0 .. N*N) {
>         float usqr = ux[idx] * ux[idx] + uy[idx] * uy[idx];
>
>         foreach(q; 0 .. 9) {
>             float eu = 3.0f * (ex[q] * ux[idx] + ey[q] * uy[idx]);
>             float tmp = 1.0f + eu + 0.5f * eu * eu - 1.5f * usqr;
>             tmp *= w[q] * rho[idx];
>             neq[idx * 9 + q] = tmp;
>         }
>     }
> }
>
> It may not make any difference since it is semantically the 
> same but I thought at the very least rewriting it to be a bit 
> more idiomatic may help.

That's how I originally wrote the code. I then switched to 
C++-style loops for the comparison, to keep the two versions as 
identical as possible and to confirm that the loop style makes no 
difference. As expected, it doesn't.
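
For anyone who wants to check the equivalence themselves, here is a minimal sketch, not the exact benchmark code. The function names computeNeqFor and computeNeqForeach, the grid size, the input values and the lattice constants are all illustrative assumptions. @fastmath is left out so the sketch builds with any D compiler. It only checks that both loop styles produce the same results and says nothing about vectorization or timing.

```d
import std.stdio : writeln;

// C++-style index loops (hypothetical reconstruction for illustration).
void computeNeqFor(float[] neq, const float[] ux, const float[] uy,
                   const float[] rho, const float[] ex, const float[] ey,
                   const float[] w, const size_t N)
{
    for (size_t idx = 0; idx < N * N; ++idx) {
        float usqr = ux[idx] * ux[idx] + uy[idx] * uy[idx];
        for (size_t q = 0; q < 9; ++q) {
            float eu = 3.0f * (ex[q] * ux[idx] + ey[q] * uy[idx]);
            float tmp = 1.0f + eu + 0.5f * eu * eu - 1.5f * usqr;
            tmp *= w[q] * rho[idx];
            neq[idx * 9 + q] = tmp;
        }
    }
}

// Same computation written with foreach ranges, as in the quoted version.
void computeNeqForeach(float[] neq, const float[] ux, const float[] uy,
                       const float[] rho, const float[] ex, const float[] ey,
                       const float[] w, const size_t N)
{
    foreach (idx; 0 .. N * N) {
        float usqr = ux[idx] * ux[idx] + uy[idx] * uy[idx];
        foreach (q; 0 .. 9) {
            float eu = 3.0f * (ex[q] * ux[idx] + ey[q] * uy[idx]);
            float tmp = 1.0f + eu + 0.5f * eu * eu - 1.5f * usqr;
            tmp *= w[q] * rho[idx];
            neq[idx * 9 + q] = tmp;
        }
    }
}

void main()
{
    enum size_t N = 4; // small illustrative grid, not the benchmark size

    // Illustrative 9-direction lattice constants (assumed, not from the thread).
    float[] ex = [0.0f, 1.0f, 0.0f, -1.0f, 0.0f, 1.0f, -1.0f, -1.0f, 1.0f];
    float[] ey = [0.0f, 0.0f, 1.0f, 0.0f, -1.0f, 1.0f, 1.0f, -1.0f, -1.0f];
    float[] w  = [4.0f / 9.0f,
                  1.0f / 9.0f, 1.0f / 9.0f, 1.0f / 9.0f, 1.0f / 9.0f,
                  1.0f / 36.0f, 1.0f / 36.0f, 1.0f / 36.0f, 1.0f / 36.0f];

    // Arbitrary input fields.
    auto ux  = new float[N * N];
    auto uy  = new float[N * N];
    auto rho = new float[N * N];
    foreach (i; 0 .. N * N) {
        ux[i]  = 0.01f * i;
        uy[i]  = -0.02f * i;
        rho[i] = 1.0f + 0.1f * i;
    }

    auto a = new float[N * N * 9];
    auto b = new float[N * N * 9];
    computeNeqFor(a, ux, uy, rho, ex, ey, w, N);
    computeNeqForeach(b, ux, uy, rho, ex, ey, w, N);

    // Both loop forms perform the same operations in the same order.
    assert(a == b);
    writeln("results identical: ", a == b);
}
```

To compare with @fastmath under LDC, both functions would need the attribute plus an import of ldc.attributes.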

