performance issues with SIMD function
Guillaume Piolat
first.name at gmail.com
Sat Nov 4 14:20:40 UTC 2023
On Friday, 3 November 2023 at 15:11:31 UTC, Bogdan wrote:
> Can anyone help me to understand what I am missing?
>
Your loop is likely dominated by the sin() calls, and the rest of the
loop isn't complicated enough for hand-written SIMD to outperform the
compiler.
What you could do is use the intrinsics to implement a _mm_sin_ps
that computes 4 sines at once; then you'll see an improvement at
scale.
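To make the idea concrete, here is a rough sketch of a 4-wide sine, written
in C with SSE2 intrinsics rather than D. The names sin4_sketch and
sin_array_sketch are made up for illustration, and the method (reduce to
[-pi/2, pi/2], then a short Taylor polynomial) is only one simple option,
not a tuned implementation. It assumes moderate input magnitudes, since
accuracy degrades as |x| grows.

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: approximate sin() for 4 floats at once.
   Range reduction: x = k*pi + r with r in [-pi/2, pi/2],
   so sin(x) = (-1)^k * sin(r). */
static __m128 sin4_sketch(__m128 x)
{
    const __m128 inv_pi = _mm_set1_ps(0.31830988618f);
    const __m128 pi     = _mm_set1_ps(3.14159265359f);

    /* k = round(x / pi), using the default round-to-nearest mode */
    __m128i k  = _mm_cvtps_epi32(_mm_mul_ps(x, inv_pi));
    __m128  kf = _mm_cvtepi32_ps(k);
    __m128  r  = _mm_sub_ps(x, _mm_mul_ps(kf, pi));

    /* Taylor polynomial up to r^9, evaluated in Horner form.
       Truncation error is a few 1e-6 at the ends of the interval. */
    __m128 r2 = _mm_mul_ps(r, r);
    __m128 p  = _mm_set1_ps(1.0f / 362880.0f);
    p = _mm_add_ps(_mm_mul_ps(p, r2), _mm_set1_ps(-1.0f / 5040.0f));
    p = _mm_add_ps(_mm_mul_ps(p, r2), _mm_set1_ps(1.0f / 120.0f));
    p = _mm_add_ps(_mm_mul_ps(p, r2), _mm_set1_ps(-1.0f / 6.0f));
    p = _mm_add_ps(_mm_mul_ps(p, r2), _mm_set1_ps(1.0f));
    p = _mm_mul_ps(p, r);

    /* Flip the sign where k is odd: move bit 0 of k into the sign bit. */
    __m128i sign = _mm_slli_epi32(k, 31);
    return _mm_xor_ps(p, _mm_castsi128_ps(sign));
}

/* Hypothetical driver: apply sin4_sketch over an array, 4 lanes per step. */
static void sin_array_sketch(const float *in, float *out, size_t n)
{
    assert(in != NULL && out != NULL);
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(out + i, sin4_sketch(_mm_loadu_ps(in + i)));

    if (i < n) {
        /* Tail: pad the leftover 1..3 values into a full vector. */
        float tmp[4] = {0.0f, 0.0f, 0.0f, 0.0f};
        for (size_t j = 0; i + j < n; ++j)
            tmp[j] = in[i + j];
        _mm_storeu_ps(tmp, sin4_sketch(_mm_loadu_ps(tmp)));
        for (size_t j = 0; i + j < n; ++j)
            out[i + j] = tmp[j];
    }
}
```

The gain comes from replacing four scalar sin() calls with one pass of
multiplies and adds across all lanes. Whether that is accurate enough
depends on your use case, so compare the results against the scalar
sin() over your actual input range before relying on it.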