performance issues with SIMD function

Guillaume Piolat first.name at gmail.com
Sat Nov 4 14:20:40 UTC 2023


On Friday, 3 November 2023 at 15:11:31 UTC, Bogdan wrote:
> Can anyone help me to understand what I am missing?
>

Your loop is likely dominated by the sin() calls, and the rest of 
the loop isn't complicated enough for hand-written SIMD to 
outperform what the compiler already generates.

What you could do is use the intrinsics to implement a _mm_sin_ps 
that computes 4 sines at once; then you should see an improvement 
once sin() itself is vectorized.
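
To give an idea of what such a 4-wide sine could look like, here 
is a minimal sketch, not a tuned implementation. It is written in 
C against the standard SSE2 header; the intrinsic names are the 
usual Intel ones, so a D version should look very similar, but that 
mapping is an assumption here. The names sin4_ps and sin_array are 
hypothetical. The sketch reduces x to r = x - k*pi with 
k = round(x/pi), evaluates a degree-11 Taylor polynomial on 
[-pi/2, pi/2], and flips the sign when k is odd. It is only meant 
for moderate |x| (accuracy degrades for very large inputs), and a 
minimax polynomial would do better than plain Taylor coefficients.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (x86 / x86-64 only) */
#include <stddef.h>

/* Hypothetical helper: approximates sin() for four floats at once.
   Assumes moderate |x| and the default round-to-nearest mode. */
static __m128 sin4_ps(__m128 x)
{
    const __m128 inv_pi = _mm_set1_ps(0.31830988618f);
    /* pi split in two parts so that k * pi_hi is exact for small k */
    const __m128 pi_hi  = _mm_set1_ps(3.140625f);
    const __m128 pi_lo  = _mm_set1_ps(9.67653589793e-4f);

    /* k = round(x / pi), r = x - k * pi, so r lies in [-pi/2, pi/2] */
    __m128i ki = _mm_cvtps_epi32(_mm_mul_ps(x, inv_pi));
    __m128  k  = _mm_cvtepi32_ps(ki);
    __m128  r  = _mm_sub_ps(x, _mm_mul_ps(k, pi_hi));
    r = _mm_sub_ps(r, _mm_mul_ps(k, pi_lo));

    /* sin(r) ~ r + r^3 * P(r^2), Taylor coefficients, Horner form */
    __m128 r2 = _mm_mul_ps(r, r);
    __m128 p  = _mm_set1_ps(-1.0f / 39916800.0f);
    p = _mm_add_ps(_mm_mul_ps(p, r2), _mm_set1_ps(1.0f / 362880.0f));
    p = _mm_add_ps(_mm_mul_ps(p, r2), _mm_set1_ps(-1.0f / 5040.0f));
    p = _mm_add_ps(_mm_mul_ps(p, r2), _mm_set1_ps(1.0f / 120.0f));
    p = _mm_add_ps(_mm_mul_ps(p, r2), _mm_set1_ps(-1.0f / 6.0f));
    p = _mm_add_ps(_mm_mul_ps(_mm_mul_ps(p, r2), r), r);

    /* sin(x) = (-1)^k * sin(r): move the low bit of k into the
       sign-bit position and XOR it into the result */
    __m128 sign = _mm_castsi128_ps(_mm_slli_epi32(ki, 31));
    return _mm_xor_ps(p, sign);
}

/* Fills out[i] with an approximation of sin(in[i]).
   n is assumed to be a multiple of 4. */
static void sin_array(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i += 4)
        _mm_storeu_ps(out + i, sin4_ps(_mm_loadu_ps(in + i)));
}
```

The gain only shows up if the caller feeds it four inputs per 
iteration; calling a scalar sin() on each lane and packing the 
results afterwards would leave the bottleneck in place. It is worth 
checking the error against the scalar sin() over your actual input 
range before relying on it.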

