x86 intrinsics for sale cheap

Guillaume Piolat first.last at spam.org
Thu Jun 1 09:30:02 UTC 2023


On Wednesday, 31 May 2023 at 17:44:21 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
>
> Instead focus on making your D code communicate to the backend 
> what you intend. Even if it doesn't do the job today, in 2 
> years time it could generate significantly better assembly.

For LDC, the fewest performance regressions usually come from 
using some form of LDC's __ir_pure. However, it makes compilation 
slower on large projects (up to 50ms, which is the cost of 
decoding a 1500x1500 JPEG ;) ).
https://github.com/ldc-developers/ldc/issues/4388

As a reminder of what intel-intrinsics does:
   - implements the semantics of the Intel intrinsics, up to AVX 
(AVX2 is WIP)
   - runs on DMD x86/x86_64 + GDC x86_64 + LDC x86/x86_64/arm64/arm32
   - provides a fallback for everything, even the SSE4.2 string 
instructions and rounding modes

Interestingly, if you use AVX intrinsics even without AVX 
instructions enabled, you can sometimes get a speedup thanks to 
the implicit loop unrolling.
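
To make the unrolling point concrete, here is a minimal sketch. It 
assumes the intel-intrinsics DUB package is a dependency, and 
addFloats is a hypothetical helper name, not something from the 
library. Each 256-bit operation handles 8 floats; when AVX is not 
enabled, the compiler typically has to lower it to narrower 
operations (for example two 128-bit SSE operations), so the loop 
body ends up unrolled without asking for it explicitly.

```d
// Sketch only: assumes the intel-intrinsics package is on the import path.
import inteli.avxintrin; // __m256 and the _mm256_* intrinsics

/// Hypothetical helper: element-wise dst = a + b, 8 floats per iteration.
void addFloats(float[] dst, const(float)[] a, const(float)[] b)
{
    assert(a.length == dst.length && b.length == dst.length);

    size_t i = 0;

    // Main loop: one 256-bit load/add/store per 8 elements.
    // Without AVX enabled this is emulated with narrower operations.
    for (; i + 8 <= dst.length; i += 8)
    {
        __m256 va = _mm256_loadu_ps(a.ptr + i); // unaligned load
        __m256 vb = _mm256_loadu_ps(b.ptr + i);
        _mm256_storeu_ps(dst.ptr + i, _mm256_add_ps(va, vb));
    }

    // Scalar tail for the remaining 0 to 7 elements.
    for (; i < dst.length; ++i)
        dst[i] = a[i] + b[i];
}
```

Whether this beats the 128-bit version depends on the compiler and 
target, so it is worth measuring rather than assuming.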


