x86 intrinsics for sale cheap
Guillaume Piolat
first.last at spam.org
Thu Jun 1 09:30:02 UTC 2023
On Wednesday, 31 May 2023 at 17:44:21 UTC, Richard (Rikki) Andrew
Cattermole wrote:
>
> Instead focus on making your D code communicate to the backend
> what you intend. Even if it doesn't do the job today, in 2
> years time it could generate significantly better assembly.
For LDC, the fewest performance regressions usually come from
using some form of LDC's __ir_pure; however, it makes compilation
slower on large projects (up to 50ms, which is the cost of decoding
a 1500x1500 JPEG ;) ).
https://github.com/ldc-developers/ldc/issues/4388
As a reminder of what intel-intrinsics does:
- implement the semantics of the Intel intrinsics, up to AVX
(AVX2 is WIP)
- on DMD x86/x86_64 + GDC x86_64 + LDC x86/x86_64/arm64/arm32
- supporting a fallback for everything, even the SSE4.1 string
instructions and rounding modes
Interestingly, if you use AVX intrinsics even without the AVX
instructions enabled, you might sometimes get a speedup
thanks to the implicit loop unrolling.
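
To illustrate, here is a minimal sketch of that kind of code, assuming
the intel-intrinsics DUB package is available as a dependency. The
helper name addArrays is made up for the example. The loop is written
once with 256-bit AVX intrinsics; when AVX is not enabled, each 256-bit
operation presumably ends up as a pair of 128-bit ones, which is
roughly what unrolling an SSE loop by two would give you.

```d
// Assumes the "intel-intrinsics" DUB package is a dependency.
import inteli.avxintrin; // __m256 and the _mm256_* intrinsics

// Hypothetical helper: dst[i] = a[i] + b[i] for equal-length slices.
void addArrays(float[] dst, const(float)[] a, const(float)[] b)
{
    assert(a.length == dst.length && b.length == dst.length);

    size_t i = 0;

    // Main loop: 8 floats per iteration through one 256-bit vector.
    // Unaligned loads/stores, so the slices need no special alignment.
    for (; i + 8 <= dst.length; i += 8)
    {
        __m256 va = _mm256_loadu_ps(a.ptr + i);
        __m256 vb = _mm256_loadu_ps(b.ptr + i);
        _mm256_storeu_ps(dst.ptr + i, _mm256_add_ps(va, vb));
    }

    // Scalar tail for the remaining 0 to 7 elements.
    for (; i < dst.length; ++i)
        dst[i] = a[i] + b[i];
}

void main()
{
    float[11] a, b, dst;
    foreach (i; 0 .. 11)
    {
        a[i] = i;
        b[i] = 2 * i;
    }

    addArrays(dst[], a[], b[]);

    // Covers both the vector loop (first 8) and the scalar tail (last 3).
    foreach (i; 0 .. 11)
        assert(dst[i] == 3 * i);
}
```

Whether this actually beats a plain SSE loop depends on the compiler
and target, so it is worth measuring rather than assuming.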