x86 intrinsics for sale cheap
Cecil Ward
cecil at cecilward.com
Wed May 31 18:04:22 UTC 2023
On Wednesday, 31 May 2023 at 17:44:21 UTC, Richard (Rikki) Andrew
Cattermole wrote:
> A concern here is that inline assembly is unlikely (if at all)
> to inline.
>
> So you're going to have to be pretty careful that what you do
> is actually worth the function call, because if it isn't simd,
> it just might not be doing enough work to justify using inline
> assembly.
>
> If you are able to get a backend to generate the instruction
> you want using regular D code, then you're good to go. As
> that'll inline.
>
> My general recommendation here is to not worry about specific
> instructions unless you really _really_ need to (very tiny
> percentage of code fits this, almost to the point of not being
> worth considering).
>
> Instead focus on making your D code communicate to the backend
> what you intend. Even if it doesn't do the job today, in 2
> years time it could generate significantly better assembly.
Understood and agreed. With GDC I can get functions that contain
inline asm to inline with no problems. As you say, if they did
not inline, the overhead of a call could wipe out all of the
benefit and the exercise would be pointless. I’ve written test
functions that call the instruction, and it all inlines perfectly,
with register usage interfaced very flexibly thanks to GCC/GDC’s
superb design. LDC would perhaps be even better, were it not for
the inline-asm syntax wishlist item mentioned earlier: the current
LDC would require me to rewrite all the asm so that it does not
use _named_ parameters within the asm body itself. That is
something I’d love to fix myself within LDC, but I don’t remotely
have the knowledge of compiler internals or the general expertise.
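To make the point about named parameters concrete, here is a minimal
sketch of the kind of wrapper I mean, assuming GDC's GCC-style
extended asm and an x86 CPU that supports POPCNT. The function name
popcnt32 and the operand names src and dst are made up for
illustration; this is not code from my actual library.

```d
// Build with GDC, e.g. gdc -O2 popcnt.d (GCC-style extended asm).
// popcnt32, src and dst are hypothetical names for this sketch.

/// Count the set bits in `value` with the x86 POPCNT instruction.
uint popcnt32(uint value)
{
    uint count;
    asm
    {
        // AT&T operand order: source first, destination second.
        // %[src] and %[dst] are the named operands declared below.
        "popcntl %[src], %[dst]"
        : [dst] "=r" (count)   // output: any general register
        : [src] "rm" (value)   // input: register or memory
        : "cc";                // POPCNT modifies the flags
    }
    return count;
}

void main()
{
    assert(popcnt32(0) == 0);
    assert(popcnt32(0xFF) == 8);
}
```

Because the constraints leave register choice to the compiler, an
optimised build is free to inline the wrapper and place the operands
wherever suits the calling code. The %[src] and %[dst] references are
the named parameters that I would have to rewrite as numbered
operands for the current LDC.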
As for worrying about individual instructions, that isn’t my
goal. This is partly a learning exercise and partly a way to make
the instructions available to anyone who decides that they want
them. I assume those users have enough experience to make that
decision on performance grounds; my part is to give them a
zero-overhead solution (unless D prevents me from doing so).