x86 intrinsics for sale cheap

claptrap clap at trap.com
Wed May 31 23:18:44 UTC 2023


On Wednesday, 31 May 2023 at 17:09:38 UTC, Cecil Ward wrote:
> On Wednesday, 31 May 2023 at 16:51:42 UTC, Cecil Ward wrote:
>> On Wednesday, 31 May 2023 at 16:45:35 UTC, max haughton wrote:
>>> On Wednesday, 31 May 2023 at 16:33:47 UTC, Cecil Ward wrote:
>>
>
> Ah, just followed that link. No that’s (solely?) SIMD, 
> something I was aware of and so I’m not duplicating that as I 
> haven’t gone near SIMD. The pext instruction would be one 
> instruction that I attacked some time ago, and that would 
> already be fine with ARM as there’s a pure D fallback, but 
> maybe I can find some native ARM equivalent if I study AArch64.
>
> So no, this would be something new. Non-SIMD insns for general 
> use. The smallest instructions might be something like andn if 
> I can keep to zero-overhead obviously, seeing as the benefit in 
> the instruction is so tiny anyway. But mind you I could have 
> done with it for graphics bit twiddling manipulation code.

If you tell LDC the right CPU target and turn on optimization, 
e.g.

"-O -mcpu=haswell"

It will use the andn instruction...

uint foo(uint a, uint b)
{
     return a & (b ^ 0xFFFFFFFF);
}

compiles to ---->

uint example.foo(uint, uint):
         andn    eax, edi, esi
         ret

So you will probably find the compiler is already doing what you 
want if you let it know it can target the right CPU architecture.
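
As a rough sketch of how you could wrap that up, here is the same 
idiom written with ~b instead of the xor. The name andNot is just 
something I made up for illustration, it is not an existing 
library function. The result is the same on any target; whether 
the backend actually emits andn or a separate not + and depends 
on the -mcpu you give it, so check the asm output for your own 
build rather than taking my word for it.

```d
import std.stdio : writefln;

// Hypothetical helper (name invented for this example): clears in `a`
// every bit that is set in `b`. Plain D, so it works on any target.
uint andNot(uint a, uint b) pure nothrow @nogc @safe
{
    return a & ~b; // same value as a & (b ^ 0xFFFFFFFF)
}

void main()
{
    // Evaluated at compile time via CTFE, so the result is checked
    // regardless of which instruction the backend picks.
    static assert(andNot(0xFF00FF00, 0x0F0F0F0F) == 0xF000F000);

    writefln("%08X", andNot(0xFF00FF00, 0x0F0F0F0F)); // prints F000F000
}
```

Build it with something like "ldc2 -O -mcpu=haswell" and look at 
the generated asm to see which instruction was picked.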

I've been writing asm for over 30 years, and the opportunities for 
beating modern compilers have become vanishingly small for pretty 
much everything except SIMD code. And tbh the differences 
between CPUs, i.e. different instruction latencies on different 
microarchitectures, mean it's pretty much pointless to chase a few 
percent here or there, since there's a good chance it'll be a few 
percent the other way on a different CPU.





More information about the Digitalmars-d mailing list