LCD inline assembly expressions

NaN divide at by.zero
Mon Dec 24 02:39:54 UTC 2018


On Monday, 24 December 2018 at 01:40:42 UTC, NaN wrote:
>
> I don't think there's any way to get around the temporary copy, 
> since it depends on knowing if 'a' is ever used after it's used 
> in the compare. And it doesn't seem like the optimiser can cull 
> it away in this case.

OK, I think I've figured it out...

__m128i _mm_cmpgt_epi32(__m128i a, __m128i b) {
   return __asm!__m128i("pcmpgtd $2,$0","=x,0,x",a,b);
}

Basically....

$0 is the return value; the constraint '=x' means it's the output 
and uses an xmm register
$1 is 'a'; the constraint '0' means this param uses the same 
register as $0
$2 is 'b'; the constraint 'x' means this uses an xmm register

It's also AT&T syntax, so the operands are reversed from what I'm 
used to: the source ($2) comes first and the destination ($0) 
comes last.

Although $1 is not written in the asm expression, it has been tied 
to $0 by the '0' constraint. So as far as the compiler is 
concerned, 'a' comes in on the same register that the output goes 
out in. Knowing this, it can create a temporary copy of 'a' when 
it needs to avoid trashing 'a'.

I've done some tests and if you do...

r = _mm_cmpgt_epi32(a,b)

it only creates the temporary if you use 'a' again afterwards.
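
To make the two cases concrete, here is a rough sketch of the kind of test described above. It assumes LDC on x86-64, and that __m128i is simply an alias for core.simd's int4 (the post doesn't show how it is defined). The names cmpOnly and cmpThenReuse are made up for illustration. The asserts only check the results of the compare; whether a temporary copy is actually emitted has to be checked by reading the generated assembly, for example from ldc2 -O -output-s.

```d
// Sketch only: assumes LDC targeting x86-64 (SSE2).
import core.simd;   // int4
import ldc.llvmasm; // __asm

alias __m128i = int4; // assumption: the original definition is not shown

__m128i _mm_cmpgt_epi32(__m128i a, __m128i b)
{
    // $0 = result ("=x"), $1 = a (tied to $0 by "0"), $2 = b ("x").
    // AT&T order is source, destination, so this computes a > b per lane.
    return __asm!__m128i("pcmpgtd $2,$0", "=x,0,x", a, b);
}

// Hypothetical caller: 'a' is not used after the compare, so its
// register can be overwritten in place.
__m128i cmpOnly(__m128i a, __m128i b)
{
    return _mm_cmpgt_epi32(a, b);
}

// Hypothetical caller: 'a' is used again after the compare, so the
// compiler has to preserve its value somewhere.
__m128i cmpThenReuse(__m128i a, __m128i b)
{
    __m128i r = _mm_cmpgt_epi32(a, b);
    return r & a; // keep only the lanes of 'a' where a > b
}

void main()
{
    __m128i a = [1, 5, -3, 7];
    __m128i b = [0, 5, 2, -7];

    // Lanes where a > b are all ones (-1), the others are zero.
    __m128i mask = cmpOnly(a, b);
    assert(mask.array == [-1, 0, 0, -1]);

    __m128i kept = cmpThenReuse(a, b);
    assert(kept.array == [1, 0, 0, 7]);
}
```

Comparing the assembly for the two callers is what shows whether the extra register copy appears only in the second one.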

So it's all working, I think.


