Any SIMD experts?

John Colvin via Digitalmars-d digitalmars-d at puremagic.com
Mon Dec 8 09:20:37 PST 2014


On Monday, 8 December 2014 at 17:05:09 UTC, John Colvin wrote:
> On Monday, 8 December 2014 at 16:32:50 UTC, Martin Nowak wrote:
>> I want to do bounds checking of 2 (4 on avx) ulongs (64-bit) 
>> at a time.
>>
>> ulong2 vval = [v0, v1];
>> ulong2 vlow = [low, low];
>> ulong2 vhigh = [high, high];
>>
>> int res = PMOVMSKB((vval >= vlow) & (vval < vhigh));
>>
>> I figured out sort of a solution, but it seems way too 
>> complicated, because there is only signed comparison.
>>
>> Usually (scalar) I'd use this, which makes use of unsigned 
>> wrap to save one conditional
>>
>> immutable size = cast(ulong)(high - low);
>> if (cast(ulong)(v0 - low) < size) {}
>> if (cast(ulong)(v1 - low) < size) {}
>>
>> over
>>
>> if (v0 >= low && v0 < high) {}
>>
>> Maybe this can be used on SIMD too (saturated sub or so)?
>>
>> -Martin
>
> Well gcc gives me:
>
> typedef unsigned long ulong4 __attribute__ ((vector_size (32)));
>
> ulong4 foo(ulong4 a, ulong4 l, ulong4 h)
> {
>     return (a >= l) & (a < h);
> }
>
>
> foo(unsigned long __vector, unsigned long __vector, unsigned long __vector):
> 	vmovdqa	.LC0(%rip), %ymm3
> 	vpsubq	%ymm3, %ymm0, %ymm0
> 	vpsubq	%ymm3, %ymm2, %ymm2
> 	vpsubq	%ymm3, %ymm1, %ymm1
> 	vpcmpgtq	%ymm0, %ymm2, %ymm2
> 	vpcmpgtq	%ymm0, %ymm1, %ymm1
> 	vpandn	%ymm2, %ymm1, %ymm0
> 	ret
> .LC0:
> 	.quad	-9223372036854775808
> 	.quad	-9223372036854775808
> 	.quad	-9223372036854775808
> 	.quad	-9223372036854775808

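Conceptually, the trick is that it subtracts 2^63 from every 
operand (the .LC0 constant, which amounts to flipping the sign 
bit). That maps the unsigned ordering onto the signed one, so the 
signed vpcmpgtq can stand in for the missing unsigned comparison. 
vpandn then combines the two results as !(a < l) & (a < h).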
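
For illustration, here is a scalar C sketch of the same idea, one 
lane at a time. The helper names (ult_via_signed, in_range, 
in_range_wrap) are made up for this example and come from neither 
gcc nor the original post. The casts to int64_t assume the usual 
two's complement conversion, which is what gcc does. in_range_wrap 
combines the bias with Martin's wrap-around idiom and additionally 
assumes low <= high; I have not checked what gcc generates for it.

```c
#include <stdbool.h>
#include <stdint.h>

/* 2^63: the value held in every lane of .LC0, seen as unsigned. */
#define SIGN_BIT UINT64_C(0x8000000000000000)

/* Unsigned a < b using only a signed comparison.
 * XOR with the sign bit yields the same bits as subtracting 2^63
 * modulo 2^64, which is what the vpsubq instructions do.
 * Assumption: uint64_t -> int64_t conversion is two's complement. */
static bool ult_via_signed(uint64_t a, uint64_t b)
{
    int64_t sa = (int64_t)(a ^ SIGN_BIT);
    int64_t sb = (int64_t)(b ^ SIGN_BIT);
    return sa < sb;
}

/* low <= v && v < high, mirroring vpcmpgtq, vpcmpgtq, vpandn. */
static bool in_range(uint64_t v, uint64_t low, uint64_t high)
{
    bool below_low  = ult_via_signed(v, low);   /* v < low  */
    bool below_high = ult_via_signed(v, high);  /* v < high */
    return !below_low && below_high;            /* the andn step */
}

/* Same check with the unsigned-wrap idiom: one subtraction per
 * value and a single comparison. Assumes low <= high. */
static bool in_range_wrap(uint64_t v, uint64_t low, uint64_t high)
{
    return ult_via_signed(v - low, high - low);
}
```

With low <= high both helpers should agree on every input; the 
wrap version trades the second comparison for a subtraction.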

