SIMD under LDC
12345swordy via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Mon Sep 4 18:11:29 PDT 2017
On Monday, 4 September 2017 at 23:06:27 UTC, Nicholas Wilson
wrote:
> On Monday, 4 September 2017 at 20:39:11 UTC, Igor wrote:
>> I found that I can't use __simd function from core.simd under
>> LDC
>
> Correct, LDC does not support the core.simd interface.
>
>> and that it has ldc.simd but I couldn't find how to implement
>> equivalent to this with it:
>>
>> ubyte16* masks = ...;
>> foreach (ref c; pixels) {
>> c = __simd(XMM.PSHUFB, c, *masks);
>> }
>>
>> I see it has shufflevector function but it only accepts
>> constant masks and I am using a variable one. Is this possible
>> under LDC?
>
> You have several options:
> * write a regular for loop and let LDC's optimiser take care of
> the rest.
>
> alias mask_t = ReturnType!(equalMask!ubyte16);
> pragma(LDC_intrinsic, "llvm.masked.load.v16i8.p0v16i8")
> ubyte16 llvm_masked_load(ubyte16* val, int alignment, mask_t
> mask, ubyte16 fallthru); // `align` is a D keyword, so the parameter is renamed
>
> ubyte16* masks = ...;
> foreach (ref c; pixels) {
> auto mask = equalMask!ubyte16(*masks, [-1,-1,-1, ...]);
> c = llvm_masked_load(&c,16,mask, [0,0,0,0 ... ]);
> }
>
> The second one might not work, because of type differences in
> llvm, but should serve as a guide to hacking the `cmpMask` IR
> code in ldc.simd to do what you want it to.
>
>> BTW. Shuffling channels within pixels using DMD simd is about
>> 5 times faster than with normal code on my machine :)
>
> Don't underestimate ldc's optimiser ;)
I have seen cases where the compiler fails to optimize for SIMD.
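
For anyone who wants to try the "regular loop" option, here is a minimal sketch of what PSHUFB does to one 16-byte block, written as plain D. The name `shuffleBytes` is made up for illustration and is not part of core.simd or ldc.simd. It assumes the usual PSHUFB semantics: the low four bits of each mask byte select a source byte, and a set high bit zeroes the output byte.

```d
/// Scalar reference for PSHUFB on one 16-byte block (hypothetical helper).
/// Each output byte is src[mask & 0x0F], or zero when the mask byte has
/// its high bit set.
ubyte[16] shuffleBytes(const ubyte[16] src, const ubyte[16] mask)
    pure nothrow @nogc @safe
{
    ubyte[16] result; // ubyte.init is 0, so zeroed lanes need no extra work
    foreach (i; 0 .. 16)
    {
        if ((mask[i] & 0x80) == 0)
            result[i] = src[mask[i] & 0x0F];
    }
    return result;
}
```

Whether LDC turns this loop into a single pshufb is not guaranteed. It is worth building with something like -O3 -mattr=+ssse3 and checking the generated assembly before relying on it.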