ldc nvvm GPU intrinsics good news

Bruce Carneal bcarneal at gmail.com
Fri Mar 5 00:03:26 UTC 2021


After changing the first line of the 
runtime/import/ldc/gccbuiltins_nvvm.di file from a current LDC 
build to '@compute(CompileFor.hostAndDevice) module ...' and 
adding an 'import ldc.dcompute;' line, that file apparently gives 
access to all manner of GPU intrinsics.

I've only tried it out on __syncthreads and __nvvm_shfl_down_i32, 
but both "invocations" of those LDC_intrinsics resulted in the 
expected single PTX instruction in the dcompute .ptx output.
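
For anyone wanting to try the same thing, here is a minimal sketch of 
a kernel that calls both intrinsics. The module name, kernel name and 
parameters are made up for illustration. It also assumes that 
__nvvm_read_ptx_sreg_tid_x is among the generated builtins and that 
both intrinsics have the signatures shown, so check the .di file in 
your own build before relying on it.

```d
@compute(CompileFor.deviceOnly) module warp_demo; // hypothetical module name

import ldc.dcompute;         // @compute, @kernel, GlobalPointer
import ldc.gccbuiltins_nvvm; // the edited intrinsics file described above

// Hypothetical kernel, meant to be launched as a single 32-thread block:
// each thread fetches the value held by the lane `delta` positions above it.
@kernel void shiftDown(GlobalPointer!int data, int delta)
{
    // Assumption: this builtin (llvm.nvvm.read.ptx.sreg.tid.x) is present
    // in the generated file; it reads the thread index within the block.
    immutable tid = __nvvm_read_ptx_sreg_tid_x();

    int v = data[tid];

    // Assumed signature (int, int, int). The third argument 0x1f clamps
    // the source lane to a full 32-lane warp.
    immutable shifted = __nvvm_shfl_down_i32(v, delta, 0x1f);

    // Not needed for correctness here, since each thread writes only its
    // own element; included just to show the barrier call.
    __syncthreads();

    data[tid] = shifted;
}
```

Building with something like 'ldc2 -mdcompute-targets=cuda-350 ...' 
(pick the compute capability that matches your card) should let you 
confirm in the emitted .ptx file that each call becomes one 
instruction.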

There are over 600 pragma(LDC_intrinsic, "llvm.nvvm.xxxxxx") 
builtins in the gccbuiltins_nvvm.di file so, of course, I've not 
hand tested them all, but it looks very promising.

If you're working with dcompute on an OpenCL device, I'd love to 
hear if something similar works for your use cases or if you've 
found another way forward.



More information about the digitalmars-d-ldc mailing list