GPGPU progress
Nicholas Wilson via Digitalmars-d
digitalmars-d at puremagic.com
Thu May 18 03:26:13 PDT 2017
On Thursday, 18 May 2017 at 09:07:38 UTC, Nicholas Wilson wrote:
> When ldc runs you will get a kernels_cudaxxx_yy.ptx (where xxx
> is the CUDA compute capability specified on the command line
> and yy is 32 or 64 for 32 or 64bit) which should fit somewhere
> into your existing C++ pipeline.
Whoops, that assumes you have a CUDA driver API pipeline in your
C++ code, which, since you're asking, I suspect you don't.
If you're using the `kernel<<<...>>>(args)` form to launch your
kernels, then you're going to have a lot more work to do in D,
because you'll need to use the driver API
(http://docs.nvidia.com/cuda/cuda-driver-api/#axzz4hQLA0Zdm).
You'll need to:
* get a device
* create a context from it
* get a stream on that context
* load the PTX module (possibly linking it with other modules to
resolve missing symbols)
* compile it for the device
* then launch a kernel from that module on that device, by name,
passing the arguments in a void*[]
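For reference, the steps above look roughly like this with the raw driver API. This is only a sketch: the PTX file name ("kernels_cuda350_64.ptx"), the kernel name ("myKernel"), and the launch geometry are all placeholder assumptions, not something dcompute prescribes.

```cuda
#include <cuda.h>
#include <stdio.h>

// Bail out with the error code on any driver-API failure.
#define CHECK(call) do { CUresult r_ = (call); if (r_ != CUDA_SUCCESS) { \
    fprintf(stderr, "CUDA driver error %d at line %d\n", r_, __LINE__);  \
    return 1; } } while (0)

int main(void) {
    CUdevice   dev;
    CUcontext  ctx;
    CUstream   stream;
    CUmodule   mod;
    CUfunction fn;

    CHECK(cuInit(0));
    CHECK(cuDeviceGet(&dev, 0));          // get a device
    CHECK(cuCtxCreate(&ctx, 0, dev));     // create a context from it
    CHECK(cuStreamCreate(&stream, 0));    // get a stream on that context

    // Load the PTX module; the driver JIT-compiles it for the device.
    // (cuLinkCreate/cuLinkAddFile would handle linking several modules.)
    CHECK(cuModuleLoad(&mod, "kernels_cuda350_64.ptx"));

    // Look the kernel up by name.
    CHECK(cuModuleGetFunction(&fn, mod, "myKernel"));

    int n = 1024;
    CUdeviceptr buf;
    CHECK(cuMemAlloc(&buf, n * sizeof(float)));

    void *args[] = { &buf, &n };          // arguments go in a void*[]
    CHECK(cuLaunchKernel(fn, n / 256, 1, 1,  // grid dimensions
                         256, 1, 1,          // block dimensions
                         0, stream,          // shared memory, stream
                         args, NULL));
    CHECK(cuStreamSynchronize(stream));

    cuMemFree(buf);
    cuModuleUnload(mod);
    cuCtxDestroy(ctx);
    return 0;
}
```

Note that the entire argument list crosses the API boundary in a single cuLaunchKernel call; that's the part dcompute can generate for you from the kernel's D signature.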
The sad thing is that it's still nicer than OpenCL, because in
OpenCL you have to pass the kernel args (with their sizes) one by
one to a function.
Hence why I want to automate as much of that shit as is possible.
I hope to have that done ASAP, but I don't have hardware set up
to test CUDA at the moment (I have one on my Windows box but I
don't have a dev environment set up there), and I'll be working
on OpenCL at the same time (and there's only so much horrible API
I can take in a day).
I'll be working on dcompute part-part-time next semester though,
so I should be able to get a fair bit done, and quite a few
others are interested, so that'll speed things up a bit.