Has anyone used D with Nvidia's Cuda?

Dmitri Makarov via Digitalmars-d digitalmars-d at puremagic.com
Sat Apr 4 00:30:19 PDT 2015


The programmer describes the computations to be done on a device in
an OpenCL-like syntax and passes that description, as a string, to
the CLOP compiler through a mixin expression. The compiler returns D
code that includes the generated OpenCL kernel and all the boilerplate
code. The computations can refer to variables declared in the host
application; CLOP generates the necessary CL buffers and kernel
arguments. Here's an example:

// use CLOP DSL to generate OpenCL kernel and API calls.
mixin( compile(
q{
  int max3( int a, int b, int c )
  {
    int k = a > b ? a : b;
    return k > c ? k : c;
  }
  Antidiagonal NDRange( c : 1 .. cols, r : 1 .. rows ) {
    F[c, r] = max3( F[c - 1, r - 1] + S[c + cols * r],
                    F[c - 1, r] - penalty,
                    F[c, r - 1] - penalty );
  } apply( rectangular_blocking( 8, 8 ) )
} ) );

This implements the Needleman-Wunsch algorithm in CLOP. It says that
the computation is to be done over a 2D index space, 1..cols by
1..rows. It requires an anti-diagonal synchronization pattern, meaning
that the elements on every anti-diagonal of the index space can be
processed in parallel, but there is a global synchronization point
between consecutive anti-diagonals. The user also requests that this
be optimized using rectangular blocking. The variables cols, rows, S,
F and penalty are normal D variables declared and defined in the
application that contains the above mixin statement.
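
To make the anti-diagonal pattern concrete, here is a minimal
sequential sketch in plain D of the order in which the cells get
filled. It is not CLOP output and involves no OpenCL. The function
names (relax, fillAntidiagonal, fillRowMajor) are made up for
illustration, and two things are assumptions on my part: that
F[c, r] maps to the linear index c + cols * r (mirroring the S
index in the example), and that the ranges 1 .. cols and 1 .. rows
exclude their upper bound, as D ranges do.

```d
import std.algorithm : max, min;

// One cell update, same recurrence as in the CLOP example.
// Assumption: F[c, r] is stored at linear index c + cols * r.
void relax(int[] F, const int[] S, int c, int r, int cols, int penalty)
{
    immutable diag = F[(c - 1) + cols * (r - 1)] + S[c + cols * r];
    immutable left = F[(c - 1) + cols * r] - penalty;
    immutable up   = F[c + cols * (r - 1)] - penalty;
    F[c + cols * r] = max(diag, max(left, up));
}

// Visit cells one anti-diagonal at a time. Row 0 and column 0 of F
// must already hold the boundary values.
void fillAntidiagonal(int[] F, const int[] S, int rows, int cols, int penalty)
{
    // All cells with the same d = r + c lie on one anti-diagonal.
    foreach (d; 2 .. rows + cols - 1)
    {
        immutable rLo = max(1, d - (cols - 1));
        immutable rHi = min(rows - 1, d - 1);
        // The iterations of this inner loop do not depend on each
        // other, so a device could run them in parallel.
        foreach (r; rLo .. rHi + 1)
            relax(F, S, d - r, r, cols, penalty);
        // Moving on to the next d corresponds to the global
        // synchronization point between anti-diagonals.
    }
}

// Conventional row-by-row order, used only as a cross-check.
void fillRowMajor(int[] F, const int[] S, int rows, int cols, int penalty)
{
    foreach (r; 1 .. rows)
        foreach (c; 1 .. cols)
            relax(F, S, c, r, cols, penalty);
}

unittest
{
    enum rows = 4, cols = 5, penalty = 2;
    auto S = new int[rows * cols];
    foreach (i, ref s; S)
        s = cast(int)(i * 7 % 5) - 2;        // arbitrary scores
    auto F = new int[rows * cols];
    foreach (c; 0 .. cols) F[c] = -c * penalty;         // top row
    foreach (r; 0 .. rows) F[cols * r] = -r * penalty;  // left column
    auto G = F.dup;
    fillAntidiagonal(F, S, rows, cols, penalty);
    fillRowMajor(G, S, rows, cols, penalty);
    assert(F == G);   // both orders respect the same dependencies
}
```

Both orders give the same table because each cell only reads its
left, upper and upper-left neighbours, all of which sit on earlier
anti-diagonals. Something like dmd -unittest -main -run file.d
should run the check.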

More examples are in my GitHub repository,
https://github.com/dmakarov/clop
but the project is at a very early stage and not yet usable.
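
In the meantime, the mechanism CLOP relies on (a function evaluated
at compile time that returns D source, which mixin then pastes into
the caller's scope) can be tried with plain D. The sketch below is
only an illustration of that mechanism: generateLoop is a
hypothetical stand-in, not CLOP's compile function, and it emits an
ordinary host loop instead of an OpenCL kernel and API calls.

```d
// Hypothetical stand-in for a DSL compiler. It is evaluated at
// compile time (CTFE) and returns D source code as a string.
string generateLoop(string loopBody)
{
    return "foreach (i; 0 .. data.length) { " ~ loopBody ~ " }";
}

unittest
{
    int[] data = [1, 2, 3];   // ordinary host variables
    int scale = 10;

    // The generated code is inserted at this point, so it can refer
    // to `data` and `scale` by name. This is what lets a DSL snippet
    // use variables declared in the host application.
    mixin(generateLoop(q{ data[i] *= scale; }));

    assert(data == [10, 20, 30]);
}
```

The q{ ... } token string keeps the embedded snippet readable, and
because the string is built at compile time there is no run-time
parsing cost.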

Regards,

Dmitri


On Sat, Apr 4, 2015 at 9:00 AM, Vlad Levenfeld via Digitalmars-d
<digitalmars-d at puremagic.com> wrote:
> On Saturday, 4 April 2015 at 06:36:49 UTC, Dmitri Makarov wrote:
>>
>> On Saturday, 4 April 2015 at 02:49:16 UTC, Walter Bright wrote:
>>>
>>> http://www.nvidia.com/object/cuda_home_new.html
>>
>>
>> No, but I'm building an embedded dsl that will allow to generate
>> opencl kernels and supporting boilerplate opencl api calls at
>> compile-time. it's called clop (openCL OPtimizer). It uses
>> derelict.opencl bindings.
>
>
> How would it be used? At the client level, I mean.

