Communication between D and C with dynamic arrays

Sun Aug 10 08:42:20 PDT 2014

On Sunday, 10 August 2014 at 14:26:29 UTC, seany wrote:
> In D, arrays are dynamic. However, to the best of my knowledge, 
> in C, they are static.
>
> I am having difficulty in imagining how to send D arrays to a C 
> function.
>
> My first idea was to make a pointer to the array, then find the 
> size of the array, which itself is an array, therefore take the 
> pointer to it, then get the dimension of the array, which is an 
> integer, and send the trio of two pointers and the dimension to 
> the C code.
>

In D, there are both dynamic and static arrays. The static ones are 
those whose length is known at compile time, for example:

     int[10] my_numbers;

These are treated as value types, not references. When a function 
takes a static array, it receives a copy of the entire array, not 
a reference to it:

     void foo(int[4] data);

This has no direct equivalent in C, where an array parameter 
decays to a pointer to its first element.
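
To make the value semantics concrete, here is a minimal sketch. The 
function names `byValue` and `byRef` are made up for this 
illustration; `ref` is the usual way to avoid the copy:

```d
import std.stdio;

// Receives a copy of the four ints; changes stay local to the function.
void byValue(int[4] data)
{
    data[0] = 99;
}

// `ref` passes the caller's array itself, so changes are visible outside.
void byRef(ref int[4] data)
{
    data[0] = 99;
}

void main()
{
    int[4] numbers = [1, 2, 3, 4];

    byValue(numbers);
    writeln(numbers);   // [1, 2, 3, 4]

    byRef(numbers);
    writeln(numbers);   // [99, 2, 3, 4]
}
```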

Then there are dynamic arrays. Under the hood, a dynamic array 
with elements of type T is actually a struct (well, it behaves as 
if it is one):

     struct DynamicArray {
         size_t length;
         T* ptr;
     }

Here T is the element type. (The length and ptr members might be 
in the opposite order, so don't rely on the layout; use the 
`length` and `ptr` properties instead.) You can always access the 
length and pointer of both types of arrays:

     void main() {
         import std.stdio;
         int[2] static_array;
         int[]  dynamic_array;
         writeln(static_array.length);     // 2
         writeln(dynamic_array.length);    // 0
         dynamic_array ~= [1,2,3,4];       // append elements
         writeln(dynamic_array.length);    // 4
         writeln(dynamic_array);           // [1, 2, 3, 4]
         // you can even set the length
         dynamic_array.length = 2;
         writeln(dynamic_array);           // [1, 2]
         // growing fills the array with default value
         dynamic_array.length = 4;
         writeln(dynamic_array);           // [1, 2, 0, 0]

         // slicing
         // dyn.a. now points to stat.a.
         dynamic_array = static_array[];
         // dynamic_array.length == static_array.length
         // dynamic_array.ptr    == static_array.ptr
     }

The details of arrays in D are described here:
http://dlang.org/arrays

As for communicating with C, it depends.

Some C functions only take a pointer to an array. They either 
already know the length from somewhere else, or they use a 
special value in the array as a signal that the array ends there. 
The latter is typically used for strings, which are defined to 
end with a 0 byte. For passing arrays to these C functions, you 
can use the `ptr` property; alternatively, you can take the 
address of the first element.

     extern(C) void my_c_func(int* data);
     int[] a = [1,2,3,4,5];
     my_c_func(a.ptr);
     my_c_func(&(a[0]));

Of course, you may need to make sure that the array actually 
contains the magic value at its end. For strings, this can be 
done with std.string.toStringz:
http://dlang.org/phobos/std_string.html#.toStringz
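
As a small sketch of the string case, the following passes a D 
string to C's `strlen`, which relies on the terminating 0 byte. It 
uses only `toStringz` and the `core.stdc.string` binding that ships 
with D; the variable names are just for illustration:

```d
import core.stdc.string : strlen;
import std.stdio;
import std.string : toStringz;

void main()
{
    // A D string is a slice (length + pointer); in general nothing
    // guarantees a 0 byte after its last character.
    string name = "hello";

    // toStringz yields a pointer to a zero-terminated version of the string.
    const(char)* cName = name.toStringz;

    writeln(strlen(cName));   // 5
}
```

Keep a reference to the result on the D side for as long as the C 
code uses the pointer, otherwise the garbage collector may reclaim 
the memory.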

In most cases, however, the C function also accepts the length of 
the array. For these, you need to use the `length` property:

     extern(C) void my_other_c_func(int* data, size_t length);
     int[] b = [6,7,8,9];
     my_other_c_func(b.ptr, b.length);
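
Here is a runnable sketch of the pointer-plus-length pattern. 
`sum_ints` is a hypothetical function; to keep the example 
self-contained it is written in D with C linkage, whereas in a real 
program it would live in a .c file and only be declared in D:

```d
import std.stdio;

// Stand-in for a C function with the C signature
//     int sum_ints(const int *data, size_t length);
extern(C) int sum_ints(const(int)* data, size_t length)
{
    int total = 0;
    foreach (i; 0 .. length)
        total += data[i];   // plain pointer indexing, no bounds check
    return total;
}

void main()
{
    int[] dynamicArray = [6, 7, 8, 9];
    int[3] staticArray = [1, 2, 3];

    // Both kinds of arrays offer .ptr and .length.
    writeln(sum_ints(dynamicArray.ptr, dynamicArray.length));  // 30
    writeln(sum_ints(staticArray.ptr, staticArray.length));    // 6
}
```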

> Now this raises two questions:
>
> 1. I assume that D allocates a particular space to the array, 
> say N elements, and up to these N elements, you can increment 
> the pointers, to jump to next array element.
>
> What happens if the N element space is used up? Does the 
> pointer incrementing method break down?

Yes, the array elements are stored next to each other in memory, 
without gaps. By default, D checks whether you are still inside 
the bounds of the array, and raises an error if you access an 
element outside of the bounds. But this only works if you access 
the array directly using an index, or slicing:

     int[] a = [10,11,12,13];
     writeln(a[0]);         // 10
     writeln(a[3]);         // 13
     //writeln(a[4]);       // ERROR
     // slicing: indices 1 (incl.) up to 3 (excl.)
     writeln(a[1 .. 3]);    // [11, 12]
     // length is represented by $
     writeln(a[2 .. $]);    // [12, 13]
     //writeln(a[0 .. 5]);  // ERROR

You mentioned incrementing a pointer. This does work, just like 
in C, but if you do this, you are responsible for checking the 
array bounds. The language cannot help you there. If you access 
something outside of the array, it's undefined behaviour:

     int[] a = [20,30,40,50];
     int* b = a.ptr;
     writeln(*b);           // 20
     writeln(b[0]);         // 20
     writeln(b[0 .. 4]);    // [20, 30, 40, 50]
     //writeln(b[0 .. $]);  // doesn't compile, length is unknown
     b++;
     writeln(b[0]);         // 30
     //writeln(b[100]);     // ???
     // might print garbage, might crash, might eat your harddisk...
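
If you get a bare pointer and know the length from elsewhere (for 
example from a C library), you can get the bounds checks back by 
slicing the pointer once. A sketch, with `asSlice` being a made-up 
helper name and `malloc` standing in for the C library:

```d
import core.stdc.stdlib : free, malloc;
import std.stdio;

// Wraps a pointer and an element count in a D slice. Nothing is copied;
// the caller must make sure `count` really matches the memory behind `p`.
int[] asSlice(int* p, size_t count)
{
    return p[0 .. count];
}

void main()
{
    enum count = 4;
    int* raw = cast(int*) malloc(count * int.sizeof);
    if (raw is null)
        return;
    scope(exit) free(raw);    // the memory still belongs to C

    int[] view = asSlice(raw, count);
    view[] = 0;               // initialise all elements
    view[3] = 42;
    writeln(view);            // [0, 0, 0, 42]
    writeln(view.length);     // 4
    // view[4] = 1;           // out of bounds, caught while bounds checks are enabled
}
```

Note that appending to such a slice with `~=` does not grow the C 
buffer; D allocates a new array and copies the elements.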

>
> 2. In D pointers are being converted to void * as I read in the 
> reply to another post of mine. I don't remember in my knowledge 
> of C, that they are accepted in C, are they? Do I have to make 
> a type retrospection every time I send something to C?
>
> Any help is appreciated.

I think this is a misunderstanding. If you take a pointer to a 
variable of type `T` (or to an element of an array of `T`s), the 
pointer's type is `T*`. This is just like in C. `void*` is a 
special pointer type that you can use if you don't already know 
the real type it is pointing to, or if it is supposed to be 
opaque. This also exists in C.

Which pointer type you need to use when calling a C function 
depends of course on that function: Just use whatever the 
function accepts.
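
A short sketch of how `void*` behaves in practice, using C's 
`memcpy` (declared in `core.stdc.string`), which takes `void*` 
parameters. The variable names are only illustrative:

```d
import core.stdc.string : memcpy;
import std.stdio;

void main()
{
    int[] source = [1, 2, 3, 4];
    int[] target = new int[](source.length);

    // A typed pointer such as int* converts to void* implicitly.
    // memcpy counts bytes, hence length * int.sizeof.
    memcpy(target.ptr, source.ptr, source.length * int.sizeof);
    writeln(target);          // [1, 2, 3, 4]

    // Going back from void* to a typed pointer needs an explicit cast.
    void* opaque = source.ptr;
    int* typed = cast(int*) opaque;
    writeln(typed[1]);        // 2
}
```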

>
> PS: is there a built in size operator for arrays in D?

For the `length` property (the number of elements), see above. 
There is also the `sizeof` property, which returns the actual 
size that a variable occupies in memory. But be careful: for 
dynamic arrays, `sizeof` returns either 8 or 16, depending on 
whether you are compiling a 32-bit or a 64-bit program. This is 
because, as explained above, a dynamic array variable is 
internally just a pair of length and pointer.

     int[] a;
     int[16] b;
     writeln(a.sizeof);    // 16 (64-bit), 8 (32-bit)
     writeln(b.sizeof);    // 64
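
If what you actually need is the number of bytes taken up by the 
elements (for instance to pass to a C function that counts in 
bytes), multiply the length by the element size. A sketch, with 
`byteSize` being a hypothetical helper and not a library function:

```d
import std.stdio;

// Bytes occupied by the elements a slice refers to,
// not by the slice variable itself.
size_t byteSize(T)(T[] arr)
{
    return arr.length * T.sizeof;
}

void main()
{
    int[] a = [1, 2, 3, 4, 5];
    int[16] b;

    writeln(a.sizeof == (int[]).sizeof);  // true: just length and pointer
    writeln(byteSize(a));                 // 20
    writeln(byteSize(b[]));               // 64
    writeln(b.sizeof);                    // 64
}
```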