Trying to understand multidimensional arrays in D

Wed Jan 25 19:02:32 PST 2017

On Thursday, January 26, 2017 01:47:53 Profile Anaysis via Digitalmars-d-
learn wrote:
> I'm a bit confused by how D does arrays.
>
> I would like to create a array of matrices but I do not seem to
> get the correct behavior:
>
>      int[][4][4] matrix_history;

Like in C/C++, types are mostly read outward from the variable name in D. In
both C/C++ and D,

int* foo;

is a pointer to an int. It's read outward from the variable name, so you get
the pointer and then what it points to. Similarly,

int** foo;

is a pointer to a pointer to an int. In C/C++, a static array would be
written like

int arr[4][3];

and again, it's read outward from the variable name. It's a static array of 4 static
arrays of 3 ints. This gets increasingly confusing as the types get more
complicated, but it's critical for understanding how function pointers are
written in C/C++. For D, it's a lot less critical, because we have a cleaner
function pointer declaration syntax, but the same basic rules mostly apply
(const, immutable, and shared are where the rules start to bend a bit, but
with just pointers and arrays, they're pretty straightforward and
consistent). That

int arr[4][3];

in C/C++ can then be accessed like so

arr[3][2] = 5;

and that would get the 4th element of the outer array and the 3rd element of
that inner array (the last element of each, since indices start at 0)
without exceeding the bounds of the array. D follows the same declaration
rules except that it has the array bounds all on the left-hand side of the
variable. So, in C/C++, you have

int arr[4][3];

whereas in D, the same array would be

int[3][4] arr;

and you would still access it like so

arr[3][2] = 5;

without exceeding the bounds of the array, whereas

arr[2][3] = 5;

_would_ exceed the bounds of the array, because each inner array has 3
elements in it, and that asks for the 4th.
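
To make that concrete, here's a small sketch you can compile and run. The
helper names outerLength and innerLength are just made up for illustration;
they aren't from your code or from Phobos.

```d
// Hypothetical helpers, only here to make the two lengths visible.
size_t outerLength()
{
    int[3][4] arr;        // 4 static arrays of 3 ints each
    return arr.length;    // length of the outer array
}

size_t innerLength()
{
    int[3][4] arr;
    return arr[0].length; // length of each inner array
}

void main()
{
    int[3][4] arr;

    // Indexing once peels off the outermost (right-most) dimension.
    static assert(is(typeof(arr[0]) == int[3]));

    assert(outerLength() == 4);
    assert(innerLength() == 3);

    arr[3][2] = 5;        // last outer element, last inner element: in bounds
    assert(arr[3][2] == 5);

    // arr[2][3] = 5;     // out of bounds: the inner arrays only have indices 0 .. 2
}
```

So, the sizes in the declaration appear in the reverse order of the indices
used to access the elements.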

This tends to be very confusing at first, because most folks expect the
sizes and the indices to always be in the same order, when they're not. They
are so long as the sizes are always on the right-hand side, which occurs
with dynamic arrays, but in D, the sizes of static arrays go on the left.
C/C++ would have the exact same ordering problem as D if it put the sizes on
the left, because it uses basically the same rules for how types are
written. But it puts them on the right, separating them from the type, which
makes the indices clearer but splits the type in two. So, both approaches
have their pros and cons.

In any case, the idea that the type is read outward from the variable name
extends to types in general. In particular, if you have

int[][4][4] arr;

as in your example, you have a static array of 4 static arrays of 4 dynamic
arrays of int. You can append to the innermost dynamic arrays

arr[3][3] ~= [1, 2, 3];

but you can't append to arr. If you want a dynamic array of static arrays,
then you need to do

int[4][4][] arr;

Then you can append a 4x4 static array to arr. However, your attempts at
creating a static array were not actually creating static arrays.

auto arr = new int[4];

and

auto arr = new int[](4);

both allocate a dynamic array of length 4. The code semantics are identical.
However, once we go beyond one dimension, it starts mattering - and getting
confusing. Take this

auto arr = new int[][](4, 5);
static assert(is(typeof(arr) == int[][]));
assert(arr.length == 4);

arr is a dynamic array of length 4 that contains dynamic arrays of length 5
of int. This on the other hand

auto arr = new int[4][](5);
static assert(is(typeof(arr) == int[4][]));
assert(arr.length == 5);

makes it so that arr is a dynamic array of length 5 that contains static
arrays of length 4 of int.

auto arr = new int[4][5];
static assert(is(typeof(arr) == int[4][]));
assert(arr.length == 5);

has the exact same semantics. So, in these last two forms, the right-most
number is the length of the outer, dynamic array (whereas with more than one
number in the parens, the first one is the outer length), and whether the
interior is more dynamic arrays or static arrays depends on whether the
numbers are between the brackets or the parens.
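
If it helps, here are the three forms side by side in one program, along
with the inner lengths. This is only a sketch, and the variable names a, b,
and c are mine.

```d
void main()
{
    // Sizes in the parens: every dimension is a dynamic array.
    auto a = new int[][](4, 5);
    static assert(is(typeof(a) == int[][]));
    assert(a.length == 4 && a[0].length == 5);

    // Size in the brackets: that dimension is a static array.
    auto b = new int[4][](5);
    static assert(is(typeof(b) == int[4][]));
    assert(b.length == 5 && b[0].length == 4);

    // Same result as b, with the outer length in the last pair of brackets.
    auto c = new int[4][5];
    static assert(is(typeof(c) == typeof(b)));
    assert(c.length == 5 && c[0].length == 4);

    // The inner dynamic arrays of a can grow independently of each other.
    a[0] ~= 1;
    assert(a[0].length == 6);
    assert(a[1].length == 5);
}
```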

Another thing to note is that when you have int[][], it is a dynamic array,
whereas int[4][4] is a static array. So, whenever you see the compiler give
you the type int[][], it's talking about a dynamic array, not a static
array. The numbers have to be there for it to be a static array. When
looking at the type of an array (as opposed to an expression using new),
numbers between the brackets mean a static array, whereas a lack of
numbers means a dynamic array, and the type of a dynamic array does not
change depending on its length.
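
You can also ask the compiler directly. The following is a minimal sketch
that assumes nothing beyond isStaticArray and isDynamicArray from
std.traits; the variable names are just for illustration.

```d
import std.traits : isDynamicArray, isStaticArray;

// The outermost (right-most) brackets decide what the type as a whole is.
static assert(isDynamicArray!(int[][]));
static assert(isStaticArray!(int[4][4]));
static assert(isStaticArray!(int[][4][4]));   // static arrays holding dynamic arrays
static assert(isDynamicArray!(int[4][4][]));  // dynamic array of 4x4 static arrays

void main()
{
    int[] a = [1, 2, 3];
    int[] b = [1, 2, 3, 4, 5];

    // A dynamic array has the same type regardless of its length.
    static assert(is(typeof(a) == typeof(b)));
    assert(a.length != b.length);

    // Static arrays of different sizes are different types.
    static assert(!is(int[3] == int[5]));
}
```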

Also, even if you had declared matrix_history correctly

int[4][4][] matrix_history;

this code would be wrong

> matrix_history ~= new int[][](4,4);

because new int[][](4, 4) is allocating a dynamic array of dynamic arrays of
ints, not a static array of 4 static arrays of 4 ints.

In addition, AFAIK, you can't just new up a static array of 4 static arrays
of int. You can new up dynamic arrays but not static arrays. The static
arrays need to be in something to be on the heap. But that's not really what
you wanted anyway. Take a simpler example.

int[] arr;
arr ~= 5;
arr ~= 42;

Note that you're not newing up the 5 or the 42. If you were newing up the
ints, you'd actually have

int*[] arr;

So, with your matrix_history, when you append a static array to it, you're
just appending a value - either a variable or an array literal. e.g.

int[4][4] sa;
matrix_history ~= sa;

or

matrix_history ~= [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]];
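
As a rough sketch of what that looks like in practice (the values are
arbitrary, and this is not meant to be your actual program), note that
appending a static array copies it, because static arrays are value types:

```d
void main()
{
    int[4][4][] matrix_history;   // dynamic array of 4x4 static arrays

    int[4][4] sa;                 // all elements default to 0
    sa[1][2] = 7;
    matrix_history ~= sa;         // appends a copy of the value

    sa[1][2] = 99;                // does not affect the copy already stored
    assert(matrix_history.length == 1);
    assert(matrix_history[0][1][2] == 7);

    matrix_history ~= sa;         // second snapshot, with the new value
    assert(matrix_history.length == 2);
    assert(matrix_history[1][1][2] == 99);
}
```

So, each entry in matrix_history is its own independent snapshot, and
modifying sa afterwards does not change what was already appended.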

Well, hopefully that's not too much information at once, and hopefully it
helps. But I'd suggest reading

https://dlang.org/spec/arrays.html
http://ddili.org/ders/d.en/arrays.html
http://dlang.org/d-array-article.html

to try and better understand arrays in general. The last one applies more to
dynamic arrays, but depending on your current understanding, it could really
help you figure out what's going on (be warned though that it does not use
the official terminology; per the language spec, int[] is a dynamic array no
matter what memory it points to, and there is no special term for the
GC-allocated buffer that the dynamic array points to when you use new int[],
whereas that article refers to int[] as a slice and the GC-allocated buffer
that you get with new int[] as the dynamic array; however, aside from that
problem with the terminology, it's a fantastic article and should be quite
enlightening).

- Jonathan M Davis