Flat multi-dim arrays.

Thu Aug 10 11:19:10 PDT 2006

Dave wrote:
> 
> What, not even a "that sucks!" from someone <g>
> 
> I'm pretty sure the syntax has been proposed before, but not the same 
> implementation ideas.
> 
> I for one would like syntax like this. I like it w/ C#, and it seems a 
> relatively low hanging and juicy fruit (but Walter would have to tell us 
> how low hanging it is for sure, of course).

Have you read Norbert Nemec's original multi-dim array proposal (you can
find it on Wiki4D)? The key feature is "strided slices", which seem to
be essential for dealing with multi-dim arrays. Something like that
could get most of the fruit on the tree... Norbert's idea isn't
developed quite far enough yet to be truly compelling, though. I think
Walter isn't interested in merely being as good as or better than
C/C++/C#; he wants to Get It Right. If you can develop an idea for
strided arrays which folds neatly into the existing language, that would
be extremely valuable.
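
To make "strided slice" concrete, here is a rough sketch of what I have
in mind. It is my own illustration, not taken from Norbert's proposal:
StridedSlice and column are hypothetical names, and the layout is assumed
to be row-major in one flat buffer. A strided slice is just a pointer, a
length, and a step between elements, which is enough to view a matrix
column without copying it.

```d
import std.stdio;

// Hypothetical type, for illustration only: a view over every
// `stride`-th element of a flat buffer.
struct StridedSlice(T)
{
    T* ptr;         // first element of the view
    size_t length;  // number of elements in the view
    size_t stride;  // distance between consecutive elements, in units of T

    ref T opIndex(size_t i)
    {
        assert(i < length, "index out of bounds");
        return ptr[i * stride];
    }
}

// Column `c` of a rows x cols matrix stored row-major in `data`.
StridedSlice!T column(T)(T[] data, size_t rows, size_t cols, size_t c)
{
    assert(data.length == rows * cols && c < cols);
    return StridedSlice!T(data.ptr + c, rows, cols);
}

void main()
{
    // 3 x 4 matrix, row-major: element (r, c) lives at data[r * 4 + c].
    auto data = new int[](3 * 4);
    foreach (i, ref x; data)
        x = cast(int) i;

    auto col1 = column(data, 3, 4, 1);
    foreach (r; 0 .. col1.length)
        writeln(col1[r]);       // prints 1, 5, 9 on separate lines

    col1[2] = 42;               // writes through to the underlying buffer
    assert(data[2 * 4 + 1] == 42);
}
```

A row is the same thing with stride 1, so ordinary slices fall out as a
special case; that is roughly why strides look like the right primitive.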

> 
> Thanks,
> 
> - Dave
> 
> Dave wrote:
>>
>> A current thread delved into "deep" copy for multidim arrays, but maybe
>> a different approach to MD arrays would a) make things easier to .dup,
>> b) be more efficient and c) perhaps appeal to the numerical computing
>> folks as well?
>>
>> int[,] arr = new int[100,100];
>> int[,,] arr3 = new int[50,10,20];
>> ...
>>
>> - They would be allocated from a single contiguous chunk of memory for
>> speed and ease of .dup'ing and array operations:
>>
>> int[,] ar2 = arr.dup; // int[][] ar2 = alloc2D!(int)(arr.length, arr[0].length);
>>                       // memcpy(ar2[0].ptr, arr[0].ptr, arr.length * arr[0].length * int.sizeof);
>>
>> ar2[] = 10;           // _memset32(ar2[0].ptr, 10, ar2.length * ar2[0].length);
>>
>> foreach(i; ar2) {}    // foreach(i; cast(int[])ar2[0].ptr[0 .. ar2.length * ar2[0].length]) {}
>>
>> (the arrays of T are all allocated from the same contiguous pool of memory).
>>
>> - The compiler frontend would convert operands like arr[10,10] into
>> arr[10][10] to use the already optimized compiler backend code for
>> jagged array access (and array bounds checking). They would also be
>> passed into and out of functions as arr[][]. This is also consistent
>> with the opIndex and opIndexAssign overload syntax, so the new syntax
>> doesn't create inconsistencies with UDT op overloading (it actually
>> seems more consistent, because there isn't a way to overload [][]).
>> Fortran programmers may be more comfortable with the [x,y] syntax
>> too (?)
>>
>> - I tested this with the linpack benchmark and the performance doubled
>> on my machine for a 500 x 500 matrix, just by changing the array
>> allocation from jagged to one flat / contiguous chunk of memory. For
>> smaller arrays the performance is the same or perhaps a little better,
>> especially for native types wider than 32 bits (longs, doubles, reals).
>>
>> - Perhaps leave stuff like array slicing out, at least to start with.
>> But considering that a simple conversion to native jagged arrays is
>> taking place, this shouldn't be really difficult to implement
>> either (?).
>>
>> - No new operators or keywords. These would be a natural extension to
>> D's improved built-in arrays, especially because the memory management
>> is easily handled by the GC.
>>
>> - Maximum rank of 7 for compatibility w/ Fortran.
>>
>> - Would be another reason to switch from C and/or C++ for things like
>> numerical analysis.
>>
>> Thoughts?
>>
>> - Dave
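
For anyone who wants to try the flat layout from the quoted post without
compiler changes, something like the following should be close. It is
only a sketch under my own assumptions: the post names alloc2D but does
not show its body, so the implementation here is a guess, and dupFlat is
a hypothetical helper standing in for the proposed arr.dup. The rows are
ordinary slices into one block, so arr[i][j] keeps working as usual.

```d
import std.stdio;

// Assumed implementation of the alloc2D helper named in the post; the
// original does not show its body. Every row is a slice of one block.
T[][] alloc2D(T)(size_t rows, size_t cols)
{
    auto pool = new T[](rows * cols);    // single contiguous allocation
    auto arr = new T[][](rows);          // row headers only
    foreach (r, ref row; arr)
        row = pool[r * cols .. (r + 1) * cols];
    return arr;
}

// Hypothetical helper: deep copy with one block copy instead of one
// .dup per row. Assumes src was built by alloc2D.
T[][] dupFlat(T)(T[][] src)
{
    if (src.length == 0)
        return null;
    auto copy = alloc2D!T(src.length, src[0].length);
    auto n = src.length * src[0].length;
    copy[0].ptr[0 .. n] = src[0].ptr[0 .. n];
    return copy;
}

void main()
{
    auto arr = alloc2D!int(100, 100);
    arr[10][10] = 7;                     // what arr[10,10] would lower to

    auto ar2 = dupFlat(arr);
    ar2[10][10] = 8;
    writeln(arr[10][10], " ", ar2[10][10]);   // prints "7 8"

    // Whole-array fill through the flat view, as in `ar2[] = 10`.
    auto n = ar2.length * ar2[0].length;
    ar2[0].ptr[0 .. n] = 10;
    assert(ar2[99][99] == 10 && arr[10][10] == 7);
}
```

Note that the flat view through ar2[0].ptr is only valid for arrays that
really came from alloc2D; a built-in type could guarantee that, a helper
function cannot.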