Arrays: Core or Library Type?
Fawzi Mohamed
fmohamed at mac.com
Thu May 29 08:17:30 PDT 2008
On 2008-05-29 03:13:39 +0200, Walter Bright <newshound1 at digitalmars.com> said:
> http://reddit.com/info/6l6i4/comments/
>
> http://dobbscodetalk.com/index.php?option=com_myblog&show=Arrays-Core-or-Library-Type-.html&Itemid=29
a
>
little OT but I am implementing multidimensional dense rectangular
arrays in D and they are coming out quite well.
Obviously they are Library arrays and in (due to effort/usefulness
ratio) I have done only arrays with dynamically chosen size (but
compile time rank).
There is an overhead in this choice, (and for example this arrays are
not suited for 3 (or 4) elements vectors, but an arrays 3D vectors...
works nicely.
For me the performance was important, and so I defined looping
constructs, that could also be parallelized with multiple threads in
the future.
Writing this in D 1.0 I really noted that D 2.0 offers definite
improvements that make me want to move to it: being able to define a
member function outside a class (future), structure destructurs, and a
very nice example of why invariant is useful.
I tested the performance with a sort of dot product on 2D slices of 3D
arrays (a very though test as you do just a multiply add per element),
and using the looping templates I get a performance close to the one of
the optimal loop with pointers, and to the fortran version.
Writing the loop externally with explicit indexing:
for (int i=0;i<ndim;i++)
for (int j=0;j<ndim;j++)
for (int k=0;k<ndim;k++)
res[i]+=d1[i,j,k]*d2[0,j,k];
(that a fortran compiler optimizes close to the optimal one) is 8-9
times slower.
This because even if the indexing operation
double opIndex(int i,int j, int k){
int i1,i2;
i1=startIdx+i*strides[0];
i2=i1+j*strides[1];
return m_data[i2+k*strides[2]];
}
is inlined the assignements i1 and i2 are not floated out of the inner loop.
If strides and startIdx are declared invariant then the compiler knows
that floating them to the outer loops is safe and might do it, bringing
the time down to 2-3 times the optimal code.
Replacing multiplications with additions in the indexes and using
pointers instead of array indexing bring it close to the optimal code.
(
Actually in D 1.0 using just single variables and not an array (I
wasn't able to use a const array) and declaring them const should
allow the compiler to do these optimizations, but at least gdc does not
do it, the code gets ugly, and the templatized version of it is
actually slower than the non const version... but invariant is really
what you want.
)
So with these things I think that an array in D 2.0 can come close the
core one and this a testament on how good D is.
This does not contradict the fact that I think that a core array (at
least for 1D) is a *very* nice thing to have.
Fawzi
More information about the Digitalmars-d
mailing list