Improving dot product for standard multidimensional D arrays
p.shkadzko
p.shkadzko at gmail.com
Sun Mar 1 20:58:42 UTC 2020
Hello again,
Thanks to previous thread on multidimensional arrays, I managed
to play around with pure D matrix representations and even
benchmark a little against numpy:
+-------------------------------------------------------------+------------+-----------+
| benchmark |
time (sec) | vs Numpy |
+-------------------------------------------------------------+------------+-----------+
| Sum of two [5000, 6000] int array of arrays |
~0.28 | x4.5 |
| Multiplication of two [5000, 6000] double array of arrays |
~0.3 | x2.6 |
| Sum of two [5000, 6000] int struct matrices |
~0.039 | x0.6 |
| Multiplication of two [5000, 6000] double struct matrices |
~0.135 | x1.2 |
| L2 norm of [5000, 6000] double struct matrix |
~0.015 | x15 |
| Sort of [5000, 6000] double struct matrix (axis=-1) |
~2.435 | x1.9 |
| Dot product of [500, 600]&[600, 500] double struct matrices |
~0.172 | -- |
+-------------------------------------------------------------+------------+-----------+
However, there is one benchmark I am trying to make at least a
little comparable. That is the dot product of two struct
matrices. Concretely [A x B] @ [B x A] = [A, A] operation.
There is a dotProduct function in std.numeric but it works with
1D ranges only.
After it was clear that array of arrays are not very good to
represent multidimensional data, I used struct to represent a
multidimensional arrays like so:
******************************************************************
struct Matrix(T)
{
T[] data; // keep our data as 1D array and reshape to 2D when
needed
int rows;
int cols;
// allow Matrix[] instead of Matrix.data[]
alias data this;
this(int rows, int cols)
{
this.data = new T[rows * cols];
this.rows = rows;
this.cols = cols;
}
this(int rows, int cols, T[] data)
{
assert(data.length == rows * cols);
this.data = data;
this.rows = rows;
this.cols = cols;
}
T[][] to2D()
{
return this.data.chunks(this.cols).array;
}
/// Allow element 2D indexing, e.g. Matrix[row, col]
T opIndex(in int r, in int c)
{
return this.data[toIdx(this, r, c)];
}
}
pragma(inline) static int toIdx(T)(Matrix!T m, in int i, in int j)
{
return m.cols * i + j;
}
******************************************************************
And here is the dot product function:
******************************************************************
Matrix!T matrixDotProduct(T)(Matrix!T m1, Matrix!T m2)
in
{
assert(m1.rows == m2.cols);
}
do
{
Matrix!T m3 = Matrix!T(m1.rows, m2.cols);
for (int i; i < m1.rows; ++i)
{
for (int j; j < m2.cols; ++j)
{
for (int k; k < m2.rows; ++k)
{
m3.data[toIdx(m3, i, j)] += m1[i, k] * m2[k, j];
}
}
}
return m3;
}
******************************************************************
However, attempting to run dotProduct on two 5000x6000 struct
Matrices took ~20 min while 500x600 only 0.172 sec. And I
wondered if there is something really wrong with the
matrixDotProduct function.
I can see that accessing the appropriate array member in
Matrix.data is costly due to toIdx operation but, I can hardly
explain why it gets so much costly. Maybe there is a better way
to do it after all?
More information about the Digitalmars-d-learn
mailing list