How to tune numerical D? (matrix multiplication is faster in g++ vs gdc)

jerro a at a.com
Mon Mar 4 07:57:41 PST 2013


On Monday, 4 March 2013 at 15:46:50 UTC, jerro wrote:
>> A bit better version:
>> http://codepad.org/jhbYxEgU
>>
>> I think this code is good compared to the original (there are 
>> better algorithms).
>
> You can make it much faster without really changing the 
> algorithm, just by reversing the order of the inner two loops 
> like this:
>
> void matrixMult2(in int[][] m1, in int[][] m2, int[][] m3) pure 
> nothrow {
>     foreach (immutable i; 0 .. m1.length)
>         foreach (immutable k; 0 .. m2[0].length)
>             foreach (immutable j; 0 .. m3[0].length)
>                 m3[i][j] += m1[i][k] * m2[k][j];
> }
>
> This makes the code much more cache friendly (because you no 
> longer iterate over any matrix by column in the inner loop) and 
> also allows the compiler to auto-vectorize. matrixMult2() 
> takes 2.6 seconds on my machine and matrixMul() takes 72 seconds 
> (both compiled with gdmd -O -inline -release -noboundscheck 
> -mavx).
>
> This isn't really relevant to the comparison with C++ in this 
> thread, I just thought it may be useful for anyone writing 
> matrix code.

I forgot to set m3's elements to zero before adding to them. Here is 
the corrected version:

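// Computes m3 = m1 * m2, overwriting whatever m3 held before.
void matrixMult2(in int[][] m1, in int[][] m2, int[][] m3) pure nothrow {
     foreach (immutable i; 0 .. m1.length)
     {
         // Clear the output row, since the loops below accumulate with +=.
         m3[i][] = 0;

         // k runs over the shared dimension (rows of m2), so this
         // also works for non-square matrices.
         foreach (immutable k; 0 .. m2.length)
             // Innermost loop walks m2[k] and m3[i] along a row.
             foreach (immutable j; 0 .. m3[0].length)
                 m3[i][j] += m1[i][k] * m2[k][j];
     }
}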

Zeroing the rows does not make the function noticeably slower, since 
it is only O(n^2) work next to the O(n^3) multiplication.
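
For anyone who wants to check the reordered loop against a plain i-j-k version, here is a minimal sketch. The names matrixMultNaive and makeMatrix are hypothetical helpers made up for this example, and the naive function is only my guess at a typical column-walking loop, not the code from the codepad link. It assumes rectangular matrices with matching dimensions.

```d
import std.stdio : writeln;

// Hypothetical reference version (i-j-k order): the inner loop walks
// m2 by column, which is the cache-unfriendly access pattern.
void matrixMultNaive(in int[][] m1, in int[][] m2, int[][] m3) pure nothrow
{
    foreach (immutable i; 0 .. m1.length)
        foreach (immutable j; 0 .. m2[0].length)
        {
            int sum = 0;
            foreach (immutable k; 0 .. m2.length)
                sum += m1[i][k] * m2[k][j];
            m3[i][j] = sum;
        }
}

// Reordered version (i-k-j order): the inner loop walks m2[k] and
// m3[i] along a row. Each output row is zeroed before accumulating.
void matrixMult2(in int[][] m1, in int[][] m2, int[][] m3) pure nothrow
{
    foreach (immutable i; 0 .. m1.length)
    {
        m3[i][] = 0;
        foreach (immutable k; 0 .. m2.length)
            foreach (immutable j; 0 .. m2[0].length)
                m3[i][j] += m1[i][k] * m2[k][j];
    }
}

// Hypothetical helper: builds a rows x cols matrix with small,
// deterministic values so that the products cannot overflow.
int[][] makeMatrix(size_t rows, size_t cols, int seed)
{
    auto m = new int[][](rows, cols);
    foreach (i; 0 .. rows)
        foreach (j; 0 .. cols)
            m[i][j] = cast(int)((i * 31 + j * 17 + seed) % 10);
    return m;
}

void main()
{
    auto a = makeMatrix(3, 4, 1);   // 3 x 4
    auto b = makeMatrix(4, 2, 2);   // 4 x 2
    auto c1 = new int[][](3, 2);
    auto c2 = new int[][](3, 2);

    // Fill c2 with garbage to show that matrixMult2 clears it first.
    foreach (row; c2)
        row[] = 99;

    matrixMultNaive(a, b, c1);
    matrixMult2(a, b, c2);

    assert(c1 == c2);
    writeln("results match");
}
```

The test matrices are deliberately non-square, so a wrong loop bound on k would show up as a range error or a mismatch instead of passing by accident.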

