Matrix mul

bearophile bearophileHUGS at lycos.com
Sat Nov 22 15:29:30 PST 2008


Andrei Alexandrescu:
> My guess is that if you turn that off, the differences won't be as large
> (or even detectable for certain ranges of N).

Array bounds aren't checked: the code is compiled with -O -release -inline.
Do you see any array bounds checks in the asm code at the bottom of my post?


> Probably blocking will bring even more mileage (but again that depends 
> on N).

Yes, blocking may help, and using SSE instructions may help some more. The end result may be a hundred or more times faster than the naive code in D :-)
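
To make "blocking" concrete, here is a minimal sketch of a tiled multiply next to a naive triple loop. It is not the code benchmarked in this thread. The names matMulNaive and matMulBlocked, the row-major flat layout and the tile size BLOCK = 32 are all assumptions chosen for illustration, and it assumes a current D2 compiler. No timing is shown because the gain depends on N and on the cache sizes of the machine.

```d
import std.algorithm : min;
import std.math : abs;
import std.stdio : writeln;

// Hypothetical tile size; the best value depends on the cache and on N.
enum size_t BLOCK = 32;

// Naive triple loop: c = a * b for square n x n matrices, row-major.
void matMulNaive(const(double)[] a, const(double)[] b, double[] c, size_t n) {
    foreach (i; 0 .. n)
        foreach (j; 0 .. n) {
            double sum = 0.0;
            foreach (k; 0 .. n)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
}

// Blocked (tiled) version: same result, but the work is done tile by tile
// so that the pieces of a, b and c in use are more likely to stay in cache.
void matMulBlocked(const(double)[] a, const(double)[] b, double[] c, size_t n) {
    c[] = 0.0; // new double[] starts as NaN, so clear before accumulating
    for (size_t ii = 0; ii < n; ii += BLOCK)
        for (size_t kk = 0; kk < n; kk += BLOCK)
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                // Clamp tile ends so n need not be a multiple of BLOCK.
                size_t iEnd = min(ii + BLOCK, n);
                size_t kEnd = min(kk + BLOCK, n);
                size_t jEnd = min(jj + BLOCK, n);
                foreach (i; ii .. iEnd)
                    foreach (k; kk .. kEnd) {
                        double aik = a[i * n + k];
                        // Innermost loop walks rows of b and c contiguously.
                        foreach (j; jj .. jEnd)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
}

void main() {
    enum size_t n = 70; // deliberately not a multiple of BLOCK
    auto a = new double[n * n];
    auto b = new double[n * n];
    foreach (idx; 0 .. n * n) {
        a[idx] = (idx % 7) * 0.5;
        b[idx] = (idx % 5) - 2.0;
    }
    auto c1 = new double[n * n];
    auto c2 = new double[n * n];
    matMulNaive(a, b, c1, n);
    matMulBlocked(a, b, c2, n);
    foreach (idx; 0 .. n * n)
        assert(abs(c1[idx] - c2[idx]) < 1e-9);
    writeln("naive and blocked results agree");
}
```

The innermost j loop reads b and c with unit stride, which is also the shape an SSE version would want to vectorize.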

Bye,
bearophile


