Matrix mul
bearophile
bearophileHUGS at lycos.com
Sat Nov 22 15:29:30 PST 2008
Andrei Alexandrescu:
> My guess is that if you turn that off, the differences won't be as large
> (or even detectable for certain ranges of N).
The array bounds aren't checked; the code is compiled with -O -release -inline.
Do you see any array bounds checks in the asm code at the bottom of my post?
> Probably blocking will bring even more mileage (but again that depends
> on N).
Yes, blocking may help, and using SSE instructions may help some more. The end result may be a hundred or more times faster than the naive code in D :-)
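To make "blocking" concrete, here is a rough sketch of the idea, written in C because the loops translate to D almost line for line. It is not the code from my earlier post. The names matmul_naive, matmul_blocked and the tile size BLOCK = 32 are just assumptions for illustration, and the best tile size depends on the cache and on N. The blocked version walks the matrices in small tiles, so the parts of A, B and C being worked on have a better chance of staying in cache:

```c
#include <stddef.h>

/* Tile edge length. 32 is an assumed starting point, not a tuned value. */
#define BLOCK 32

/* Naive triple loop: c = a * b, all n x n, stored row-major. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
}

static size_t min_sz(size_t x, size_t y)
{
    return x < y ? x : y;
}

/* Blocked (tiled) version: same result, computed one tile at a time. */
void matmul_blocked(size_t n, const double *a, const double *b, double *c)
{
    /* c is accumulated into, so it must start at zero. */
    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0;

    for (size_t ii = 0; ii < n; ii += BLOCK)
        for (size_t kk = 0; kk < n; kk += BLOCK)
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                /* Clamp tile ends so n need not be a multiple of BLOCK. */
                size_t i_end = min_sz(ii + BLOCK, n);
                size_t k_end = min_sz(kk + BLOCK, n);
                size_t j_end = min_sz(jj + BLOCK, n);

                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++) {
                        double aik = a[i * n + k];
                        /* Innermost loop runs along rows of b and c,
                           so memory is accessed contiguously. */
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
}
```

The contiguous inner loop over j is also the natural place to try SSE later. How much any of this gains has to be measured for each N.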
Bye,
bearophile