X[]==Y[] is 7X slower than it should be -- why?
Don
nospam at nospam.com.au
Fri Jun 20 08:02:25 PDT 2008
I've been doing some timing tests, and found that the built-in array
comparison is amazingly slow.
Given:
uint [2000] X, Y;
bool b = (X[] == Y[]);
on my Pentium M, this takes ~30000 cycles in the case where both arrays
are equal.
It's not hard to write an optimal asm routine; this takes 4000 cycles
(7.5 times faster).
But, it's also possible to perform the comparison in simple D code,
something like:
int compareArrays(uint [] x, uint y[])
{
for (int i=0; i<x.length; ++i) {
if (x[i] != y[i]) return x[i]-y[i];
}
return 0;
}
and this takes 6000 cycles when compiled with -release -O, 5X better
than the built-in version.
So, why is the built-in compare so slow? Using the CPU performance counters:
X[]==Y[] --> 10000 branches, 35000 memory loads, 30000 cycles.
for loop --> 4000 branches, 6000 memory loads, 6000 cycles.
asm --> 4000 branches, 4000 memory loads, 4000 cycles.
What's it doing? Is it doing a byte-by-byte comparison? (it shouldn't do
that even for strings). Is it doing bounds checking for every array
element?? (even with -release -O)
There's significant optimisation potential here.
More information about the Digitalmars-d
mailing list