Compiler optimizations (I'm baffled)
Bruno Medeiros
brunodomedeirosATgmail at SPAM.com
Thu May 4 04:23:03 PDT 2006
Thomas Kuehne wrote:
> Bruno Medeiros wrote on 2006-05-03:
>> Walter Bright wrote:
>>> Craig Black wrote:
>>>> This is
>>>> because integer division is essentially floating point division under the
>>>> hood.
>> I ran these tests and I got basically the same results (the int division
>> is slower). I am very intrigued and confused. Can you (or someone else)
>> explain briefly why this is so?
>> One would think it would be the other way around (float being slower) or
>> at least the same speed.
>
>
> The code doesn't necessarily show that int division is slower than float
> multiplication.
>
> What CPU are we talking about?
>
> A naive interpretation of the "benchmark" assumes a single execution
> pipe that does floating point and integer operations in sequence ...
>
> Even assuming a single pipe: Why is the SSE version faster?
>
> Does the benchmark measure the speed of int division against float
> multiplication?
>
> Does the benchmark measure the throughput of int division against float
> multiplication?
>
> Does the benchmark measure the throughput of int division of a set of
> numbers through a constant factor against float multiplication of the
> same set through (1 / constant factor)?
>
> Thomas
Hum, yes, I should have been more specific. I only ran (a modified
version of) the latest test, which measured the throughput of int
division against double division (I hope...).
Let me just post the code:
#include <stdio.h>
#include <time.h>

//typedef double divtype;
typedef int divtype;

int main()
{
    clock_t start = clock();
    divtype result = 0;
    divtype div=1;
    for(int max = 100000000; div < max; div++)
    {
        result = (42 / div);
    }
    clock_t finish = clock();
    double duration = (double)(finish - start) / CLOCKS_PER_SEC;
    printf("[%f] %2.2f seconds\n", double(result),duration);
}
------------------------------------
I ran the tests with GCC, with both -O0 and -O2, on an Athlon XP, and in
both cases the "typedef double divtype" version was about twice as fast.
The assembly code I get for line 17 (the "result = (42 / div);" statement)
is the following:
*** INT:
    .stabn 68,0,17,LM6-_main
LM6:
    movl $42, %edx
    movl %edx, %eax
    sarl $31, %edx
    idivl -12(%ebp)
    movl %eax, -8(%ebp)

*** DOUBLE:
    .stabn 68,0,17,LM6-_main
LM6:
    flds LC0
    fdivs -12(%ebp)
    fstps -8(%ebp)
I have little idea what it is that it's doing.
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D