Compiler optimizations (I'm baffled)

Bruno Medeiros brunodomedeirosATgmail at SPAM.com
Thu May 4 04:23:03 PDT 2006


Thomas Kuehne wrote:
> 
> Bruno Medeiros schrieb am 2006-05-03:
>> Walter Bright wrote:
>>> Craig Black wrote:
>>>>  This is
>>>> because integer division is essentially floating point division under the
>>>> hood.
>> I ran these tests and got basically the same results (the int division 
>> is slower). I am very intrigued and confused. Can you (or someone else) 
>> briefly explain why this is so?
>> One would think it would be the other way around (float being slower), or 
>> at least the same speed.
> 
> 
> The code doesn't necessarily show that int division is slower than float
> multiplication.
> 
> What CPU are we talking about?
> 
> A naive interpretation of the "benchmark" assumes a single execution
> pipe that does floating point and integer operations in sequence ...
> 
> Even assuming a single pipe: Why is the SSE version faster?
> 
> Does the benchmark measure the speed of int division against float
> multiplication? 
> 
> Does the benchmark measure the throughput of int division against float
> multiplication? 
> 
> Does the benchmark measure the throughput of int division of a set of
> numbers through a constant factor against float multiplication of the
> same set through (1 / constant factor)?
> 
> Thomas
> 
> 
> 

Hm, yes, I should have been more specific. I only ran (a modified 
version of) the latest test, which measures the throughput of int 
division against double division (I hope...).
Here is the code:

#include <stdio.h>
#include <time.h>

//typedef double divtype;   // enable this line (and disable the next) for the double run
typedef int divtype;

int main()
{
    clock_t start = clock();


    divtype result = 0;
    divtype div=1;

    for(int max = 100000000; div < max; div++)
    {
      result = (42 / div);   // line 17: the division being timed
    }


    clock_t finish = clock();
    double duration = (double)(finish - start) / CLOCKS_PER_SEC;
    printf("[%f] %2.2f seconds\n", double(result),duration);
}

------------------------------------
I ran the tests with GCC, with both -O0 and -O2, on an Athlon XP, and in 
both cases the "typedef double divtype" version was about twice as fast. 
The assembly code I get for line 17 (the division inside the loop) is the 
following:

*** INT:

.stabn 68,0,17,LM6-_main
LM6:
	movl	$42, %edx
	movl	%edx, %eax
	sarl	$31, %edx
	idivl	-12(%ebp)
	movl	%eax, -8(%ebp)

*** DOUBLE:

.stabn 68,0,17,LM6-_main
LM6:
	flds	LC0
	fdivs	-12(%ebp)
	fstps	-8(%ebp)


I have little idea what this code is actually doing.

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D


