Trivial benchmarking on linux
Georg Wrede
georg.wrede at iki.fi
Mon Mar 9 11:28:06 PDT 2009
Inspired by recent benchmarking posts, I decided to do a little
benchmarking, too: comparing looping up with looping down.
import std.conv;   // needed for to!(long)

void main(char[][] args)
{
    auto count = to!(long)(args[1]);   // iteration count from the command line
    for(long i = 0; i < count; i++) { /* do nothing */ }
}
I wanted to know if it makes a difference if the loop counts backwards,
so the other program had the following line instead:
for(long i = count; i > 0; --i)
So I compiled:
$ dmd loop.d
$ dmd loopv.d
Here I stored the program name for later use before running the
benchmark (real handy because you probably end up with several versions
of your program):
$ p=loop
$ rm -f $p.bench;for a in {1..30} ; do
> (time $p 100000000) 2>> $p.bench ; done
To test the other program, I changed p to the other program's name, and
then I simply pressed up-arrow so I got back the long command doing the
benchmarking.
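The store-the-name-and-rerun routine above can also be wrapped in a small shell function, so nothing has to be edited between runs. This is only a sketch, assuming bash (it relies on the `time` keyword and `local`). The name `bench` and its argument order are made up for illustration.

```shell
# Hypothetical helper: run a command a number of times and append the
# report of bash's `time` keyword (written to stderr) to <name>.bench.
# usage: bench <name> <runs> <command> [args...]
bench() {
    local name=$1 runs=$2 i
    shift 2
    rm -f "$name.bench"
    for ((i = 0; i < runs; i++)); do
        # The brace group lets the redirect capture the timing report;
        # the program's own stdout is discarded.
        { time "$@" > /dev/null; } 2>> "$name.bench"
    done
}

# Example (not run here): bench loop 30 ./loop 100000000
```

As in the original command, anything the program itself prints to stderr also ends up in the .bench file.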
To see the best result of the 30 test runs, I wrote:
$ grep real loop.bench | sort | head -1
real 0m0.337s
$ grep real loopv.bench | sort | head -1
real 0m0.316s
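Note that `sort` compares the lines as text, so the pipeline above picks the right line only while all times have the same number of digits (as text, 0m10.5s sorts before 0m9.5s). A minimal sketch of a numeric alternative, assuming the `real NmS.SSSs` format that bash's `time` prints. The function name `best_real` is hypothetical.

```shell
# Hypothetical helper: print the smallest "real" time found in a .bench
# file, converted from the NmS.SSSs format to seconds.
best_real() {
    awk '$1 == "real" {
             split($2, t, /[ms]/)        # "0m0.337s" -> t[1]="0", t[2]="0.337"
             s = t[1] * 60 + t[2]
             if (!seen || s < best) { best = s; seen = 1 }
         }
         END { if (seen) printf "%.3f\n", best }' "$1"
}

# Example (not run here): best_real loop.bench
```

With the two figures above, the relative difference is (0.337 - 0.316) / 0.337, or roughly 6%.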
As I expected, counting backwards was faster, but by less than I had
anticipated: about 6% in the best runs (0.316 s versus 0.337 s). I also
ran the same benchmark with the for-loop counting 10x longer, and got
similar results.
Then I got curious as to what the difference between these programs
really was, and decided to take a look:
$ objdump -d loop.o > loop.asm
$ objdump -d loopv.o > loopv.asm
$ diff loop.asm loopv.asm
35,54c35,48
< 29: 89 45 f0 mov %eax,-0x10(%ebp)
< 2c: 89 55 f4 mov %edx,-0xc(%ebp)
< 2f: c7 45 f8 00 00 00 00 movl $0x0,-0x8(%ebp)
< 36: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%ebp)
< 3d: 8b 55 fc mov -0x4(%ebp),%edx
< 40: 8b 45 f8 mov -0x8(%ebp),%eax
< 43: 3b 55 f4 cmp -0xc(%ebp),%edx
< 46: 7f 11 jg 59 <_Dmain+0x59>
< 48: 7c 05 jl 4f <_Dmain+0x4f>
< 4a: 3b 45 f0 cmp -0x10(%ebp),%eax
< 4d: 73 0a jae 59 <_Dmain+0x59>
< 4f: 83 45 f8 01 addl $0x1,-0x8(%ebp)
< 53: 83 55 fc 00 adcl $0x0,-0x4(%ebp)
< 57: eb e4 jmp 3d <_Dmain+0x3d>
< 59: 31 c0 xor %eax,%eax
< 5b: c9 leave
< 5c: c3 ret
< 5d: 90 nop
< 5e: 90 nop
< 5f: 90 nop
---
> 29: 89 45 f8 mov %eax,-0x8(%ebp)
> 2c: 89 55 fc mov %edx,-0x4(%ebp)
> 2f: 83 7d fc 00 cmpl $0x0,-0x4(%ebp)
> 33: 7c 12 jl 47 <_Dmain+0x47>
> 35: 7f 06 jg 3d <_Dmain+0x3d>
> 37: 83 7d f8 00 cmpl $0x0,-0x8(%ebp)
> 3b: 76 0a jbe 47 <_Dmain+0x47>
> 3d: 83 6d f8 01 subl $0x1,-0x8(%ebp)
> 41: 83 5d fc 00 sbbl $0x0,-0x4(%ebp)
> 45: eb e8 jmp 2f <_Dmain+0x2f>
> 47: 31 c0 xor %eax,%eax
> 49: c9 leave
> 4a: c3 ret
> 4b: 90 nop
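One sees that they are quite different. The count-up loop loads both
halves of i into registers and compares them against count in memory
(10 instructions per iteration), while the count-down loop compares i
directly against zero (8 instructions per iteration). (I bet Walter has
some interesting commentary on this.)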
Concluding remarks:
It is almost trivial to do such benchmarking on linux. Also, changing a
program in just one place keeps the diff output small enough so that one
can easily see what actually changes in the compiled program.
The program `objdump' is standard on linux. (One can also use the D
utilities, but I happened to use objdump here.)
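Because the address and raw-byte columns change whenever an instruction moves or changes length, one might also diff only the instruction column. This is a rough sketch, assuming the tab-separated layout that GNU `objdump -d` prints (address, bytes, instruction). The filter name `insns` is made up, and jump targets will still show absolute addresses.

```shell
# Hypothetical filter: keep only the instruction text from `objdump -d`
# output, dropping headers and the address and raw-byte columns.
# Reads the named files, or stdin when no file is given.
insns() {
    awk -F'\t' 'NF >= 3 { print $3 }' "$@"
}

# Example (bash, not run here):
# diff <(objdump -d loop.o | insns) <(objdump -d loopv.o | insns)
```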