Basic benchmark

Sun Dec 14 18:04:46 PST 2008

Jarrett Billingsley wrote:
> I hope bearophile will eventually understand that DMD is not good at
> optimizing code, and so comparing its output to GCC's is ultimately
> meaningless.

The long arithmetic benchmark is completely (and I mean completely) 
dominated by the time spent in the long divide helper function. The 
timing results for it really have nothing to do with the compiler 
optimizer or code generator. Reducing the number of instructions in the 
loop by one or improving pairing slightly does nothing when stacked up 
against maybe 50 instructions in the long divide helper function.

The long divide helper dmd uses (phobos\internal\llmath.d) is code I 
basically wrote 25 years ago and have hardly looked at since except to 
carry it forward. It uses the classic shift-and-subtract algorithm, but 
there are better ways to do it now with the x86 instruction set.
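
For readers who haven't seen it, here is a rough C sketch of the classic shift-and-subtract method for an unsigned 64-bit divide. It is an illustration only, not the code in llmath.d (which is assembler); the names udiv64_result and udiv64_shift_subtract are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for this sketch: quotient and remainder. */
typedef struct {
    uint64_t quot;
    uint64_t rem;
} udiv64_result;

/* Unsigned 64-bit divide by shift-and-subtract (restoring division).
   Illustrative only; not the routine dmd actually ships. */
udiv64_result udiv64_shift_subtract(uint64_t dividend, uint64_t divisor)
{
    udiv64_result r = {0, 0};

    assert(divisor != 0);   /* division by zero is not handled here */

    /* Walk the dividend from its most significant bit down. */
    for (int i = 63; i >= 0; i--) {
        /* Shift the next dividend bit into the running remainder.
           The remainder holds at most 63 bits before this shift,
           so the shift cannot overflow. */
        r.rem = (r.rem << 1) | ((dividend >> i) & 1);

        /* If the divisor fits, subtract it and set this quotient bit. */
        if (r.rem >= divisor) {
            r.rem -= divisor;
            r.quot |= (uint64_t)1 << i;
        }
    }
    return r;
}
```

The loop always runs 64 times, each pass doing a shift, a compare and possibly a subtract, which gives some idea of why one call to a helper like this can outweigh everything else in a small benchmark loop.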

Time to have some fun doing hand-coded assembler again!

Fixing this should bring that loop timing up to par, but it's still not 
a good benchmark for a code generator. Coming up with good *code 
generator* benchmarks is hard, and really can't be done without looking 
at the assembler output to make sure that what you think is happening is 
what is actually happening.

I've seen a lot of benchmarks over the years, and too many of them end 
up measuring malloc() or printf() speed instead of the loop 
optimizations or whatever else they were intended to measure. Caching 
and alignment issues can also dominate the results.
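
As a small illustration of keeping malloc() and printf() out of the measurement, here is a C sketch that times only the loop. The workload and the names sum_quotients and time_loop_only are hypothetical, clock() is coarse, and none of this replaces checking the assembler output; it only shows where the timer calls go relative to setup and reporting.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical workload: sum of v[i] / d, one long divide per element. */
unsigned long long sum_quotients(const unsigned long long *v, size_t n,
                                 unsigned long long d)
{
    unsigned long long total = 0;

    assert(d != 0);
    for (size_t i = 0; i < n; i++)
        total += v[i] / d;
    return total;
}

/* Times the loop alone. Allocation and initialization happen before
   the clock starts and free() after it stops. Returns CPU seconds, or
   -1.0 if allocation fails; the checksum is stored through *out so the
   caller can print it after timing is finished. */
double time_loop_only(size_t n, unsigned long long d,
                      unsigned long long *out)
{
    unsigned long long *v = malloc(n * sizeof *v);   /* setup, not timed */
    if (v == NULL)
        return -1.0;
    for (size_t i = 0; i < n; i++)
        v[i] = i;

    clock_t start = clock();
    unsigned long long total = sum_quotients(v, n, d);
    clock_t stop = clock();

    free(v);                                         /* teardown, not timed */
    *out = total;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Using the checksum after the timed region gives the compiler less excuse to throw the loop away, but whether it really kept the loop is exactly the kind of thing that has to be confirmed in the generated code.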

I haven't looked closely at the other loop yet.