D speed compared to C++

Tue Mar 18 17:45:00 PDT 2008

1) Using a float or double as an incrementor in a tight loop is a bad idea.  Most compilers optimize it out where possible, and so do Agner Fog and Paul Hsieh.  They know why that is better than I do.
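
A minimal C++ sketch of point 1, with made-up function names.  The first loop is driven by a float counter; the second keeps an integer counter and derives the float value from it.  Besides the extra floating-point add and compare per iteration, the float counter accumulates rounding error, so the trip count depends on how the sums happen to round.

```cpp
// Hypothetical example: a loop driven by a float counter.
// 0.1f is not exactly representable, so x drifts as it is incremented
// and the number of iterations depends on rounding.
int count_float_steps() {
    int n = 0;
    for (float x = 0.0f; x < 1.0f; x += 0.1f) {
        ++n;
    }
    return n;
}

// Same intent with an integer counter: the trip count is exact, and the
// float value is computed from the counter instead of being accumulated.
int count_int_steps(float* last) {
    int n = 0;
    for (int i = 0; i < 10; ++i) {
        *last = i * 0.1f;
        ++n;
    }
    return n;
}
```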

2) Most compilers optimize stuff out if it doesn't directly affect output, external functions, or arguments.  This is usually done on a per-function level.  A better optimizer would do it for the whole program.
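
To illustrate point 2, here is a small C++ sketch (all names are hypothetical).  In the first wrapper the result is thrown away, so an optimizer is free to delete the call and the loop, and a benchmark built on it may end up timing nothing.  Storing the result in a volatile is one common way to make the result count as observable; the compiler may still simplify how it gets there.

```cpp
#include <cstdint>

// Hypothetical benchmark body: sums the integers 0 .. n-1.
std::uint64_t sum_to(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        total += i;
    }
    return total;
}

// The result is discarded, so nothing observable depends on the loop.
// An optimizer is allowed to remove the whole computation.
void timed_wrong(std::uint64_t n) {
    (void)sum_to(n);
}

// A volatile store is an observable side effect, so the value has to be
// produced.  The loop itself may still be rewritten into something cheaper.
volatile std::uint64_t sink;

void timed_better(std::uint64_t n) {
    sink = sum_to(n);
}
```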

3) Startup for D is slower even for hello world because D statically links the entirety of Phobos and the GC even if you never use them.  This equates to about 80 KB of bloat - so it's still dramatically better than Java or C#, but still not "correct".

4) If the GC does a collection cycle, it'll bump the run time (not the algorithmic complexity, just the measured time).  From the program's point of view this happens pseudo-randomly.

~~~

If you really want to improve performance in C or C++, do it by profiling your program and optimizing the parts where speed actually matters.

- simplify
- remove unnecessary loops
- hoist stuff out of loops as much as possible
- iterate or recurse in ways that ease cache miss penalties
- iterate instead of recurse as much as possible
- reduce branching (if/else if/else, ||, &&) as much as possible
- multiply by inverse instead of divide where possible
- reduce calls to the OS where it's sensible
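
A few of the items above, sketched in C++ under stated assumptions (the function names and the flat row-major matrix layout are invented for illustration).  The first pair shows hoisting loop-invariant work and multiplying by an inverse instead of dividing; note that multiplying by a reciprocal can differ from a true division in the last bit.  The last function walks a matrix in the order it is laid out in memory, which is the kind of iteration that eases cache miss penalties.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Before: the square root and the division are redone on every iteration.
void normalize_slow(std::vector<double>& v, double scale) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        v[i] = v[i] / std::sqrt(scale);
    }
}

// After: the invariant is hoisted out of the loop and the division is
// replaced by a multiplication with the precomputed inverse.
void normalize_fast(std::vector<double>& v, double scale) {
    const double inv = 1.0 / std::sqrt(scale);
    for (double& x : v) {
        x *= inv;
    }
}

// Sum of a rows x cols matrix stored row-major in one flat vector.
// The inner loop touches consecutive addresses, so each cache line that
// is fetched gets fully used before moving on.
double sum_row_major(const std::vector<double>& m,
                     std::size_t rows, std::size_t cols) {
    double total = 0.0;
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            total += m[r * cols + c];
        }
    }
    return total;
}
```

Whether any of this pays off still depends on what the profiler says; an optimizing compiler may already do the hoisting on its own.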

If you need to go further, learn assembler.  D's inline one ain't half bad.  You can do things in assembler that you can't do in HLLs.  Things like ror, rcl, sete, cmovcc, prefetchnta, clever MMX/XMM usage and such.

/me is looking forward to when MMX/XMM has byte-array functionality.  It could outperform all x86-32 string stuff by an order of magnitude.