Inherent code performance advantages of D over C?

ponce contact at g3mesfrommars.fr
Sun Dec 8 04:35:44 PST 2013


I work all day on C++ optimization and deal closely with the 
Intel compiler, so here is what I have to say. I agree with all 
the points, but I think 1, 3 and 7 are slightly inaccurate.

> 1. D knows when data is immutable. C has to always make worst 
> case assumptions, and assume indirectly accessed data mutates.

ICC (and other C++ compilers) have plenty of ways to disambiguate 
aliasing:
- a pragma to let the optimizer assume no loop dependency
- restrict keyword
- /Qalias-const: assumes a parameter of type pointer-to-const 
does not alias with a parameter of type pointer-to-non-const.
- GCC-like strict aliasing rule

In most cases I've seen, the "no loop dependency" pragma is 
downright spectacular and gives the most bang for the buck. 
Every other method is annoying and barely useful in comparison.
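
To make the first two items of the list concrete, here is a minimal 
sketch. It assumes the "no loop dependency" pragma is ICC's 
#pragma ivdep (GCC has a similar #pragma GCC ivdep), and it uses the 
__restrict extension, since C++ has no standard restrict keyword. 
The function names scale and scale_ivdep are made up for the example.

```cpp
#include <cstddef>

// Hypothetical helper: writes src[i] * factor into dst[i].
// __restrict is a compiler extension (GCC, Clang, MSVC, ICC) that
// promises dst and src never overlap, so the loop can be vectorized
// without a runtime overlap check.
void scale(float* __restrict dst, const float* __restrict src,
           float factor, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i] * factor;
}

// Same loop with plain pointers. The pragma tells the optimizer to
// assume there is no loop-carried dependency. Compilers that match
// neither branch simply see an ordinary loop.
void scale_ivdep(float* dst, const float* src, float factor,
                 std::size_t n)
{
#if defined(__INTEL_COMPILER)
#pragma ivdep
#elif defined(__GNUC__) && !defined(__clang__)
#pragma GCC ivdep
#endif
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i] * factor;
}
```

Both versions compute the same result. The only difference is where 
the promise is made: in the signature for restrict, at the loop for 
the pragma. In both cases the promise is unchecked, so calling these 
with overlapping buffers is on you.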

It's not clear to me which aliasing rules D assumes.

> 3. Function inlining has generally been shown to be of 
> tremendous value in optimization. D has access to all the 
> source code in the program, or at least as much as you're 
> willing to show it, and can inline across modules. C cannot 
> inline functions unless they appear in the same module or in .h 
> files. It's a rare practice to push many functions into .h 
> files. Of course, there are now linkers that can do whole 
> program optimization for C, but those are kind of herculean 
> efforts to work around that C limitation of being able to see 
> only one module at a time.

This point is not entirely accurate. While the C compilation model 
generally works against inlining, with the Intel C++ compiler you 
can absolutely rely on cross-module inlining when doing global 
optimization. I don't know how it works, but all our tiny 
functions hidden in separate translation units get inlined.
ICC also provides 4 very useful pragmas for optimization: 
{forcing|not forcing} inlining [recursively] at the call site 
instead of at the definition. I find them better than any 
inline/__forceinline at the definition.
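
A rough sketch of what call-site inlining control looks like. It 
assumes the pragmas meant here are ICC's #pragma inline and 
#pragma forceinline with their optional "recursive" form, which apply 
to the calls in the statement that follows. The functions 
clamp_to_byte and brighten are hypothetical, and the pragma is 
guarded so other compilers just ignore it.

```cpp
// Hypothetical tiny function. In a real code base it would typically
// sit in another translation unit; it is kept here so the example is
// self-contained.
static int clamp_to_byte(int v)
{
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

int brighten(int pixel, int amount)
{
    int result;
    // The request is attached to this one call, not to the definition
    // of clamp_to_byte, so other callers are unaffected.
#if defined(__INTEL_COMPILER)
#pragma forceinline recursive
#endif
    result = clamp_to_byte(pixel + amount);
    return result;
}
```

The appeal is that the decision sits in the hot loop where you 
profiled, rather than being imposed on every caller of the function.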

> 7. D's "final switch" enables more efficient switch code 
> generation, because the default doesn't have to be considered.

A good point.
The default: branch can be marked unreachable with most C++ 
compilers I know of. People don't do it though.
In my experience, ICC performs sufficient static analysis to be 
able to avoid the switch prelude test. I don't like relying on 
that, though, since an optimization that depends on what the 
analysis manages to prove is not a reliable one.
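
For reference, marking the default: branch unreachable can look like 
the sketch below. The UNREACHABLE macro, the Shape enum and 
corner_count are names invented for the example. __assume(0) is the 
MSVC-style spelling and __builtin_unreachable() the GCC/Clang one.

```cpp
enum class Shape { Circle, Square, Triangle };

// Hypothetical portability macro wrapping the compiler intrinsics.
#if defined(_MSC_VER)
#define UNREACHABLE() __assume(0)
#elif defined(__GNUC__)  // GCC and Clang
#define UNREACHABLE() __builtin_unreachable()
#else
#define UNREACHABLE() ((void)0)  // no hint available
#endif

int corner_count(Shape s)
{
    switch (s) {
    case Shape::Circle:   return 0;
    case Shape::Square:   return 4;
    case Shape::Triangle: return 3;
    // Promise to the compiler that no other value ever gets here, so
    // it may drop the range check in front of the jump table.
    default:              UNREACHABLE();
    }
    return -1;  // only reachable with the no-hint fallback macro
}
```

Unlike D's final switch, nothing verifies the promise: passing a 
value outside the enum is undefined behavior rather than a compile 
error.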

Would be amazing to have the ICC backend work with a D front-end 
:)
It kicked my ass so many times.


More information about the Digitalmars-d mailing list