Inherent code performance advantages of D over C?
ponce
contact at g3mesfrommars.fr
Sun Dec 8 04:35:44 PST 2013
I work on C++ optimization all day and deal closely with the
Intel compiler, so here is what I have to say. I agree with all
the points, but I think 1, 3 and 7 are slightly inaccurate.
> 1. D knows when data is immutable. C has to always make worst
> case assumptions, and assume indirectly accessed data mutates.
ICC (and other C++ compilers) has plenty of ways to disambiguate
aliasing:
- a pragma to let the optimizer assume no loop dependency
- restrict keyword
- /Qalias-const: assumes a parameter of type pointer-to-const
does not alias with a parameter of type pointer-to-non-const.
- GCC-like strict aliasing rule
In most cases I've seen, the "no loop dependency" pragma is
downright spectacular and gives the most bang for the buck.
Every other method is annoying and barely useful in comparison.
It's not clear to me which aliasing rules D assumes.
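To make the first two items of that list concrete, here is a minimal sketch. It assumes the "no loop dependency" pragma is the one ICC spells `#pragma ivdep` (GCC has a similar `#pragma GCC ivdep`), and it uses the `__restrict` extension since standard C++ has no `restrict` keyword. The function names are made up for illustration.

```cpp
#include <cstddef>

// No hint: the compiler has to assume dst and src may overlap,
// which can block or complicate vectorization of the loop.
void scale_add(float* dst, const float* src, float k, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] += k * src[i];
}

// Pointer-level hint: __restrict (a compiler extension in C++) promises
// that dst and src do not overlap for the duration of the call.
void scale_add_restrict(float* __restrict dst, const float* __restrict src,
                        float k, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] += k * src[i];
}

// Loop-level hint: tell the optimizer to ignore assumed dependencies
// between iterations of the loop that follows. Other compilers get no hint.
void scale_add_ivdep(float* dst, const float* src, float k, std::size_t n) {
#if defined(__INTEL_COMPILER)
#pragma ivdep
#elif defined(__GNUC__) && !defined(__clang__)
#pragma GCC ivdep
#endif
    for (std::size_t i = 0; i < n; ++i)
        dst[i] += k * src[i];
}
```

All three compute the same result when the buffers really are distinct; the hints only change what the optimizer is allowed to assume, so calling the hinted versions with overlapping buffers is the caller's bug.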
> 3. Function inlining has generally been shown to be of
> tremendous value in optimization. D has access to all the
> source code in the program, or at least as much as you're
> willing to show it, and can inline across modules. C cannot
> inline functions unless they appear in the same module or in .h
> files. It's a rare practice to push many functions into .h
> files. Of course, there are now linkers that can do whole
> program optimization for C, but those are kind of herculean
> efforts to work around that C limitation of being able to see
> only one module at a time.
This point is not entirely accurate. While the C compilation model
generally gets in the way of inlining, with the Intel C++ compiler you
can absolutely rely on cross-module inlining when doing global
optimization. I don't know how it works, but all our tiny
functions hidden in separate translation units get inlined.
ICC also provides 4 very useful pragmas for optimization:
{forcing|not forcing} inlining [recursively] at the call site
instead of at the definition. I find them better than any
inline/__forceinline at the definition.
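As a rough illustration of the call-site style, the sketch below assumes the pragmas in question are the ones the classic Intel compiler documents as `#pragma inline` and `#pragma forceinline`, each with an optional `recursive`. The functions are hypothetical, and the pragma is guarded so other compilers simply skip it.

```cpp
// A tiny helper that could just as well live in another translation unit.
static int square(int x) { return x * x; }

int sum_of_squares(const int* v, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) {
        // Call-site request: inline the calls made by the next statement,
        // and (with "recursive") the calls those callees make in turn.
#if defined(__INTEL_COMPILER)
#pragma forceinline recursive
#endif
        total += square(v[i]);
    }
    return total;
}
```

The point of the call-site form is that `square` itself stays unannotated: the same function can be force-inlined in one hot loop and left alone everywhere else.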
> 7. D's "final switch" enables more efficient switch code
> generation, because the default doesn't have to be considered.
A good point.
The default: branch can be marked unreachable with most C++
compilers I know of. People rarely do it, though.
In my experience, ICC performs sufficient static analysis to be
able to avoid the switch prelude test. I don't like depending on
that, though: an optimization that hinges on static analysis is
not one you can rely on.
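For reference, marking the default: branch unreachable looks roughly like this. It is a sketch assuming `__builtin_unreachable()` on GCC-compatible compilers and `__assume(0)` on MSVC-compatible ones; the `UNREACHABLE` macro, the `Op` enum and `apply` are made-up names.

```cpp
// Portable-ish "this cannot happen" marker.
#if defined(__GNUC__) || defined(__clang__)
#  define UNREACHABLE() __builtin_unreachable()
#elif defined(_MSC_VER)
#  define UNREACHABLE() __assume(0)
#else
#  define UNREACHABLE() ((void)0)  // no hint available: plain no-op
#endif

enum class Op { Add, Sub, Mul };

int apply(Op op, int a, int b) {
    switch (op) {
        case Op::Add: return a + b;
        case Op::Sub: return a - b;
        case Op::Mul: return a * b;
        // Promise that op is always one of the cases above, so the
        // compiler may drop the range check in front of the jump table.
        default: UNREACHABLE();
    }
    return 0;  // only reached with the no-op fallback; keeps the function well-formed
}
```

Unlike D's "final switch", nothing here checks that every enumerator has a case: if an unlisted value ever reaches the switch, the behavior is undefined.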
Would be amazing to have the ICC backend work with a D front-end
:)
It kicked my ass so many times.