Slow performance compared to C++, ideas?
Dicebot
m.strashun at gmail.com
Fri May 31 06:05:00 PDT 2013
On Friday, 31 May 2013 at 11:49:05 UTC, Manu wrote:
> I find that using templates actually makes it more likely for the compiler
> to properly inline. But I think the totally generic expressions produce
> cases where the compiler is considering too many possibilities that inhibit
> many optimisations.
> It might also be that the optimisations get a lot more complex when the
> code fragments span across a complex call tree with optimisation
> dependencies on non-deterministic inlining.
>
> One of the most important jobs for the optimiser is code re-ordering.
> Generic code is often written in such a way that makes it hard/impossible
> for the optimiser to reorder the flattened code properly.
> Hand written code can have branches and memory accesses carefully placed at
> the appropriate locations.
> Generic code will usually package those sorts of operations behind little
> templates that often flatten out in a different order.
> The optimiser is rarely able to re-order code across if statements, or
> pointer accesses. __restrict is very important in generic code to allow the
> optimiser to reorder across any indirection, otherwise compilers typically
> have to be conservative and presume that something somewhere may have
> changed the destination of a pointer, and leave the order as the template
> expanded. Sadly, D doesn't even support __restrict, and nobody ever uses it
> in C++ anyway.
>
> I've always had better results with writing precisely what I intend the
> compiler to do, and using __forceinline where it needs a little extra
> encouragement.
Thanks for the valuable input. I have never had the pleasure of actually
trying templates in performance-critical code, and this is good stuff
to remember. Added to my notes.
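
The aliasing point in the quoted text can be sketched in C++ as follows.
This is only an illustration under stated assumptions: the names scale_all,
scale_all_restrict, check_equivalent and the RESTRICT macro are made up for
this example, and __restrict is a compiler extension (GCC, Clang, MSVC), not
standard C++. Whether a given compiler actually generates better code for the
second version depends on the compiler and its flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper macro: __restrict is a vendor extension, so fall back
// to nothing on compilers that are not known to support it.
#if defined(__GNUC__) || defined(__clang__) || defined(_MSC_VER)
#define RESTRICT __restrict
#else
#define RESTRICT
#endif

// No aliasing promise: a store to dst[i] might modify *scale or src[],
// so the compiler may have to reload *scale on every iteration and keep
// loads and stores in the order written.
void scale_all(float* dst, const float* src, const float* scale, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i] * *scale;
}

// Same loop, but the caller promises that the three pointers do not overlap.
// The compiler is then free to hoist the load of *scale out of the loop and
// to reorder or vectorise the memory accesses.
void scale_all_restrict(float* RESTRICT dst, const float* RESTRICT src,
                        const float* RESTRICT scale, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i] * *scale;
}

// Usage sketch: with non-overlapping buffers both versions must agree.
void check_equivalent()
{
    const std::vector<float> src{1.0f, 2.0f, 3.0f, 4.0f};
    const float factor = 2.0f;
    std::vector<float> a(src.size()), b(src.size());

    scale_all(a.data(), src.data(), &factor, src.size());
    scale_all_restrict(b.data(), src.data(), &factor, src.size());

    assert(a == b);
}
```

Passing overlapping pointers to the restrict-qualified version is undefined
behaviour, which is the price of the promise: the programmer takes over the
check the compiler would otherwise have to assume fails.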