Slow performance compared to C++, ideas?

Dicebot m.strashun at gmail.com
Fri May 31 06:05:00 PDT 2013


On Friday, 31 May 2013 at 11:49:05 UTC, Manu wrote:
> I find that using templates actually makes it more likely for the compiler
> to properly inline. But I think the totally generic expressions produce
> cases where the compiler is considering too many possibilities that inhibit
> many optimisations.
> It might also be that the optimisations get a lot more complex when the
> code fragments span across a complex call tree with optimisation
> dependencies on non-deterministic inlining.
>
> One of the most important jobs for the optimiser is code re-ordering.
> Generic code is often written in such a way that makes it hard/impossible
> for the optimiser to reorder the flattened code properly.
> Hand written code can have branches and memory accesses carefully placed at
> the appropriate locations.
> Generic code will usually package those sorts of operations behind little
> templates that often flatten out in a different order.
> The optimiser is rarely able to re-order code across if statements, or
> pointer accesses. __restrict is very important in generic code to allow the
> optimiser to reorder across any indirection, otherwise compilers typically
> have to be conservative and presume that something somewhere may have
> changed the destination of a pointer, and leave the order as the template
> expanded. Sadly, D doesn't even support __restrict, and nobody ever uses it
> in C++ anyway.
>
> I've always had better results with writing precisely what I intend the
> compiler to do, and using __forceinline where it needs a little extra
> encouragement.

Thanks for the valuable input. I have never had the pleasure of actually
trying templates in performance-critical code, and this is good stuff to
remember. I have added it to my notes.
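
To make the aliasing point in the quoted text concrete, here is a minimal
C++ sketch. The function names are made up for illustration, and __restrict
is a compiler extension (accepted by GCC, Clang and MSVC), not standard C++.
Whether a given compiler actually hoists the load or vectorises the loop
depends on the compiler and its flags, so treat this as an illustration of
the idea rather than a measured result.

```cpp
#include <cstddef>

// Hypothetical example: scale every element of dst by *scale.
// Without any aliasing promise, the compiler has to assume that a store to
// dst[i] may modify *scale, so it generally reloads *scale on every
// iteration and keeps the loads and stores in program order.
void scale_may_alias(float* dst, const float* scale, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] *= *scale;
}

// Same loop, but __restrict promises that dst and scale do not overlap.
// The compiler is then free to load *scale once before the loop and to
// reorder or vectorise the body. Passing overlapping pointers here is
// undefined behaviour.
void scale_no_alias(float* __restrict dst, const float* __restrict scale,
                    std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] *= *scale;
}
```

The first version stays correct when the pointers overlap, for example when
scale points at dst[0]: later elements are multiplied by the already updated
value. That is exactly the case the compiler must preserve, and it is what
blocks the reordering. The second version gives up that guarantee in
exchange for freedom to optimise.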

