Rather Bizarre slow downs using Complex!float with avx (ldc).

Johan j at j.nl
Thu Sep 30 16:52:57 UTC 2021


On Thursday, 30 September 2021 at 16:40:03 UTC, james.p.leblanc 
wrote:
> D-Ers,
>
> I have been getting counterintuitive results on avx/no-avx 
> timing
> experiments.

This could be a template instantiation culling problem. If the 
compiler determines that `Complex!float` is already instantiated 
(and its code generated) inside Phobos, it may decide not to 
generate code for it again when you compile your own code with 
AVX+fastmath enabled. That would explain why you see no 
improvement for `Complex!float` but do see improvement with 
`Complex!double`. It does not explain the worse performance 
with AVX+fastmath than without it.

Generally, for performance issues like this you need to study 
the assembly output (`--output-s`) or the LLVM IR (`--output-ll`).
The first thing I would check is whether the functions are being 
inlined.
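
A rough sketch of how the two builds could be compared. The file 
name `kernel.d`, the `dot` function and the flag set (`-O3`, 
`-mattr=+avx`, `-ffast-math`) are assumptions for illustration, not 
the original poster's setup; only `--output-s` comes from the 
advice above.

```shell
# Hypothetical test kernel; not the original poster's code.
cat > kernel.d <<'EOF'
import std.complex;

// Sum of element-wise products of two complex arrays.
Complex!float dot(const Complex!float[] a, const Complex!float[] b)
{
    auto acc = Complex!float(0, 0);
    foreach (i; 0 .. a.length)
        acc += a[i] * b[i];
    return acc;
}
EOF

if command -v ldc2 >/dev/null 2>&1; then
    # Baseline build: emit assembly only, no linking.
    ldc2 -O3 --output-s -of=base.s kernel.d
    # Assumed AVX+fastmath build for comparison.
    ldc2 -O3 -mattr=+avx -ffast-math --output-s -of=avx.s kernel.d
    # Count lines mentioning "call" as a crude hint of calls that
    # were not inlined; read the .s files for the real picture.
    for f in base.s avx.s; do
        printf '%s: %s lines mentioning call\n' "$f" \
            "$(grep -c 'call' "$f" || true)"
    done
else
    echo "ldc2 not found; skipping compilation"
fi
```

If `avx.s` still contains calls into `std.complex` symbols inside 
the loop, that would be worth a closer look, since such calls may 
not have been compiled with the AVX+fastmath settings.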

cheers,
   Johan


