Rather Bizarre slow downs using Complex!float with avx (ldc).
Johan
j at j.nl
Thu Sep 30 16:52:57 UTC 2021
On Thursday, 30 September 2021 at 16:40:03 UTC, james.p.leblanc
wrote:
> D-Ers,
>
> I have been getting counterintuitive results on avx/no-avx timing
> experiments.

This could be a template instantiation culling problem. If the
compiler is able to determine that `Complex!float` is already
instantiated (and code-generated) inside Phobos, then it may decide
not to codegen it again when you are compiling your code with
AVX+fastmath enabled. This could explain why you don't see an
improvement for `Complex!float`, but do see an improvement with
`Complex!double`. It does not, however, explain the worse
performance with AVX+fastmath than without it.

Generally, for performance issues like this you need to study the
assembly output (`--output-s`) or the LLVM IR (`--output-ll`).
The first thing I would look for is whether the relevant functions
are being inlined or not.
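
A rough sketch of that workflow follows. It is an illustration only: the file name `bench.d`, the `mulAcc` kernel, and the exact flag set (`-O3 -mattr=+avx -ffast-math`) are assumptions and are not taken from the original post. The `grep` pattern also assumes x86 `call` mnemonics and that `std.complex` symbols show up with `std7complex` in their mangled names.

```shell
# Hypothetical test kernel; bench.d and mulAcc are illustrative names only.
cat > bench.d <<'EOF'
import std.complex;

// Multiply-accumulate over two equally sized slices of Complex!float.
Complex!float mulAcc(const Complex!float[] a, const Complex!float[] b)
{
    auto acc = Complex!float(0, 0);
    foreach (i; 0 .. a.length)
        acc += a[i] * b[i];
    return acc;
}
EOF

if command -v ldc2 >/dev/null 2>&1; then
    # Build once with AVX + fast-math and once without, emitting assembly.
    ldc2 -c -O3 -mattr=+avx -ffast-math --output-s bench.d && mv bench.s bench_avx.s
    ldc2 -c -O3 --output-s bench.d && mv bench.s bench_noavx.s

    # Count call instructions that target std.complex symbols.
    # A non-zero count suggests the Complex operators were not inlined.
    for f in bench_avx.s bench_noavx.s; do
        printf '%s: ' "$f"
        grep -cE 'call.*std7complex' "$f" || true
    done
else
    echo "ldc2 not found; skipping the assembly comparison"
fi
```

If the AVX build still shows calls into `std.complex` while the plain build does not (or the other way around), that would be consistent with the instantiation being taken from somewhere else rather than being compiled with your flags. Replacing `--output-s` with `--output-ll` gives the LLVM IR for the same comparison.
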
cheers,
Johan