auto vectorization of interleaves
Bruce Carneal
bcarneal at gmail.com
Mon Jan 10 20:45:28 UTC 2022
On Monday, 10 January 2022 at 19:21:06 UTC, Johan wrote:
> On Monday, 10 January 2022 at 03:04:22 UTC, Bruce Carneal wrote:
>> ...
>
> The compiler complains about aliasing when optimizing.
> https://d.godbolt.org/z/hnGj3G3zo
>
> For example, the write to `dst[0]` may alias with `s1[i]` so
> `s1[i]` needs to be reloaded. I think the problem gets worse
> with 16-bit numbers because they may partially overlap? (8 bits
> of dst[0] overlap with s1[i]) Just a guess of why the lookup
> tables `.LCPI0_x` are generated...
>
> -Johan
Thanks for the clarification, Johan. For some reason I was
thinking of FORTRAN's aliasing assumptions (arguments are assumed
not to overlap) rather than C's. LDC's @restrict attribute looks
useful.

That said, for my current work the performance predictability of
explicit __vector forms has come to outweigh the convenience of
auto-vectorization. The unit testing is more laborious, but the
performance is easier to understand.

Thanks again for your answers and patience. I'm pretty sure I'll
be using @restrict in other projects.
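
For readers following the aliasing point, here is a minimal sketch in C of the kind of loop under discussion. The code from the thread is elided above, so the function names, the 16-bit element type, and the two-source interleave layout are all assumptions made for illustration. C's `restrict` qualifier stands in for LDC's @restrict; both tell the optimizer that the pointers do not overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interleave, not the code from the thread.
   Without restrict, the compiler must assume dst may overlap s1 or s2,
   so every store to dst could change a value it still has to load. */
void interleave_may_alias(int16_t *dst, const int16_t *s1,
                          const int16_t *s2, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        dst[2 * i]     = s1[i];
        dst[2 * i + 1] = s2[i];
    }
}

/* Same loop, but the caller promises that dst, s1 and s2 do not overlap.
   The optimizer is then free to load a block of s1 and s2 up front and
   emit vector interleave instructions. Calling this with overlapping
   buffers is undefined behavior. */
void interleave_restrict(int16_t *restrict dst,
                         const int16_t *restrict s1,
                         const int16_t *restrict s2, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        dst[2 * i]     = s1[i];
        dst[2 * i + 1] = s2[i];
    }
}
```

Both functions produce the same result for non-overlapping buffers; only the freedom given to the optimizer differs. Whether a given compiler actually vectorizes the second form depends on the target and flags, so it is worth checking the generated assembly.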