toy windowing auto-vec miss
Johan
j at j.nl
Mon Nov 7 23:25:12 UTC 2022
On Monday, 7 November 2022 at 18:14:44 UTC, Bruce Carneal wrote:
> On Monday, 7 November 2022 at 16:49:24 UTC, Johan wrote:
>> On Monday, 7 November 2022 at 01:59:03 UTC, Bruce Carneal
>> wrote:
>>> Here's a simple godbolt example of one of the areas in which
>>> gdc solidly outperforms ldc wrt auto-vectorization: simple
>>> but not trivial operand gather
>>> https://godbolt.org/z/ox1vvxd8s
>>
>> Don't have time to dive deeper but I found that:
>> Removing `@restrict` results in vectorized instructions with
>> LDC (don't know if it is faster, just that they appear in ASM).
>>
>> -Johan
>
> That's very interesting.
>
> This is the first time I've heard of @restrict making things
> worse wrt auto vectorization. From what I've seen in other
> experiments, @restrict provides a minor benefit (code size
> reduction) frequently while occasionally enabling vectorization
> of otherwise complex dependency graphs.
Yeah, this is an LLVM bug.
If you're interested in digging around a bit further, you can
look at how the individual optimization passes change the IR code:
https://godbolt.org/z/e9nqPfeKn
The loop vectorization pass does nothing for the `@restrict` case.
Note that the input to that pass is slightly different: the
`@restrict` case has a more complex forbody.preheader and 3 phi
nodes in the for body (compared to 1 in the non-restrict case).
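
For anyone who wants a small, self-contained pair to compare locally,
here is a rough sketch in C, where `restrict` plays the role of LDC's
`@restrict` (both should end up as `noalias` on the LLVM side, as far
as I know). The 3-tap window sum and the function names are made up
for illustration; this is NOT the code behind the godbolt links, and
it may well not reproduce the missed vectorization.

```c
#include <stddef.h>

/* Hypothetical 3-tap sliding-window sum (not the code from the thread).
 * src must hold at least n + 2 elements; dst must hold n elements. */
void window_sum_restrict(const float *restrict src, float *restrict dst,
                         size_t n)
{
    /* restrict promises the compiler that src and dst do not overlap. */
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] + src[i + 1] + src[i + 2];
}

/* Same loop without the no-alias promise, for side-by-side comparison. */
void window_sum_plain(const float *src, float *dst, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] + src[i + 1] + src[i + 2];
}
```

Building both with clang at -O2 or -O3 and adding
`-Rpass-missed=loop-vectorize` should report any loop the vectorizer
gave up on, which is a quicker first check than reading the per-pass
IR dumps.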
-Johan