toy windowing auto-vec miss

Johan j at j.nl
Mon Nov 7 23:25:12 UTC 2022


On Monday, 7 November 2022 at 18:14:44 UTC, Bruce Carneal wrote:
> On Monday, 7 November 2022 at 16:49:24 UTC, Johan wrote:
>> On Monday, 7 November 2022 at 01:59:03 UTC, Bruce Carneal 
>> wrote:
>>> Here's a simple godbolt example of one of the areas in which 
>>> gdc solidly outperforms ldc wrt auto-vectorization: simple 
>>> but not trivial operand gather
>>> https://godbolt.org/z/ox1vvxd8s
>>
>> Don't have time to dive deeper but I found that:
>> Removing `@restrict` results in vectorized instructions with 
>> LDC (don't know if it is faster, just that they appear in ASM).
>>
>> -Johan
>
> That's very interesting.
>
> This is the first time I've heard of @restrict making things 
> worse wrt auto vectorization. From what I've seen in other 
> experiments, @restrict provides a minor benefit (code size 
> reduction) frequently while occasionally enabling vectorization 
> of otherwise complex dependency graphs.

Yeah, this is an LLVM bug.

If you're interested in digging around a bit further, you can 
look at how the individual optimization passes change the IR code:
https://godbolt.org/z/e9nqPfeKn

The loop vectorization pass does nothing in the `@restrict` case. 
Note that the input to that pass is slightly different: the 
`@restrict` case has a more complex `forbody.preheader` block and 
3 phi nodes in the loop body (compared to 1 in the non-restrict 
case).

-Johan
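
For readers who want to reproduce the with/without `@restrict` comparison locally, here is a minimal sketch of the kind of loop under discussion. It is not the code behind either godbolt link: the function name `window3`, the three-element window, and the `float` element type are all assumptions chosen for illustration. The only LDC-specific piece is `@restrict` from `ldc.attributes`; the fallback `struct restrict` is a hypothetical no-op stand-in so the file still compiles with other D compilers.

```d
// Hypothetical toy windowing kernel; NOT the code from the godbolt links.
version (LDC)
{
    import ldc.attributes : restrict; // LDC parameter attribute (no-alias hint)
}
else
{
    struct restrict {} // assumption: inert UDA stand-in for non-LDC compilers
}

// Each output element is the sum of a 3-element window of the input.
// `src` must point to at least n + 2 readable elements.
void window3(@restrict const(float)* src, @restrict float* dst, size_t n)
{
    foreach (i; 0 .. n)
        dst[i] = src[i] + src[i + 1] + src[i + 2];
}
```

Compiling this once as written and once with the two `@restrict` annotations removed, then comparing the optimized output, is one way to check whether a given LDC/LLVM version shows the same difference. Whether this particular loop triggers the behaviour described above is not guaranteed.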

More information about the digitalmars-d-ldc mailing list