toy windowing auto-vec miss

Bruce Carneal bcarneal at gmail.com
Mon Nov 7 14:59:42 UTC 2022


On Monday, 7 November 2022 at 01:59:03 UTC, Bruce Carneal wrote:
> Here's a simple godbolt example of one of the areas in which 
> gdc solidly outperforms ldc wrt auto-vectorization: simple but 
> not trivial operand gather
> https://godbolt.org/z/ox1vvxd8s
>
>
> Compile time target adaptive manual __vector-ization is an 
> answer here if you have no access to SIMT, so not a show 
> stopper, but the code is less readable.
>
> I'm not sure what the data parallel future should look like wrt 
> language/IR but I'm pretty sure we can do better than praying 
> that the auto vectorizer can dig patterns out of for loops, or 
> throwing ourselves on the manual vectorization grenade, 
> repeatedly.

My "grenade" phrasing above was fun to write but overly dramatic. 
Manual __vector-ization is more tedious than dangerous, and D's 
ldc/gdc give you quite a bit of help there, including 1) __vector 
types and 2) CT max vector length introspection.
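
A minimal sketch of how those two pieces can combine, not code from the godbolt link. The names `lanes` and `scale` are hypothetical, and the `is(__vector(...))` probe is an assumption about how to ask for the widest usable width: some compilers accept vector types wider than the target supports natively, so the probe may not track the actual hardware everywhere.

```d
// Compile-time probe: widest float vector among the candidates tried here.
// Assumption: is(__vector(float[N])) is false when the type is unsupported.
static if (is(__vector(float[8])))
    enum lanes = 8;
else static if (is(__vector(float[4])))
    enum lanes = 4;
else
    enum lanes = 0; // no vector type accepted: scalar loop only

// Hypothetical example: multiply every element of data by k in place.
void scale(float[] data, float k)
{
    size_t i = 0;
    static if (lanes > 0)
    {
        alias V = __vector(float[lanes]);
        V kv = k; // broadcast the scalar to every lane
        for (; i + lanes <= data.length; i += lanes)
        {
            V v;
            v.array[] = data[i .. i + lanes]; // element copy, no alignment assumed
            v *= kv;
            data[i .. i + lanes] = v.array[];
        }
    }
    for (; i < data.length; ++i) // scalar tail, or everything when lanes == 0
        data[i] *= k;
}

unittest
{
    float[] a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11];
    scale(a, 2.0f);
    assert(a == [2f, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22]);
}
```

The same source adapts to the target at compile time, which is the "tedious, not dangerous" part: the vector body and the scalar tail both have to be written and kept in sync.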

Also, auto vectorization *does* work nicely on simple and/or 
conditioned inputs/outputs.
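
For illustration, here is the kind of loop I read that as covering: contiguous input and output with an and-conditioned select. `keepInRange` is a hypothetical name, and whether a given ldc or gdc version actually vectorizes it depends on optimization flags and on bounds checks being out of the way, so check the emitted assembly rather than taking this on faith.

```d
// Hypothetical example: keep values inside [lo, hi], zero the rest.
// The condition is a plain "and" of two compares on the same element,
// which an optimizer can typically lower to vector compares plus a blend.
void keepInRange(const(float)[] src, float[] dst, float lo, float hi)
{
    assert(src.length == dst.length);
    foreach (i; 0 .. src.length)
    {
        immutable x = src[i];
        dst[i] = (x >= lo && x <= hi) ? x : 0.0f;
    }
}

unittest
{
    float[] s = [-1, 0.5, 2, 3];
    auto d = new float[](s.length);
    keepInRange(s, d, 0.0f, 2.0f);
    assert(d == [0f, 0.5f, 2f, 0f]);
}
```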

I believe there is a lot more to be had in the 
programmer-friendly-data-parallelism department, perhaps 
involving a (major) pivot to MLIR, but I give my considered 
thanks to those involved in providing what is already the best 
option in that arena from my point of view.  Introspection, 
__vector, auto-vec, dcompute, ... it's a potent toolkit.




More information about the digitalmars-d-ldc mailing list