toy windowing auto-vec miss
Bruce Carneal
bcarneal at gmail.com
Mon Nov 7 14:59:42 UTC 2022
On Monday, 7 November 2022 at 01:59:03 UTC, Bruce Carneal wrote:
> Here's a simple godbolt example of one of the areas in which
> gdc solidly outperforms ldc wrt auto-vectorization: simple but
> not trivial operand gather
> https://godbolt.org/z/ox1vvxd8s
>
>
> Compile time target adaptive manual __vector-ization is an
> answer here if you have no access to SIMT, so not a show
> stopper, but the code is less readable.
>
> I'm not sure what the data parallel future should look like wrt
> language/IR but I'm pretty sure we can do better than praying
> that the auto vectorizer can dig patterns out of for loops, or
> throwing ourselves on the manual vectorization grenade,
> repeatedly.
My "grenade" phrasing above was fun to write but overly dramatic.
Manual __vector-ization is more tedious than dangerous, and D's
ldc/gdc give you quite a bit of help there, including 1) __vector
types and 2) compile-time (CT) introspection of the maximum
vector length.
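To make those two helpers concrete, here is a minimal sketch and not the code behind the godbolt link. It assumes that `is(__vector(float[N]))` is an acceptable compile-time probe for vector support, which is the form druntime's core.simd uses for its aliases; ldc and gdc may accept wider vector types than the hardware provides natively, so a compiler-specific check may be a better fit there. The names `VF`, `lanes` and `chunkedSum` are hypothetical.

```d
import core.simd;

// Assumption: probe for the widest float vector type at compile time.
// Only two widths are tried, to keep the sketch short.
static if (is(__vector(float[8])))
    alias VF = __vector(float[8]);
else static if (is(__vector(float[4])))
    alias VF = __vector(float[4]);
else
    static assert(0, "no float __vector type on this target");

enum size_t lanes = VF.sizeof / float.sizeof;

// Hypothetical helper: sum a slice, one vector-sized chunk at a time.
float chunkedSum(const(float)[] xs)
{
    VF acc = 0.0f;                       // broadcast 0 to every lane
    size_t i = 0;
    for (; i + lanes <= xs.length; i += lanes)
    {
        VF v;
        v.array[] = xs[i .. i + lanes];  // copy one chunk into the lanes
        acc += v;                        // lane-wise add
    }
    float total = 0.0f;
    foreach (x; acc.array)               // horizontal add of the lanes
        total += x;
    for (; i < xs.length; ++i)           // scalar tail
        total += xs[i];
    return total;
}
```

The loop body is written once against `VF`, so the width follows whatever the probe selects. Note that the lane-wise accumulation adds the floats in a different order than a plain scalar loop would, so rounding can differ.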
Also, auto vectorization *does* work nicely against inputs/outputs
guarded by simple conditions, including and/or combinations.
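As an illustration of the kind of loop meant here, on one reading of "conditioned", the sketch below selects each output element with a single and-combined condition. The function name `keepInRange` and its parameters are hypothetical. Whether a given compiler version actually vectorizes it depends on optimization level and bounds-check settings, so the generated assembly is worth checking.

```d
// Hypothetical helper: copy values inside [lo, hi], write 0 for the rest.
void keepInRange(float[] dst, const(float)[] src, float lo, float hi)
{
    assert(dst.length == src.length);
    foreach (i; 0 .. dst.length)
    {
        // One and-combined condition picks the output element;
        // there is no cross-iteration dependency.
        dst[i] = (src[i] >= lo && src[i] <= hi) ? src[i] : 0.0f;
    }
}
```

Each iteration reads and writes only index `i`, which is the shape an auto vectorizer can usually turn into a compare plus a masked select.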
I believe there is a lot more to be had in the
programmer-friendly-data-parallelism department, perhaps
involving a (major) pivot to MLIR, but I give my considered
thanks to those involved in providing what is already the best
option in that arena from my point of view. Introspection,
__vector, auto-vec, dcompute, ... it's a potent toolkit.