512 bit static array to vector
kinke
noone at nowhere.com
Sun Jun 19 16:56:17 UTC 2022
On Sunday, 19 June 2022 at 14:13:45 UTC, Bruce Carneal wrote:
> 1) LDC requires more instructions at 512 bits. At 256
> (x86-64-v3) they're the same.
I get different results (an actual 512-bit move, not 2x256) with
`-mattr=avx512bw`. I guess LLVM makes performance assumptions for
the provided CPU and prefers 256-bit instructions.
The biggest difference from gdc is an ABI difference: gdc returns
the vector directly in an AVX512 register, whereas LDC returns it
indirectly (sret return: the caller passes a pointer to its
pre-allocated result). That's a limitation of the frontend's
https://github.com/dlang/dmd/blob/master/src/dmd/argtypes_sysv_x64.d,
which supports 256-bit vectors but not 512-bit ones (the SysV ABI
keeps getting extended for broader vectors...).
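
To make the two return conventions concrete, here is a rough sketch at the source level. The names `float16`, `addDirect` and `addViaSret` are made up for illustration, and the second function only approximates what an sret return amounts to; it is not what the compiler literally emits. It assumes a compiler that accepts 512-bit vector types (LDC or gdc).

```d
import core.simd;

// Hypothetical alias for a 512-bit vector of 16 floats.
alias float16 = __vector(float[16]);

// Source-level function returning the vector by value. With a direct
// return, the result comes back in a vector register; with an sret
// return, the compiler rewrites it roughly into the shape shown below.
float16 addDirect(float16 a, float16 b)
{
    return a + b;
}

// Approximation of the sret shape: the caller allocates the result and
// passes a pointer to it as a hidden first argument.
void addViaSret(float16* result, float16 a, float16 b)
{
    *result = a + b;
}

void main()
{
    float16 a, b, r;
    r = addDirect(a, b);
    addViaSret(&r, a, b);
}
```

Comparing the generated assembly of `addDirect` between the two compilers should show the extra store through the result pointer on the sret side.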
> 3) LDC fabricates non-HW __vectors so is(someVector) has
> diminished CT utility.
I consider that useful, e.g., it allows using a `double4` without
having to consider CPU limitations. I think the compiler should
expose a trait for the largest supported vector size instead.
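
For reference, the compile-time check under discussion looks roughly like the sketch below. The name `hasDouble4` is made up for illustration. As I understand it, with DMD the `is()` test only succeeds when the target supports that vector type, whereas with LDC it succeeds regardless of the target CPU, which is why it says little about the hardware there.

```d
import core.simd;

// is(__vector(double[4])) is true if the compiler accepts the vector type.
// Assumption: on DMD that depends on the target CPU; on LDC it does not.
static if (is(__vector(double[4])))
    enum hasDouble4 = true;
else
    enum hasDouble4 = false;

// Printed at compile time.
pragma(msg, "double4 available: ", hasDouble4);

void main() {}
```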