512 bit static array to vector

kinke noone at nowhere.com
Sun Jun 19 16:56:17 UTC 2022


On Sunday, 19 June 2022 at 14:13:45 UTC, Bruce Carneal wrote:
> 1) LDC requires more instructions at 512 bits. At 256 
> (x86-64-v3) they're the same.

I get different results (an actual 512-bit move, not 2x256) with 
`-mattr=avx512bw`. I guess LLVM makes performance assumptions for 
the provided CPU and prefers 256-bit instructions.

The biggest difference from gdc is in the ABI: gdc returns the 
vector directly in an AVX512 register, whereas LDC returns it 
indirectly (sret return - the caller passes a pointer to its 
pre-allocated result). That's a limitation of the frontend's 
https://github.com/dlang/dmd/blob/master/src/dmd/argtypes_sysv_x64.d, 
which supports 256-bit vectors but not 512-bit ones (the SysV ABI 
keeps getting extended for wider vectors...).
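
To make the ABI point concrete, here is a minimal sketch of the kind of function being compared. The names `F16` and `toVector` are made up for illustration and are not taken from the original test case, and the sketch assumes the compiler accepts a 512-bit `__vector` type for the target.

```d
import core.simd;

// Hypothetical alias for a 512-bit vector of 16 floats.
alias F16 = __vector(float[16]);

// Hypothetical conversion function: static array in, vector out.
// Per the explanation above, gdc would hand the result back in an
// AVX512 register, while LDC currently treats it as an sret return,
// i.e. roughly as if the signature were
//     void toVector(F16* result, ref const float[16] a);
F16 toVector(ref const float[16] a)
{
    F16 v;
    v.array = a; // copy the 16 floats into the vector's lanes
    return v;
}
```

Comparing the assembly that each compiler emits for `toVector` should show the difference in how the result is handed back to the caller.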

> 3) LDC fabricates non-HW __vectors so is(someVector) has 
> diminished CT utility.

I consider that useful, e.g., it allows using a `double4` without 
having to consider CPU limitations. I think the compiler should 
expose a trait for the largest supported vector size instead.
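
For context, the compile-time check under discussion looks roughly like the sketch below. `hasVector` is a hypothetical helper, not an existing druntime or compiler trait. With a compiler that only accepts hardware-backed vector types, the result depends on the target. With LDC, going by the quoted point 3, it would be expected to be true even for sizes the CPU lacks, which is why a separate trait for the largest supported size would be needed.

```d
// Hypothetical helper: true if the compiler accepts __vector(T[N])
// as a type for the current target.
enum bool hasVector(T, size_t N) = is(__vector(T[N]));

// Print the results at compile time for a few double vector widths.
pragma(msg, "double[2]: ", hasVector!(double, 2)); // 128-bit
pragma(msg, "double[4]: ", hasVector!(double, 4)); // 256-bit
pragma(msg, "double[8]: ", hasVector!(double, 8)); // 512-bit
```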

