[Issue 19443] core.simd generates incorrect code for MOVHLPS
d-bugmail at puremagic.com
Sun Mar 21 06:44:15 UTC 2021
https://issues.dlang.org/show_bug.cgi?id=19443
--- Comment #5 from Walter Bright <bugzilla at digitalmars.com> ---
The MOVHLPS instruction is encoded:
NP 0F 12 /r MOVHLPS xmm1, xmm2
"Moves two packed single-precision floating-point values from the high quadword
of the second XMM argument (second operand) to the low quadword of the first
XMM register (first argument). The quadword at bits 127:64 of the destination
operand is left unchanged. Bits (MAXVL-1:128) of the corresponding destination
register remain unchanged."
The MOVLPS instruction is encoded:
NP 0F 12 /r MOVLPS xmm1, m64
"Moves two packed single-precision floating-point values from the source 64-bit
memory operand and stores them in the low 64-bits of the destination XMM
register. The upper 64bits of the XMM register are preserved. Bits
(MAXVL-1:128) of the corresponding destination register are preserved."
https://www.felixcloutier.com/x86/movlps
https://www.felixcloutier.com/x86/movhlps
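Since both instructions share the opcode bytes 0F 12, the only thing that
tells them apart is the mod field of the ModRM byte: register-direct (mod = 11b)
selects MOVHLPS, any memory form selects MOVLPS. A minimal sketch of that
distinction follows; `classify0F12` is a hypothetical helper written for
illustration and is not part of DMD or core.simd.

```d
// Sketch: classify a "0F 12 /r" encoding by the ModRM mod field.
// classify0F12 is a hypothetical name, not an existing compiler function.
string classify0F12(ubyte modrm)
{
    // ModRM layout: mod (bits 7:6), reg (bits 5:3), r/m (bits 2:0).
    const mod = modrm >> 6;
    // mod == 3 means the r/m field names a register, not memory.
    return mod == 3 ? "MOVHLPS" : "MOVLPS";
}

void main()
{
    // 0xC1 = 11 000 001b: register-direct form, xmm0 <- xmm1.
    assert(classify0F12(0xC1) == "MOVHLPS");
    // 0x01 = 00 000 001b: memory form, xmm0 <- 64 bits at [rcx].
    assert(classify0F12(0x01) == "MOVLPS");
}
```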
Looking at the code:
float4 a = [1.0f, 2.0f, 3.0f, 4.0f];
float4 b = [5.0f, 6.0f, 7.0f, 8.0f];
float4 r = cast(float4) __simd(XMM.MOVHLPS, a, b);
float[4] correct = [7.0f, 8.0f, 3.0f, 4.0f];
assert(r.array == correct); // FAIL, produces [5, 6, 3, 4] instead
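The problem appears to be that the second operand needs to be forced into an
XMM register rather than remaining in memory. Both instructions share the
opcode 0F 12, so with a memory operand the encoding is MOVLPS, which loads the
low quadword of b ([5, 6]) and matches the incorrect result above.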
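To make the two outcomes concrete, here is a plain-array model of the
documented semantics of each instruction. It does not use core.simd or emit
either instruction; `movhlps` and `movlps` are hypothetical helpers that only
mirror the manual's description, under the assumption that a memory operand
for b starts at b[0].

```d
// Plain-array model of the documented behavior; not real SIMD code.
// movhlps and movlps are hypothetical helpers, not core.simd functions.
float[4] movhlps(float[4] dst, float[4] src)
{
    dst[0 .. 2] = src[2 .. 4]; // high quadword of src -> low quadword of dst
    return dst;                // dst[2 .. 4] is left unchanged
}

float[4] movlps(float[4] dst, float[2] m64)
{
    dst[0 .. 2] = m64[];       // 64-bit memory operand -> low quadword of dst
    return dst;                // dst[2 .. 4] is left unchanged
}

void main()
{
    float[4] a = [1.0f, 2.0f, 3.0f, 4.0f];
    float[4] b = [5.0f, 6.0f, 7.0f, 8.0f];

    // Register form: the result the test case expects.
    float[4] expected = [7.0f, 8.0f, 3.0f, 4.0f];
    assert(movhlps(a, b) == expected);

    // Memory form: the first 8 bytes of b are b[0 .. 2], giving the
    // result reported in the failing assert.
    float[2] low = b[0 .. 2];
    float[4] observed = [5.0f, 6.0f, 3.0f, 4.0f];
    assert(movlps(a, low) == observed);
}
```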