[Issue 23046] [REG][CODEGEN] __simd(XMM.LODLPS) bad codegen
d-bugmail at puremagic.com
Mon Apr 25 21:55:20 UTC 2022
https://issues.dlang.org/show_bug.cgi?id=23046
Walter Bright <bugzilla at digitalmars.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |bugzilla at digitalmars.com
--- Comment #1 from Walter Bright <bugzilla at digitalmars.com> ---
Here's the problem. You've specified the LODLPS instruction (actually MOVLPS):
https://www.felixcloutier.com/x86/movlps
But what was generated was the MOVHLPS instruction:
https://www.felixcloutier.com/x86/movhlps
They both have the same opcode: 0F 12. The two are distinguished by the second
operand. A 64-bit second operand selects MOVLPS; a 128-bit second operand
selects MOVHLPS. The code:
__simd(XMM.LODLPS, a, *cast(const(__m128)*)mem_addr)
selects MOVHLPS. However, changing it to:
__simd(XMM.LODLPS, a, *cast(const(long)*)mem_addr) [1]
doesn't work because core.simd doesn't have that overload. Hence the PR to add
it to core.simd; with change [1] applied, the example then works.
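To make the encoding point concrete: in the Intel encoding, both forms are
0F 12 /r, and the ModRM byte's mod field says whether the second operand is a
register (MOVHLPS) or memory (MOVLPS). The following is only a sketch of that
rule in plain D. The names Form and classify0F12 are made up for illustration
and are not part of dmd or core.simd.

```d
// Illustrative only: Form and classify0F12 are hypothetical names.
enum Form { movlps, movhlps }

// Classify an instruction encoded as 0F 12 /r from its ModRM byte.
// Bits 7..6 of ModRM are the mod field; mod == 0b11 means the r/m
// operand is a register, any other value means a memory operand.
Form classify0F12(ubyte modrm) pure nothrow @nogc @safe
{
    immutable mod = modrm >> 6;
    return mod == 3 ? Form.movhlps : Form.movlps;
}

unittest
{
    // 0F 12 C1: mod=11, reg=000, r/m=001, i.e. movhlps xmm0, xmm1
    assert(classify0F12(0xC1) == Form.movhlps);
    // 0F 12 00: mod=00, reg=000, r/m=000, i.e. movlps xmm0, [rax]
    assert(classify0F12(0x00) == Form.movlps);
}
```

This only mirrors how a disassembler would tell the two apart. It says nothing
about how dmd itself picks the form from the operand type.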
In general, when working with SIMD instructions that change only part of a
register, it pays to check closely which instruction is actually generated.
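As a rough illustration of why the mix-up matters, here is a plain-D emulation
of what the two instructions do to a four-float register. It uses no SIMD
types, and the function names emulateMovlps and emulateMovhlps are
hypothetical. Both leave the upper two lanes of the destination alone, but
they take the new lower lanes from different places.

```d
// Illustrative emulation on float[4]; these helpers are hypothetical
// and are not part of core.simd.

// MOVLPS xmm, m64: the low two lanes come from 64 bits of memory.
float[4] emulateMovlps(float[4] dst, const float[2] mem) pure nothrow @nogc @safe
{
    dst[0] = mem[0];
    dst[1] = mem[1];
    return dst; // lanes 2 and 3 are unchanged
}

// MOVHLPS xmm1, xmm2: the low two lanes come from the HIGH two lanes
// of the source register.
float[4] emulateMovhlps(float[4] dst, const float[4] src) pure nothrow @nogc @safe
{
    dst[0] = src[2];
    dst[1] = src[3];
    return dst; // lanes 2 and 3 are unchanged
}

unittest
{
    float[4] a = [1, 2, 3, 4];
    float[4] b = [10, 20, 30, 40];
    float[2] lowOfB = [10, 20];

    float[4] wantLoad = [10, 20, 3, 4];  // intended MOVLPS result
    float[4] wantMove = [30, 40, 3, 4];  // what MOVHLPS produces instead

    assert(emulateMovlps(a, lowOfB) == wantLoad);
    assert(emulateMovhlps(a, b) == wantMove);
}
```

With the same 16 bytes as the source, the intended load picks up its first two
floats while MOVHLPS picks up the last two, so the result is silently wrong
rather than a crash.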