[Issue 23046] [REG][CODEGEN] __simd(XMM.LODLPS) bad codegen

d-bugmail at puremagic.com d-bugmail at puremagic.com
Mon Apr 25 21:55:20 UTC 2022


https://issues.dlang.org/show_bug.cgi?id=23046

Walter Bright <bugzilla at digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzilla at digitalmars.com

--- Comment #1 from Walter Bright <bugzilla at digitalmars.com> ---
Here's the problem. You've specified the LODLPS instruction (actually MOVLPS):

https://www.felixcloutier.com/x86/movlps

But what was generated was the MOVHLPS instruction:

https://www.felixcloutier.com/x86/movhlps

They both have the same opcode: 0F 12. The two are distinguished by the second
operand: a 64-bit second operand selects MOVLPS, while a 128-bit operand selects
MOVHLPS. The code:

    __simd(XMM.LODLPS, a, *cast(const(__m128)*)mem_addr)

selects MOVHLPS. However, changing it to:

    __simd(XMM.LODLPS, a, *cast(const(long)*)mem_addr)   [1]

doesn't work because core.simd doesn't have that overload. Hence the PR to add
it to core.simd; with that overload available and the change [1], the example
works.
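
To make the difference concrete, here is a minimal sketch in plain D that models
what each instruction does to the destination register. It uses ordinary
float arrays rather than core.simd, and the function names movlps and movhlps
are hypothetical, chosen only to mirror the mnemonics. The values in main are
made up for illustration.

```d
// Plain-D model of the two instructions; no SIMD hardware or core.simd needed.
// Function names are hypothetical and only mirror the instruction mnemonics.

/// MOVLPS xmm, m64: load 64 bits (two floats) from memory into the low half
/// of the destination; the high half is left unchanged.
float[4] movlps(float[4] dst, const float[2] mem)
{
    dst[0] = mem[0];
    dst[1] = mem[1];
    return dst;
}

/// MOVHLPS xmm1, xmm2: copy the high half of the source register into the low
/// half of the destination; the high half of the destination is unchanged.
float[4] movhlps(float[4] dst, const float[4] src)
{
    dst[0] = src[2];
    dst[1] = src[3];
    return dst;
}

void main()
{
    float[4] a    = [1f, 2f, 3f, 4f];
    float[4] data = [10f, 20f, 30f, 40f]; // stands in for 16 bytes at mem_addr

    // Intended behaviour: only the first 8 bytes at mem_addr are used.
    float[2] low = data[0 .. 2];
    assert(movlps(a, low) == [10f, 20f, 3f, 4f]);

    // With a 128-bit operand the other instruction is selected, and the
    // low half of the result comes from the operand's high half instead.
    assert(movhlps(a, data) == [30f, 40f, 3f, 4f]);
}
```

In both cases the upper two lanes of a survive, so a test that only checks the
upper lanes would not notice that the wrong instruction was selected.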

In general, when working with SIMD instructions that change only part of a
register, it pays to look closely at the instruction that is actually
generated.
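
One way to check, as a rough sketch: in the machine encoding, the operand
distinction described above shows up in the ModRM byte that follows 0F 12. A
memory operand (mod field not 11) is MOVLPS, and a register operand (mod field
11) is MOVHLPS. The helper below is hypothetical, not part of dmd or druntime,
and assumes the bytes start at the 0F with no prefix bytes in front (a 66, F2
or F3 prefix would select a different instruction altogether).

```d
// Hypothetical helper: names which of the two instructions a 0F 12 /r byte
// sequence encodes. Assumes no prefix bytes precede the 0F.
string classify0F12(const(ubyte)[] code)
{
    if (code.length < 3 || code[0] != 0x0F || code[1] != 0x12)
        return "other";
    // ModRM.mod is the top two bits; 0b11 means the r/m operand is a register.
    immutable mod = code[2] >> 6;
    return mod == 0b11 ? "MOVHLPS" : "MOVLPS";
}

void main()
{
    // 0F 12 00: ModRM = 00 000 000, i.e. xmm0, [rax] in 64-bit mode (memory)
    assert(classify0F12([0x0F, 0x12, 0x00]) == "MOVLPS");
    // 0F 12 C1: ModRM = 11 000 001, i.e. xmm0, xmm1 (register)
    assert(classify0F12([0x0F, 0x12, 0xC1]) == "MOVHLPS");
}
```

The byte sequences in main are hand-assembled examples, not output taken from
the compiler.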
