[Issue 23049] [SIMD][CODEGEN] Wrong code for XMM.RCPSS after inlining

d-bugmail at puremagic.com d-bugmail at puremagic.com
Sun Apr 24 07:38:15 UTC 2022


https://issues.dlang.org/show_bug.cgi?id=23049

Walter Bright <bugzilla at digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzilla at digitalmars.com

--- Comment #1 from Walter Bright <bugzilla at digitalmars.com> ---
I finally figured out what was going on here. The code generated is:

    float4 A = [2.34f, -70000.0f, 0.00001f, 345.5f];
                movaps  XMM0,FLAT:.rodata[00h][RIP]
                movaps  -020h[RBP],XMM0

    float4 R = cast(float4) __simd(XMM.RCPSS, A);
                rcpss   XMM1,-020h[RBP]      (*)
                movaps  -010h[RBP],XMM1

    assert(R.array[1] == -70000.0f);
                movss   XMM2,-0Ch[RBP]
                ...

(*) rcpss writes its result into the low 4 bytes of XMM1 and leaves the rest of
XMM1 unchanged. The compiler, however, treats the assignment as having written
all of XMM1, even though it didn't. Hence, the upper 12 bytes of XMM1 are
garbage, and the assert reads one of those lanes (R.array[1], loaded from
-0Ch[RBP]).
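
The merge behaviour can be modelled without SIMD. This is only an illustrative sketch, not the compiler's code: `rcpssModel` is a hypothetical helper, the `stale` values stand in for whatever XMM1 held before, and an exact `1.0f / x` replaces the approximate reciprocal the real instruction returns.

```d
// Hypothetical scalar model of rcpss: only lane 0 of dest is written.
float[4] rcpssModel(float[4] dest, const float[4] src)
{
    dest[0] = 1.0f / src[0]; // the real rcpss yields an approximation
    return dest;             // lanes 1..3 are whatever dest held before
}

void main()
{
    float[4] A = [2.34f, -70000.0f, 0.00001f, 345.5f];
    float[4] stale = [9.0f, 9.0f, 9.0f, 9.0f]; // stand-in for old XMM1 contents

    // Models the generated code: the destination was never seeded from A.
    auto bad = rcpssModel(stale, A);
    assert(bad[1] == 9.0f);       // stale value, not -70000.0f

    // Models the intended result: the destination starts as a copy of A.
    auto good = rcpssModel(A, A);
    assert(good[1] == -70000.0f);
}
```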

You can make it work by explicitly passing the implicit argument:

    float4 R = A;
    R = cast(float4) __simd(XMM.RCPSS, R, A);
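
As a complete program, the workaround might look like the sketch below. It assumes DMD targeting x86-64 with `core.simd` available; `rcpLow` is a hypothetical wrapper name, and the bounds on lane 0 are a loose check chosen here because rcpss only approximates the reciprocal.

```d
import core.simd;

// Hypothetical wrapper around the two-operand form shown above.
float4 rcpLow(float4 a)
{
    float4 r = a;                              // seed the destination with a
    r = cast(float4) __simd(XMM.RCPSS, r, a);  // lane 0 := approx. 1 / a[0]
    return r;                                  // lanes 1..3 still come from a
}

void main()
{
    float4 A = [2.34f, -70000.0f, 0.00001f, 345.5f];
    float4 R = rcpLow(A);

    // The upper three lanes are carried over from A unchanged.
    assert(R.array[1] == -70000.0f);
    assert(R.array[2] == 0.00001f);
    assert(R.array[3] == 345.5f);

    // Lane 0 is roughly 1 / 2.34, about 0.427.
    assert(R.array[0] > 0.42f && R.array[0] < 0.43f);
}
```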
