[Issue 23049] [SIMD][CODEGEN] Wrong code for XMM.RCPSS after inlining
d-bugmail at puremagic.com
Sun Apr 24 07:38:15 UTC 2022
https://issues.dlang.org/show_bug.cgi?id=23049
Walter Bright <bugzilla at digitalmars.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzilla at digitalmars.com
--- Comment #1 from Walter Bright <bugzilla at digitalmars.com> ---
I finally figured out what was going on here. The code generated is:
    float4 A = [2.34f, -70000.0f, 0.00001f, 345.5f];
                movaps  XMM0,FLAT:.rodata[00h][RIP]
                movaps  -020h[RBP],XMM0
    float4 R = cast(float4) __simd(XMM.RCPSS, A);
                rcpss   XMM1,-020h[RBP]               (*)
                movaps  -010h[RBP],XMM1
    assert(R.array[1] == -70000.0f);
                movss   XMM2,-0Ch[RBP]
                ...
(*) rcpss writes only the lower 4 bytes of XMM1 and leaves the rest of XMM1
unchanged. The compiler, however, treats the assignment as having changed all
of XMM1. Nothing in the generated code loads A into XMM1 beforehand, so the
upper 12 bytes of XMM1 are garbage.
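The merge behaviour described above can be modelled in plain D without any SIMD intrinsics. This is only a sketch: `rcpssModel` and `stale` are hypothetical names made up for illustration, the arrays stand in for XMM registers, and an exact division replaces the approximate reciprocal the hardware computes.

```d
// Hypothetical scalar model of `rcpss dest, src`; not the compiler's code.
float[4] rcpssModel(float[4] dest, float[4] src)
{
    float[4] result = dest;      // lanes 1..3 keep whatever dest already held
    result[0] = 1.0f / src[0];   // lane 0: exact division stands in for the approximation
    return result;
}

void main()
{
    float[4] A = [2.34f, -70000.0f, 0.00001f, 345.5f];
    float[4] stale = [0.0f, 1.0f, 2.0f, 3.0f]; // stand-in for an XMM1 that was never loaded

    // Situation in the generated code: dest does not hold A.
    float[4] bad = rcpssModel(stale, A);
    assert(bad[1] != -70000.0f);

    // dest starts out as a copy of A, so lanes 1..3 are A's.
    float[4] good = rcpssModel(A, A);
    assert(good[1] == -70000.0f);
}
```

In this model the upper lanes of the result come entirely from the destination, so the result is only meaningful if the destination was loaded before the instruction runs.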
You can make it work by explicitly passing the implicit argument:
float4 R = A;
R = cast(float4) __simd(XMM.RCPSS, R, A);