[Issue 19663] New: On x86_64 the fabs intrinsic should use SSE
d-bugmail at puremagic.com
Sat Feb 9 15:00:57 UTC 2019
https://issues.dlang.org/show_bug.cgi?id=19663
Issue ID: 19663
Summary: On x86_64 the fabs intrinsic should use SSE
Product: D
Version: D2
Hardware: x86_64
OS: All
Status: NEW
Keywords: performance
Severity: enhancement
Priority: P1
Component: dmd
Assignee: nobody at puremagic.com
Reporter: b2.temp at gmx.com
Currently, on x86_64, the dmd backend implements the fabs intrinsic with the
x87 FPU instruction of the same name, FABS. However, the ABI passes `float`
and `double` parameters in SSE registers, so the value has to travel from an
SSE register to a GP register and only then to an FPU register. Depending on
what is done with the resulting absolute value, it then goes back to a GP
register (all of this just to clear one bit!), and back again to an SSE
register if the function has to return the value, etc.
It would be wiser to use an SSE logical AND with a mask.
This would be done only for the `float` and `double` types.
Several options exist:
1. generate mask and ANDPS/ANDPD
2. ANDPS/ANDPD on a constant mask (LDC2 does that btw)
3. left shift then logical right shift by one, which drops the sign bit
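
To make the three options concrete, here is a minimal sketch written with C
SSE/SSE2 intrinsics (chosen only because they map closely onto the
instructions named above). It is not dmd backend code and not LDC2's actual
output; the function names are hypothetical, and the code a compiler really
emits may differ.

```c
#include <emmintrin.h> /* SSE2 intrinsics; SSE2 is baseline on x86_64 */

/* Option 1: generate the mask in a register, then ANDPD.
   Comparing a register with itself yields all ones; shifting each
   64-bit lane right by one clears the top (sign) bit of the mask. */
static double fabs_generated_mask(double x)
{
    __m128i zero = _mm_setzero_si128();
    __m128i ones = _mm_cmpeq_epi32(zero, zero);
    __m128i mask = _mm_srli_epi64(ones, 1); /* 0x7FFFFFFFFFFFFFFF per lane */
    __m128d v = _mm_set_sd(x);
    return _mm_cvtsd_f64(_mm_and_pd(v, _mm_castsi128_pd(mask)));
}

/* Option 2: ANDPD with a constant mask (typically loaded from memory). */
static double fabs_constant_mask(double x)
{
    const __m128d mask =
        _mm_castsi128_pd(_mm_set1_epi64x(0x7FFFFFFFFFFFFFFFLL));
    return _mm_cvtsd_f64(_mm_and_pd(_mm_set_sd(x), mask));
}

/* Same idea for float: ANDPS with a 32-bit mask per lane. */
static float fabsf_constant_mask(float x)
{
    const __m128 mask = _mm_castsi128_ps(_mm_set1_epi32(0x7FFFFFFF));
    return _mm_cvtss_f32(_mm_and_ps(_mm_set_ss(x), mask));
}

/* Option 3: shift left by one to push the sign bit out, then a
   logical shift right by one to bring a zero back in its place. */
static double fabs_shift(double x)
{
    __m128i bits = _mm_castpd_si128(_mm_set_sd(x));
    bits = _mm_srli_epi64(_mm_slli_epi64(bits, 1), 1);
    return _mm_cvtsd_f64(_mm_castsi128_pd(bits));
}
```

All variants only clear the sign bit, so the value never has to leave the SSE
registers; -0.0 becomes +0.0 and every other bit is left unchanged.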
Forum discussion:
https://forum.dlang.org/post/diljelbvmenuxtaqbuxw@forum.dlang.org
Reference for the possible solutions:
https://stackoverflow.com/questions/32408665/fastest-way-to-compute-absolute-value-using-sse