How to deal with inline asm functions in Phobos/druntime?
David Nadlinger via digitalmars-d-ldc
digitalmars-d-ldc at puremagic.com
Wed Apr 8 06:28:04 PDT 2015
On 8 Apr 2015, at 15:15, Daniel Murphy via digitalmars-d-ldc wrote:
> I don't think it's so much about vectorizing as it is about avoiding
> the x87 FPU, which you can do when 80-bit precision is not needed.
Indeed. On x86_64, the SSE registers (%xmm0 and so on) are used by
default for single- and double-precision floating-point operations. The
x87 FPU is not particularly well-optimized on newer CPUs to begin with,
and transferring data from the SSE registers to the FPU on function
entry and then back again is quite costly too.
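
To make the round trip concrete, here is a minimal sketch in C rather than D, on the assumption that C's double and long double behave like D's double and real on x86_64 System V targets (where long double is the 80-bit x87 format). The function names are hypothetical and not taken from Phobos or druntime. There is no direct move instruction between the SSE registers and the x87 stack, so the third function typically has to spill its arguments to memory, load them onto the FPU stack, and store the result back before returning it in %xmm0.

```c
/* Hypothetical helper: linear interpolation in double precision.
   On x86_64 this is typically compiled to scalar SSE2 instructions
   operating on %xmm registers. */
double lerp_double(double a, double b, double t)
{
    return a + t * (b - a);
}

/* The same computation in long double. Assumption: the target uses the
   80-bit x87 format for long double (x86_64 System V), so this is
   typically compiled to x87 instructions. */
long double lerp_extended(long double a, long double b, long double t)
{
    return a + t * (b - a);
}

/* A double-precision caller routed through the extended version.
   The arguments arrive in %xmm registers, are widened to long double
   for the x87 computation, and the result is narrowed back to double. */
double lerp_via_extended(double a, double b, double t)
{
    return (double)lerp_extended(a, b, t);
}
```

Compiling this with something like "cc -O2 -S" and comparing the assembly of lerp_double and lerp_via_extended should show the extra stores and loads, though the exact output depends on the compiler and target.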
For example, this is what made us (all D compilers) look bad on that
Perlin noise microbenchmark (the thread from a couple of months ago).