How to deal with inline asm functions in Phobos/druntime?

David Nadlinger via digitalmars-d-ldc digitalmars-d-ldc at puremagic.com
Wed Apr 8 06:28:04 PDT 2015


On 8 Apr 2015, at 15:15, Daniel Murphy via digitalmars-d-ldc wrote:
> I don't think it's so much about vectorizing as it is about avoiding 
> the x87 FPU, which you can do when 80-bit precision is not needed.

Indeed. On x86_64, the SSE registers (%xmm0 and so on) are used by 
default for single- and double-precision floating-point operations. The 
x87 FPU is not particularly well optimized on newer CPUs to begin with, 
and transferring data from the SSE registers to the FPU on function 
entry and then back again (there is no direct move between the two, so 
each transfer goes through memory) is quite costly too.

For example, this is what made us (all D compilers) look bad on that 
Perlin noise microbenchmark (the thread from a couple of months ago).

  — David

