How to deal with inline asm functions in Phobos/druntime?
    David Nadlinger via digitalmars-d-ldc 
    digitalmars-d-ldc at puremagic.com
       
    Wed Apr  8 06:28:04 PDT 2015
    
    
  
On 8 Apr 2015, at 15:15, Daniel Murphy via digitalmars-d-ldc wrote:
> I don't think it's so much about vectorizing as it is about avoiding 
> the x87 FPU, which you can do when 80-bit precision is not needed.

Indeed. On x86_64, the SSE registers (%xmm0 and so on) are used by
default for single- and double-precision floating-point operations. The
x87 FPU is not particularly well optimized on newer CPUs to begin with,
and transferring data from the SSE registers to the FPU on function
entry and then back again is quite costly too.

For example, this is what made us (all D compilers) look bad on that
Perlin noise microbenchmark (the thread from a couple of months ago).
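
To make the register round trip concrete, here is a minimal sketch in C rather than D. It assumes an x86_64 target where long double is the 80-bit x87 format, as it is under the System V ABI, and which is also what D's real maps to on x86. The function names are made up for illustration, and the exact code generated depends on the compiler and flags.

```c
/* Plain double arithmetic: x and y arrive in %xmm0 and %xmm1 and
   typically stay in SSE registers for the whole computation. */
double scale_sse(double x, double y)
{
    return x * y + x;
}

/* Same formula, but forced through extended precision. Where long double
   is the x87 80-bit type, the arguments typically have to be stored to
   memory, loaded onto the x87 stack, and the result moved back into
   %xmm0 before returning. */
double scale_x87(double x, double y)
{
    long double t = (long double)x * y + x;
    return (double)t;
}
```

Compiling both with something like `cc -O2 -S` and comparing the assembly should show the extra stores and loads in the second function. An inline asm routine that uses x87 instructions pays the same kind of cost whenever its arguments are passed in SSE registers.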
  — David