How to deal with inline asm functions in Phobos/druntime?

Wed Apr 8 07:31:57 PDT 2015

On Wednesday, 8 April 2015 at 13:28:16 UTC, David Nadlinger wrote:
> On 8 Apr 2015, at 15:15, Daniel Murphy via digitalmars-d-ldc 
> wrote:
>> I don't think it's so much about vectorizing as it is about 
>> avoiding the x87 FPU, which you can do when 80-bit precision 
>> is not needed.
>
> Indeed. On x86_64, the SSE registers (%xmm0 and so on) are used 
> by default for single- and double-precision floating point 
> operations. The x87 FPU is not particularly well-optimized on 
> newer CPUs to begin with, and transferring data from the SSE 
> registers to the FPU on function entry and then back again is 
> quite costly too.
>
> For example, this is what made us (all D compilers) look bad on 
> that Perlin noise microbenchmark (the thread from a couple of 
> months ago).

Ah, ok. Didn't realize.

For future reference, SSE implementations of sin, cos, log and exp:
http://gruntthepeon.free.fr/ssemath