std.math performance (SSE vs. real)

Mon Jun 30 09:54:17 PDT 2014

On 6/30/2014 12:20 AM, Don wrote:
> What I think is highly likely is that it will only have legacy support, with
> such awful performance that it never makes sense to use them. For example, the
> speed of 80-bit and 64-bit calculations in x87 used to be identical. But on
> recent Intel CPUs, the 80-bit operations run at half the speed of the 64-bit
> operations. They are already partially microcoded.
>
> For me, a stronger argument is that you can get *higher* precision using
> doubles, in many cases. The reason is that FMA gives you an intermediate value
> with 128 bits of precision; it's available in SIMD but not on x87.
>
> So, if we want to use the highest precision supported by the hardware, that does
> *not* mean we should always use 80 bits.
>
> I've experienced this in CTFE, where the calculations are currently done in 80
> bits. I've seen cases where the 64-bit runtime results were more accurate,
> because of those 128-bit FMA temporaries. 80 bits are not enough!!

I did not know this. It certainly adds another layer of nuance, as the higher
level of precision only applies as long as the value can be kept in a register.