std.math performance (SSE vs. real)
Don via Digitalmars-d
digitalmars-d at puremagic.com
Mon Jun 30 00:20:59 PDT 2014
On Monday, 30 June 2014 at 04:15:46 UTC, Walter Bright wrote:
> On 6/29/2014 8:22 PM, Manu via Digitalmars-d wrote:
>> Well, here's the thing then. Consider that 'real' is actually
>> supported on only a single (long-deprecated!) architecture. In
>> x64's case, it has been deprecated for over a decade now, and
>> may be removed from the hardware at some unknown time. The
>> moment that x64 processors decide to stop supporting 32-bit
>> code, the x87 will go away, and those opcodes will likely be
>> emulated or microcoded.
>> Interacting real<->float/double means register swapping through
>> memory. It should be treated the same as float<->SIMD; they are
>> distinct (on most architectures).
>
> Since they are part of the 64-bit C ABI, that would seem to be
> in the category of "nevah hoppen".
What I think is highly likely is that x87 will only have legacy
support, with performance so poor that it never makes sense to
use it. For example, the speed of 80-bit and 64-bit calculations
in x87 used to be identical. But on recent Intel CPUs, the 80-bit
operations run at half the speed of the 64-bit operations. They
are already partially microcoded.
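
For illustration, here's a rough D sketch of the kind of timing
comparison I mean (a minimal sketch, not a rigorous benchmark: it
assumes an x86-64 target where 'double' arithmetic goes through
SSE2 and 'real' through the 80-bit x87 unit; the kernel and the
iteration count are made up for this example):

import core.time : MonoTime;
import std.stdio : writefln;

// Dependent multiply-add chain; the value stays near b/(1-a),
// so neither overflow nor denormals distort the timing.
T kernel(T)(size_t n)
{
    T sum = 0;
    immutable T a = 0.999;
    immutable T b = 1.0001;
    foreach (i; 0 .. n)
        sum = sum * a + b;
    return sum;
}

void main()
{
    enum n = 100_000_000;
    auto t0 = MonoTime.currTime;
    auto d = kernel!double(n);   // SSE2 on x86-64
    auto t1 = MonoTime.currTime;
    auto r = kernel!real(n);     // 80-bit x87
    auto t2 = MonoTime.currTime;
    writefln("double: %s  (%s)", t1 - t0, d);
    writefln("real:   %s  (%s)", t2 - t1, r);
}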
For me, a stronger argument is that in many cases you can get
*higher* precision using doubles. The reason is that FMA computes
a*b + c while keeping the intermediate product at full precision
(effectively 128 bits), with only a single rounding at the end;
it's available in the SIMD instruction set, but not on x87.
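
To make the FMA point concrete, here's a small D sketch (it
assumes std.math.fma maps to a genuinely fused hardware
instruction; the values are chosen so that the exact product
needs 106 significand bits, more than even x87's 64):

import std.math : fma;
import std.stdio : writefln;

void main()
{
    // a and b use all 53 significand bits, so the exact product
    // a*b needs up to 106 bits.
    double a = 1.0 + 0x1p-52;
    double b = 1.0 + 0x1p-52;
    double p = a * b;            // rounded to 53 bits: 1 + 2^-51
    double err = fma(a, b, -p);  // residual of the product
    writefln("p   = %.20g", p);
    writefln("err = %.20g", err);
}

With a true fused multiply-add, err is the exact rounding error
of the product, 2^-104 (about 4.9e-32). If fma falls back to an
80-bit x87 multiply followed by an add, the product gets rounded
to 64 bits first and err comes out as 0, which is exactly the
gap I'm describing.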
So, if we want to use the highest precision supported by the
hardware, that does *not* mean we should always use 80 bits.
I've experienced this in CTFE, where the calculations are
currently done in 80 bits: I've seen cases where the 64-bit
runtime results were more accurate, because of those 128-bit FMA
temporaries. 80 bits are not enough!!
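
Here's a minimal sketch of the CTFE effect (it assumes the
current behaviour where CTFE evaluates double expressions with
80-bit intermediates, while runtime x86-64 code uses strict
64-bit SSE doubles; results may differ by compiler and version):

import std.stdio : writefln;

// x*x is exactly 1 + 2^-29 + 2^-60, which needs 61 significand
// bits: an 80-bit evaluation keeps the 2^-60 term that a strict
// 64-bit double evaluation rounds away.
double residual()
{
    double x = 1.0 + 0x1p-30;
    return x * x - (1.0 + 0x1p-29);
}

enum ctfeResult = residual();  // forced compile-time evaluation

void main()
{
    writefln("CTFE:    %.20g", ctfeResult);  // 2^-60 with 80-bit temporaries
    writefln("runtime: %.20g", residual());  // 0 with strict doubles
}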