std.math performance (SSE vs. real)
ed via Digitalmars-d
digitalmars-d at puremagic.com
Mon Jun 30 00:51:50 PDT 2014
On Monday, 30 June 2014 at 07:21:00 UTC, Don wrote:
> On Monday, 30 June 2014 at 04:15:46 UTC, Walter Bright wrote:
>> On 6/29/2014 8:22 PM, Manu via Digitalmars-d wrote:
>>> Well, here's the thing then. Consider that 'real' is only actually
>>> supported on only a single (long deprecated!) architecture.
>>>
>>> In x64's case, it is deprecated for over a decade now, and may be
>>> removed from the hardware at some unknown time. The moment that x64
>>> processors decide to stop supporting 32bit code, the x87 will go away,
>>> and those opcodes will likely be emulated or microcoded.
>>> Interacting real<->float/double means register swapping through
>>> memory. It should be treated the same as float<->simd; they are
>>> distinct (on most arch's).
>>
>> Since they are part of the 64 bit C ABI, that would seem to be
>> in the category of "nevah hoppen".
>
> What I think is highly likely is that it will only have legacy
> support, with such awful performance that it never makes sense
> to use them. For example, the speed of 80-bit and 64-bit
> calculations in x87 used to be identical. But on recent Intel
> CPUs, the 80-bit operations run at half the speed of the 64 bit
> operations. They are already partially microcoded.
>
> For me, a stronger argument is that you can get *higher*
> precision using doubles, in many cases. The reason is that FMA
> gives you an intermediate value with 128 bits of precision;
> it's available in SIMD but not on x87.
>
> So, if we want to use the highest precision supported by the
> hardware, that does *not* mean we should always use 80 bits.
>
> I've experienced this in CTFE, where the calculations are
> currently done in 80 bits, I've seen cases where the 64-bit
> runtime results were more accurate, because of those 128 bit
> FMA temporaries. 80 bits are not enough!!
This is correct, and we now use it for some time-critical code
that requires high precision.
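To illustrate the FMA point quoted above, here is a minimal Python sketch. It is not taken from any code discussed in this thread. The helper names `fma` and `two_product` are hypothetical, and `fma` only emulates, with exact rationals, the single rounding that a hardware fused multiply-add performs. The sketch shows that a double product plus one FMA recovers the exact rounding error of the product, so a pair of doubles can carry bits that a single 80-bit value (64-bit significand) cannot hold.

```python
from fractions import Fraction


def fma(a, b, c):
    """Emulated fused multiply-add (assumption: stands in for hardware FMA).

    Computes a*b + c exactly as a rational, then rounds once to a double.
    """
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def two_product(a, b):
    """Return (p, e) where p is the rounded product and e its rounding error.

    Barring overflow or underflow, p + e equals a*b exactly.
    """
    p = a * b           # ordinary double multiply, rounded once
    e = fma(a, b, -p)   # exact product minus p, rounded once (exact here)
    return p, e


if __name__ == "__main__":
    # a*a = 1 + 2**-39 + 2**-80; the last term needs 81 significand bits,
    # more than a double (53) or an x87 extended value (64) can hold.
    a = 1.0 + 2.0**-40
    p, e = two_product(a, a)
    print("rounded product:", p)
    print("recovered error:", e)
    print("pair is exact:  ", Fraction(p) + Fraction(e) == Fraction(a) * Fraction(a))
```

In compiled code the emulated `fma` would be replaced by the platform's real fused multiply-add, such as C's `fma` from `<math.h>`. The exact-rational version here only makes the single rounding step explicit and easy to check.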
But for anything that is not time-critical (~80%-85% of our code) we
simply use a software solution when precision becomes an issue. It is
here that I think the extra bits in D's real can be enough to get a
performance gain over the software solution.
But I won't argue if you think I'm wrong. I'm only basing this
on anecdotal evidence from what I saw in 5-6 apps ported from C++
to D :-)
Cheers,
ed