std.math performance (SSE vs. real)
Tofu Ninja via Digitalmars-d
digitalmars-d at puremagic.com
Thu Jun 26 19:09:59 PDT 2014
On Friday, 27 June 2014 at 01:31:17 UTC, David Nadlinger wrote:
> Hi all,
>
> right now, the use of std.math over core.stdc.math can cause a
> huge performance problem in typical floating point graphics
> code. An instance of this has recently been discussed here in
> the "Perlin noise benchmark speed" thread [1], where even LDC,
> which already beat DMD by a factor of two, generated code more
> than twice as slow as that by Clang and GCC. Here, the use of
> floor() causes trouble. [2]
>
> Besides the somewhat slow pure D implementations in std.math,
> the biggest problem is the fact that std.math almost
> exclusively uses reals in its API. When working with single- or
> double-precision floating point numbers, this is not only more
> data to shuffle around than necessary, but on x86_64 requires
> the caller to transfer the arguments from the SSE registers
> onto the x87 stack and then convert the result back again.
> Needless to say, this is a serious performance hazard. In fact,
> this accounts for a 1.9x slowdown in the above benchmark with
> LDC.
>
> Because of this, I propose to add float and double overloads
> (at the very least the double ones) for all of the commonly
> used functions in std.math. This is unlikely to break much
> code, but:
> a) Somebody could rely on the fact that the calls effectively
> widen the calculation to 80 bits on x86 when using type
> deduction.
> b) Additional overloads make e.g. "&floor" ambiguous without
> context, of course.
>
> What do you think?
>
> Cheers,
> David
>
>
> [1] http://forum.dlang.org/thread/lo19l7$n2a$1@digitalmars.com
> [2] Fun fact: As the program happens to deal only with positive
> numbers, the author could have just inserted an int-to-float
> cast, sidestepping the issue altogether. All the other language
> implementations have the floor() call too, though, so it
> doesn't matter for this discussion.
I honestly always thought it was a little odd that it forced a
conversion to real. Personally, I support this. It would also make
generic code that calls math functions simpler, as it wouldn't
require casts back.
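
To make the idea concrete, here is a rough sketch of what per-type
overloads might look like. The names myFloor and fractionalPart are
made up for illustration and are not part of Phobos, and forwarding to
core.stdc.math is only one assumed way of implementing them, not
necessarily what std.math would do.

```d
import cmath = core.stdc.math;
import std.math : stdFloor = floor;

// Hypothetical overloads (not Phobos API): one per floating point type,
// so float and double arguments never have to be widened to real.
float  myFloor(float x)  { return cmath.floorf(x); }
double myFloor(double x) { return cmath.floor(x); }
real   myFloor(real x)   { return stdFloor(x); }

// Hypothetical generic helper: the result stays in T because the
// matching overload of myFloor returns T.
T fractionalPart(T)(T x)
{
    return x - myFloor(x);
}

void main()
{
    // Type deduction keeps the argument's precision.
    auto r = myFloor(2.75f);
    static assert(is(typeof(r) == float));
    static assert(is(typeof(myFloor(2.75)) == double));
    static assert(is(typeof(myFloor(2.75L)) == real));
    assert(r == 2.0f);
    assert(myFloor(-0.5) == -1.0);

    // Generic code needs no cast back to T.
    static assert(is(typeof(fractionalPart(2.75f)) == float));
    assert(fractionalPart(2.75f) == 0.75f);

    // Point b) from the proposal: with several overloads, spelling out
    // the function pointer type is what selects one of them.
    double function(double) fp = &myFloor;
    assert(fp(3.9) == 3.0);

    // Footnote [2]: for non-negative values that fit in an int,
    // an int-to-float round trip gives the same result as floor.
    float x = 2.75f;
    assert(cast(float) cast(int) x == myFloor(x));
}
```

Point a) is visible here too: code that relied on `auto r = floor(f)`
being a real would now get a float instead.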