std.math performance (SSE vs. real)
David Nadlinger via Digitalmars-d
digitalmars-d at puremagic.com
Fri Jun 27 03:47:42 PDT 2014
On Friday, 27 June 2014 at 09:37:54 UTC, hane wrote:
> On Friday, 27 June 2014 at 06:48:44 UTC, Iain Buclaw via
> Digitalmars-d wrote:
>> Can you test with this?
>>
>> https://github.com/D-Programming-Language/phobos/pull/2274
>>
>> Float and Double implementations of floor/ceil are trivial and
>> I can add later.
>
> Nice! I tested with the Perlin noise benchmark, and it got
> faster (in my environment, 1.030s -> 0.848s).
> But floor still consumes almost half of the execution time.
Wait, so DMD and GDC did actually emit a memcpy/… here? LDC
doesn't, and the change didn't have much of an impact on
performance.
What _does_ have a significant impact, however, is that the whole
of floor() for doubles can be optimized down to
    roundsd <…>,<…>,0x1
when targeting SSE 4.1, or
    vroundsd <…>,<…>,<…>,0x1
when targeting AVX.
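To make the contrast concrete, here is a rough sketch in C (not D, and not taken from Phobos or from the pull request above). The names soft_floor and fast_floor are made up for this illustration. soft_floor is the kind of generic bit-manipulation code a library has to fall back on, while fast_floor uses the SSE 4.1 intrinsic _mm_floor_sd, which should map to a single roundsd with immediate 0x1 when the compiler is targeting SSE 4.1 (for example with -msse4.1 on GCC or Clang). It assumes IEEE 754 binary64 doubles.

```c
#include <stdint.h>
#include <string.h>
#ifdef __SSE4_1__
#include <smmintrin.h>  /* SSE 4.1 intrinsics, including _mm_floor_sd */
#endif

/* Hypothetical generic fallback: floor() by clearing the fraction bits. */
double soft_floor(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);            /* reinterpret the double as raw bits */

    int exp = (int)((bits >> 52) & 0x7FF) - 1023;  /* unbiased exponent */

    if (exp >= 52)                             /* already integral, or inf/NaN */
        return x;

    if (exp < 0) {                             /* |x| < 1 */
        if ((bits << 1) == 0)                  /* +0.0 or -0.0 stays as is */
            return x;
        return (bits >> 63) ? -1.0 : 0.0;
    }

    /* Mantissa bits that lie below the binary point. */
    uint64_t frac_mask = 0x000FFFFFFFFFFFFFULL >> exp;
    if ((bits & frac_mask) == 0)               /* no fractional part */
        return x;

    if (bits >> 63)                            /* negative: round the magnitude up */
        bits += frac_mask;
    bits &= ~frac_mask;                        /* drop the fraction */

    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Hypothetical fast path: one rounding instruction when SSE 4.1 is enabled. */
double fast_floor(double x)
{
#ifdef __SSE4_1__
    __m128d v = _mm_set_sd(x);
    return _mm_cvtsd_f64(_mm_floor_sd(v, v));  /* round toward -inf (imm 0x1) */
#else
    return soft_floor(x);                      /* no SSE 4.1: generic path */
#endif
}
```

Both functions should agree on every input, so the difference is only in how much code ends up in the inner loop. A library can only get the second form if it is written in terms of something the compiler recognizes.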
This is why std.math will need to build on top of
compiler-recognizable primitives. Iain, Don, how do you think we
should handle this? One option would be to build std.math based
on an extended core.math with functions that are recognized as
intrinsics or suitably implemented in the compiler-specific
runtimes. The other option would be for me to submit LDC-specific
implementations to Phobos.
Cheers,
David