std.math performance (SSE vs. real)

David Nadlinger via Digitalmars-d digitalmars-d at puremagic.com
Fri Jun 27 03:47:42 PDT 2014


On Friday, 27 June 2014 at 09:37:54 UTC, hane wrote:
> On Friday, 27 June 2014 at 06:48:44 UTC, Iain Buclaw via 
> Digitalmars-d wrote:
>> Can you test with this?
>>
>> https://github.com/D-Programming-Language/phobos/pull/2274
>>
>> Float and Double implementations of floor/ceil are trivial and 
>> I can add later.
>
> Nice! I tested with the Perlin noise benchmark, and it got 
> faster (in my environment, 1.030s -> 0.848s).
> But floor still consumes almost half of the execution time.

Wait, so DMD and GDC did actually emit a memcpy/… here? LDC 
doesn't, and with LDC the change didn't have much of an impact 
on performance.

What _does_ have a significant impact, however, is that the whole 
of floor() for doubles can be optimized down to
     roundsd <…>,<…>,0x1
when targeting SSE 4.1, or
     vroundsd <…>,<…>,<…>,0x1
when targeting AVX.

This is why std.math will need to build on top of 
compiler-recognizable primitives. Iain, Don, how do you think we 
should handle this? One option would be to build std.math based 
on an extended core.math with functions that are recognized as 
intrinsics or suitably implemented in the compiler-specific 
runtimes. The other option would be for me to submit LDC-specific 
implementations to Phobos.

Cheers,
David

