Speed of math function atan: comparison D and C++

Mon Mar 5 18:09:37 UTC 2018

On Monday, 5 March 2018 at 06:01:27 UTC, J-S Caux wrote:
> On Monday, 5 March 2018 at 05:40:09 UTC, rikki cattermole wrote:
>> On 05/03/2018 6:35 PM, J-S Caux wrote:
>>> I'm considering shifting a large existing C++ codebase into D 
>>> (it's a scientific code making much use of functions like 
>>> atan, log etc).
>>> 
>>> I've compared the raw speed of atan between C++ (Apple LLVM 
>>> version 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also 
>>> ldc2 1.7.0) by doing long loops of such functions.
>>> 
>>> I can't get the D to run faster than about half the speed of 
>>> C++.
>
>   double x = 0.0;
>   for (int a = 0; a < 1000000000; ++a)
>     x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for C++ and
>
>   double x = 0.0;
>   for (int a = 0; a < 1_000_000_000; ++a)
>     x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for D. C++ exec takes 40 seconds, D exec takes 68 seconds.

One performance problem with this code is that LDC does not yet 
do cross-module inlining by default; GDC does. If you pass 
`-enable-cross-module-inlining` to LDC, things should be faster. 
In particular, std.math.sqrt is not inlined even though inlining 
it is profitable (it becomes a single machine instruction). Things 
get worse with core.stdc.math.sqrt: no implementation source is 
available, so no inlining is possible.
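To check the effect of the flag, something like the following could be timed with and without it. This is only a sketch: the file name `bench.d`, the helper `sumAtan`, the reduced iteration count and the `-O3 -release` flags are my assumptions, not taken from your test.

```d
// bench.d (hypothetical file name)
// Build twice and compare the timings, for example:
//   ldc2 -O3 -release bench.d
//   ldc2 -O3 -release -enable-cross-module-inlining bench.d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.math : atan, sqrt;
import std.stdio : writefln;

/// Sums atan(1/(1 + sqrt(1 + a))) for a in [0, n), as in the quoted loop.
double sumAtan(int n)
{
    double x = 0.0;
    foreach (a; 0 .. n)
        x += atan(1.0 / (1.0 + sqrt(1.0 + a)));
    return x;
}

void main()
{
    // Assumption: reduced from the 1_000_000_000 iterations in the
    // original test so that the sketch finishes quickly.
    enum n = 10_000_000;

    auto sw = StopWatch(AutoStart.yes);
    immutable x = sumAtan(n);
    sw.stop();

    // Printing the sum keeps the optimizer from discarding the loop.
    writefln("sum = %.6f, elapsed = %s ms", x, sw.peek.total!"msecs");
}
```

Since atan probably dominates the run time, the gain from inlining sqrt alone may turn out to be small.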

Another problem is that std.math.atan(double) just calls 
std.math.atan(real). Calculations on real are more expensive on 
platforms where real is 80 bits wide (i.e. x86), and that cannot 
be fixed with a compiler flag. What it takes is for someone to 
write double and float versions of atan (and other math 
functions), but that requires someone with the right knowledge.
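Until such versions exist, one possible stopgap (a sketch, not something I have benchmarked here) is to call the C library's double-precision atan through core.stdc.math. The wrapper name `atanDouble` and the iteration count are hypothetical. Whether this is faster depends on the platform's C library, and the call cannot be inlined, for the same reason as core.stdc.math.sqrt above.

```d
import cmath = core.stdc.math; // C math bindings shipped with druntime
import std.math : sqrt;
import std.stdio : writefln;

/// Hypothetical helper: double-precision atan via the C library,
/// so the computation does not go through `real`.
double atanDouble(double x)
{
    return cmath.atan(x);
}

void main()
{
    double x = 0.0;
    // Assumption: a small iteration count, just to exercise the helper.
    foreach (a; 0 .. 1_000_000)
        x += atanDouble(1.0 / (1.0 + sqrt(1.0 + a)));
    writefln("%.6f", x);
}
```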

Your tests (and your reports on them) are much appreciated. 
Please do file bug reports for these things. Perhaps you could 
take a stab at implementing double versions of the functions you 
need?

cheers,
   Johan