Slow performance compared to C++, ideas?
nazriel
spam at dzfl.pl
Thu May 30 22:42:53 PDT 2013
On Friday, 31 May 2013 at 05:35:58 UTC, Juan Manuel Cabo wrote:
> On 05/31/2013 02:15 AM, nazriel wrote:
>> On Friday, 31 May 2013 at 01:26:13 UTC, finalpatch wrote:
>>> Recently I ported a simple ray tracer I wrote in C++11 to D.
>>> Thanks to the similarity between D and C++ it was almost
>>> a line by line translation, in other words, very very close.
>>> However, the D version runs much slower than the C++11
>>> version. On Windows, with MinGW GCC and GDC, the C++ version
>>> is twice as fast as the D version. On OSX, I used Clang++
>>> and LDC, and the C++11 version was 4x faster than the D version.
>>> Since the comparisons were between compilers that share
>>> the same codegen backends I suppose that's a relatively fair
>>> comparison. (flags used for GDC: -O3 -fno-bounds-check
>>> -frelease, flags used for LDC: -O3 -release)
>>>
>>> I really like the features offered by D but it's the raw
>>> performance that's worrying me. From what I read D should
>>> offer similar performance when doing similar things but my
>>> own test results are not consistent with this claim. I want
>>> to know whether this slowness is inherent to the language or
>>> it's something I was not doing right (very possible
>>> because I have only a few days of experience with D).
>>>
>>> Below are the links to the D and C++ code, in case anyone is
>>> interested in having a look.
>>>
>>> https://dl.dropboxusercontent.com/u/974356/raytracer.d
>>> https://dl.dropboxusercontent.com/u/974356/raytracer.cpp
>>
>> Greetings.
>>
>> After a few quick changes I managed to get these results:
>> [raz at d3 tmp]$ ./a.out
>> rendering time 276 ms
>> [raz at d3 tmp]$ ./test
>> 346 ms, 814 μs, and 5 hnsecs
>>
>>
>> ./a.out is the binary compiled with clang++ ./test.cxx
>> -std=c++11 -lSDL -O3
>> ./test is the binary compiled with ldmd2 -O3 -release -inline
>> -noboundscheck ./test.d (I actually used rdmd with
>> --compiler=ldmd2, but I omitted that because the command line
>> was rather long :p)
>>
>>
>> Here is source code with changes I applied to D-code (I hope
>> you don't mind repasting it): http://dpaste.dzfl.pl/84bb308d
>>
>> I am sure there is much more room for improvement, and that
>> matching C++ performance is the least that can be achieved.
>
>
> You might also try changing:
>
> float[3] t = mixin("v[]"~op~"rhs.v[]");
> return Vec3(t[0], t[1], t[2]);
>
> for:
> Vec3 t;
> t.v[0] = mixin("v[0] "~op~" rhs.v[0]");
> t.v[1] = mixin("v[1] "~op~" rhs.v[1]");
> t.v[2] = mixin("v[2] "~op~" rhs.v[2]");
> return t;
>
> and so on, avoiding the float[3] and the v[] operations (which
> would loop, unless the compiler/optimizer unrolls them (didn't
> check)).
>
> I tested this change (removing v[] ops) in Vec3 and in
> normalize(), and it made your version slightly faster
> with DMD (didn't check with ldmd2).
>
> --jm
Right, I missed that. Thanks!
Now it is:
[raz at d3 tmp]$ ./a.out
rendering time 276 ms
[raz at d3 tmp]$ ./test
238 ms, 35 μs, and 7 hnsecs
So the D version is now faster than the C++ one.
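
For anyone following along, here is roughly how the per-component variant suggested above might sit inside a complete operator overload. This is only a minimal sketch: the Vec3 layout (a single float[3] member named v), the constructor, and the template constraint are assumptions made for illustration, and the real struct in raytracer.d may well differ.

```d
// Hypothetical minimal Vec3 for illustration; not the code from raytracer.d.
struct Vec3
{
    float[3] v;

    this(float x, float y, float z)
    {
        v[0] = x;
        v[1] = y;
        v[2] = z;
    }

    // Per-component form: three scalar operations generated via mixin,
    // with no temporary float[3] and no array-wise (v[]) operation.
    // The constraint restricting op to + - * / is an assumption.
    Vec3 opBinary(string op)(Vec3 rhs) const
        if (op == "+" || op == "-" || op == "*" || op == "/")
    {
        Vec3 t;
        t.v[0] = mixin("v[0] " ~ op ~ " rhs.v[0]");
        t.v[1] = mixin("v[1] " ~ op ~ " rhs.v[1]");
        t.v[2] = mixin("v[2] " ~ op ~ " rhs.v[2]");
        return t;
    }
}

unittest
{
    auto a = Vec3(1, 2, 3);
    auto b = Vec3(4, 5, 6);
    assert((a + b).v == [5.0f, 7.0f, 9.0f]);
    assert((a * b).v == [4.0f, 10.0f, 18.0f]);
}
```

Whether this form is actually faster than the v[] form depends on the compiler and flags, so it is worth timing both, as was done above.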