Why is D slower than LuaJIT?

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Wed Dec 22 18:16:45 PST 2010


On 12/22/10 4:04 PM, Andreas Mayer wrote:
> To see what performance advantage D would give me over using a scripting language, I made a small benchmark. It consists of this code:
>
>>     auto L = iota(0.0, 10000000.0);
>>     auto L2 = map!"a / 2"(L);
>>     auto L3 = map!"a + 2"(L2);
>>     auto V = reduce!"a + b"(L3);
>
> It runs in 281 ms on my computer.

Thanks for posting the numbers. That's a long time, particularly 
considering that the two map instances do almost nothing. So the bulk of 
the computation is:

auto L = iota(0.0, 10000000.0);
auto V = reduce!"a + b"(L);

There is one inherent problem that affects the speed of iota: in iota, 
the value at position i is computed as 0.0 + i * step, where step is 
computed from the limits. That's one addition and a multiplication for 
each pass through iota. Given that the actual workload of the loop is 
only one addition, we are doing a lot more work. I suspect that that's 
the main issue there.

The reason iota does that instead of the simpler increment is that iota 
must yield the same values whether iterated forward or backward. 
Repeated ++ accumulates floating-point rounding error, which could make 
the two directions disagree, so the code is currently conservative.
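
To make the trade-off concrete, here is a rough sketch (my own helper 
names, not the actual iota source) that computes each value from its 
index, as described above, and compares it against a running increment. 
With a step of 1.0 the two should agree exactly; with a step like 0.1 
they may drift apart, and how often depends on the platform's 
floating-point behavior.

```d
import std.stdio : writefln;

/// Value at position i computed from the index, as the post describes
/// for iota. Hypothetical helper, not taken from Phobos.
double byIndex(double start, double step, size_t i)
{
    return start + i * step;
}

/// Counts the positions below n where a running increment disagrees
/// with the index-based value.
size_t countMismatches(double start, double step, size_t n)
{
    size_t mismatches = 0;
    double x = start;          // value reached by repeated +=
    foreach (i; 0 .. n)
    {
        if (x != byIndex(start, step, i))
            ++mismatches;
        x += step;
    }
    return mismatches;
}

void main()
{
    writefln("step 1.0: %s mismatches", countMismatches(0.0, 1.0, 1000));
    writefln("step 0.1: %s mismatches", countMismatches(0.0, 0.1, 1000));
}
```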

Another issue is the implementation of reduce. reduce is fairly general, 
which may mean that it generates mediocre code for this particular case. 
We can always optimize the general case and perhaps specialize for 
select cases.
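
One way to separate the cost of iota and reduce from the cost of the 
additions themselves is to time a hand-written loop next to the range 
version. The following is only a sketch: sumLoop and sumRanges are names 
I made up, and a single timed run like this is noisy, so treat the 
numbers as a rough indication.

```d
import std.algorithm : reduce;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.range : iota;
import std.stdio : writefln;

/// Hand-written baseline: one addition per element, no ranges involved.
double sumLoop(double n)
{
    double total = 0.0;
    for (double x = 0.0; x < n; x += 1.0)
        total += x;
    return total;
}

/// The same sum expressed with iota and reduce (explicit seed of 0.0).
double sumRanges(double n)
{
    return reduce!"a + b"(0.0, iota(0.0, n));
}

void main()
{
    immutable n = 10_000_000.0;

    auto sw = StopWatch(AutoStart.yes);
    immutable a = sumLoop(n);
    immutable loopTime = sw.peek();

    sw.reset();                 // keeps running, restarts from zero
    immutable b = sumRanges(n);
    immutable rangeTime = sw.peek();

    // Printing the sums also keeps the compiler from discarding the work.
    writefln("loop:   %.1f in %s ms", a, loopTime.total!"msecs");
    writefln("ranges: %.1f in %s ms", b, rangeTime.total!"msecs");
}
```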

Once we figure out where the problem is, there are numerous possibilities 
to improve the code:

1. Have iota check in the constructor whether the limits allow ++ to be 
precise. If so, use that. Of course, that means an extra runtime test...

2. Give up on iota being a random access or bidirectional range. If it's 
a forward range, we don't need to worry about going backwards.

3. Improve reduce as described above.
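
To illustrate the first two ideas together, here is a minimal sketch of a 
forward-only range that decides once, in the constructor, whether plain 
increment is safe. ForwardIota and isWhole are hypothetical names, not 
anything in Phobos, and the test used (start and step are whole numbers 
and both limits are within 2^53 in magnitude) is just one conservative 
criterion I am assuming for the example.

```d
import std.algorithm : reduce;
import std.math : fabs, floor;
import std.range.primitives : isForwardRange;
import std.stdio : writeln;

/// True if x has no fractional part.
bool isWhole(double x) { return x == floor(x); }

/// Hypothetical forward-only counterpart to iota (positive steps only).
struct ForwardIota
{
    private double start, step, limit, current;
    private size_t index;
    private bool exact;   // true when += is known to be precise

    this(double start, double limit, double step = 1.0)
    {
        assert(step > 0, "this sketch only handles positive steps");
        this.start = start;
        this.step = step;
        this.limit = limit;
        this.current = start;
        // 2^53: up to this magnitude every integer is a representable double.
        enum double maxExact = 9_007_199_254_740_992.0;
        exact = isWhole(start) && isWhole(step)
            && fabs(start) <= maxExact && fabs(limit) <= maxExact;
    }

    @property bool empty() const { return current >= limit; }
    @property double front() const { return current; }
    @property ForwardIota save() { return this; }

    void popFront()
    {
        ++index;
        if (exact)
            current += step;                 // cheap path: one addition
        else
            current = start + index * step;  // conservative path
    }
}

static assert(isForwardRange!ForwardIota);

void main()
{
    writeln(reduce!"a + b"(0.0, ForwardIota(0.0, 10.0)));
    writeln(reduce!"a + b"(0.0, ForwardIota(0.0, 2.0, 0.5)));
}
```

Whether the extra branch in popFront costs less than the multiplication 
it avoids is exactly the kind of thing that would need measuring.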

> The same code in Lua (using LuaJIT) runs in 23 ms.
>
> That's about 10 times faster. I would have expected D to be faster. Did I do something wrong?
>
> The first Lua version uses a simplified design. I thought maybe that is unfair to ranges, which are more complicated. You could argue ranges have more features and do more work. To make it fair, I made a second Lua version of the above benchmark that emulates ranges. It still runs in 29 ms.
>
> The full D version is here: http://pastebin.com/R5AGHyPx
> The Lua version: http://pastebin.com/Sa7rp6uz
> Lua version that emulates ranges: http://pastebin.com/eAKMSWyr
>
> Could someone help me solve this mystery?
>
> Or is D, contrary to what I thought, not suitable for high performance computing? What should I do?

Thanks very much for taking the time to measure and post results; this 
is very helpful. As this test essentially measures the performance of 
iota and reduce, it would be hasty to generalize the assessment. 
Nevertheless, we need to look into improving this particular microbenchmark.

Please don't forget to print the result of the computation in both 
languages, as there's always the possibility of some oversight.
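
For the D side, something along these lines would do. It is a sketch 
that wraps the benchmark in a function (pipelineSum and closedForm are 
my names, not from the pastebin code) and prints the result next to the 
closed-form value n(n-1)/4 + 2n, so that any discrepancy shows up 
immediately.

```d
import std.algorithm : map, reduce;
import std.range : iota;
import std.stdio : writefln;

/// The benchmark pipeline from the original post, parameterized on n.
double pipelineSum(double n)
{
    auto L = iota(0.0, n);
    auto L2 = map!"a / 2"(L);
    auto L3 = map!"a + 2"(L2);
    return reduce!"a + b"(L3);
}

/// Sum of (i / 2 + 2) for i = 0 .. n-1, worked out by hand.
double closedForm(double n)
{
    return n * (n - 1) / 4 + 2 * n;
}

void main()
{
    immutable n = 10_000_000.0;
    writefln("computed: %.1f", pipelineSum(n));
    writefln("expected: %.1f", closedForm(n));
}
```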


Andrei

