D outperformed by C++, what am I doing wrong?

rikki cattermole via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Sat Aug 12 23:11:52 PDT 2017


On 13/08/2017 7:09 AM, amfvcg wrote:
> Hi all,
> I'm solving below task:
> 
> given container T and value R return sum of R-ranges over T. An example:
> input : T=[1,1,1] R=2
> output : [2, 1]
> 
> input : T=[1,2,3] R=1
> output : [1,2,3]
> (see dlang unittests for more examples)
> 
> 
> Below c++ code compiled with g++-5.4.0 -O2 -std=c++14 runs on my machine 
> in 656 836 us.
> Below D code compiled with dmd v2.067.1 -O runs on my machine in ~ 14.5 
> sec.
> 
> Each language has it's own "way of programming", and as I'm a beginner 
> in D - probably I'm running through bushes instead of highway. Therefore 
> I'd like to ask you, experienced dlang devs, to shed some light on "how 
> to do it dlang-way".
> 
> 
> C++ code:
> 
> #include <algorithm>
> #include <chrono>
> #include <iostream>
> #include <iterator>
> #include <list>
> #include <map>
> #include <string>
> #include <utility>
> #include <numeric>
> #include <vector>
> 
> 
> template<typename T, typename K>
> std::vector<K> sum_elements(const T& beg, const T& end, std::size_t k, K 
> def)
> {
>   if (k == 0) {
>       return std::vector<K>{};
>   }
>   return sum_elements(beg, end, k, def, [](auto &l, auto &r){ return 
> r+l;});
> }
> 
> template<typename T, typename K, class BinaryOp>
> std::vector<K>
>      sum_elements(
>              const T& beg,
>              const T& end,
>              std::size_t k,
>              K def,
>              BinaryOp op)
> {
>      std::vector<K> out;
>      out.reserve((std::distance(beg, end) - 1)/k + 1);
>      for (auto it = beg; it!=end; std::advance(it, 
> std::min(static_cast<std::size_t>(std::distance(it, end)), k)))
>      {
>          out.push_back(std::accumulate(it, std::next(it, 
> std::min(static_cast<std::size_t>(std::distance(it, end)), k)), def, op));
>      }
>      return out;
> }
> 
> int main()
> {
>      std::vector<int> vec;
>      auto size = 1000000;
>      vec.reserve(size);
>      for (int i=0; i < size; ++i)
>          vec.push_back(i);
>      auto beg = std::chrono::system_clock::now();
>      auto sum = 0;
>      for (int i=0; i < 100; i++)
>          sum += sum_elements(vec.begin(), vec.end(), 2, 0).size();
>      auto end = std::chrono::system_clock::now();
>      std::cout << 
> std::chrono::duration_cast<std::chrono::microseconds>(end-beg).count() 
> << std::endl;
>      std::cout << sum << std::endl;
> 
>      return sum;
> }
> 
> 
> D code:
> 
> import std.stdio : writeln;
> import std.algorithm.comparison: min;
> import std.algorithm.iteration: sum;
> import core.time: MonoTime, Duration;
> 
> 
> T[] sum_subranges(T)(T[] input, uint range)
> {
>      T[] result;
>      if (range == 0)
>      {
>          return result;
>      }
>      for (uint i; i < input.length; i=min(i+range, input.length))
>      {
>          result ~= sum(input[i..min(i+range, input.length)]);
>      }
>      return result;
> }
> 
> unittest
> {
>      assert(sum_subranges([1,1,1], 2) == [2, 1]);
>      assert(sum_subranges([1,1,1,2,3,3], 2) == [2, 3, 6]);
>      assert(sum_subranges([], 2) == []);
>      assert(sum_subranges([1], 2) == [1]);
>      assert(sum_subranges([1], 0) == []);
> }
> 
> 
> int main()
> {
>      int[1000000] v;
>      for (int i=0; i < 1000000; ++i)
>          v[i] = i;
>      int sum;
>      MonoTime beg = MonoTime.currTime;
>      for (int i=0; i < 100; i++)
>          sum += cast(int)sum_subranges(v,2).length;
>      MonoTime end = MonoTime.currTime;
>      writeln(end-beg);
>      writeln(sum);
>      return sum;
> }


Dmd compiles quickly, but doesn't create all that optimized code.
Try ldc or gdc and get back to us about it ;)



More information about the Digitalmars-d-learn mailing list