A significant performance difference

Mon Sep 1 00:21:03 PDT 2014

V Sun, 31 Aug 2014 10:55:31 +0000
bearophile via Digitalmars-d-learn <digitalmars-d-learn at puremagic.com>
napsáno:

> This is C++ code that solves one Euler problem:
> 
> --------------
> #include <stdio.h>
> #include <map>
> 
> const unsigned int H = 9, W = 12;
> 
> const int g[6][3] = {{7, 0, H - 3},
>                       {1 + (1 << H) + (1 << (2 * H)), 0, H - 1},
>                       {3 + (1 << H), 0, H - 2},
>                       {3 + (2 << H), 0, H - 2},
>                       {1 + (1 << H) + (2 << H), 0, H - 2},
>                       {1 + (1 << H) + (1 << (H - 1)), 1, H - 1}};
> 
> int main() {
>      unsigned long long p, i, k;
>      unsigned int j, l;
>      std::map<unsigned int, unsigned long long> x, y;
>      x[0] = 1;
> 
>      for (i = 0; i < W; ++i) {
>          y.clear();
>          while (!x.empty()) {
>              j = x.begin()->first;
>              p = x.begin()->second;
>              x.erase(x.begin());
> 
>              for (k = 0; k < H; ++k)
>                  if ((j & (1 << k)) == 0)
>                      break;
> 
>              if (k == H)
>                  y[j >> H] += p;
>              else
>                  for (l = 0; l < 6; ++l)
>                      if (k >= g[l][1] && k <= g[l][2])
>                          if ((j & (g[l][0] << k)) == 0)
>                              x[j + (g[l][0] << k)] += p;
>          }
>          x = y;
>      }
> 
>      printf("%lld\n", y[0]);
>      return 0;
> }
> --------------
> 
> 
> I have translated it to D like this (I know in D there are nicer 
> ways to write it, but I have tried to keep the look of the code 
> as much similar as possible to the C++ code):
> 
> 
> --------------
> import core.stdc.stdio;
> 
> const uint H = 9, W = 12;
> 
> const uint[3][6] g = [[7, 0, H - 3],
>                        [1 + (1 << H) + (1 << (2 * H)), 0, H - 1],
>                        [3 + (1 << H), 0, H - 2],
>                        [3 + (2 << H), 0, H - 2],
>                        [1 + (1 << H) + (2 << H), 0, H - 2],
>                        [1 + (1 << H) + (1 << (H - 1)), 1, H - 1]];
> 
> int main() {
>      ulong p, i, k;
>      uint j, l;
>      ulong[uint] x, y;
>      x[0] = 1;
> 
>      for (i = 0; i < W; ++i) {
>          y = null;
>          while (x.length) {
>              j = x.byKey.front;
>              p = x.byValue.front;
>              x.remove(cast(int)j);
> 
>              for (k = 0; k < H; ++k)
>                  if ((j & (1 << k)) == 0)
>                      break;
> 
>              if (k == H)
>                  y[j >> H] += p;
>              else
>                  for (l = 0; l < 6; ++l)
>                      if (k >= g[l][1] && k <= g[l][2])
>                          if ((j & (g[l][0] << k)) == 0)
>                              x[j + (g[l][0] << k)] += p;
>          }
>          x = y;
>      }
> 
>      printf("%lld\n", y[0]);
>      return 0;
> }
> --------------
> 
> 
> The C++ code is much faster than the D code (I see the D code 30+ 
> times slower with dmd and about like 20 times with ldc2). One 
> difference between the C++ and D code is that the C++ map uses a 
> search tree (red-black probably), while the D code uses a hash.
> 
> To test that algorithmic difference, if I replace the map in the 
> C++ code with a std::unordered_map (C++11):
> 
> #include <unordered_map>
> ...
> std::unordered_map<unsigned int, unsigned long long> x, y;
> 
> 
> then the run-time increases (more than two times) but it's still 
> much faster than the D code.
> 
> Is it possible to fix the D code to increase its performance 
> (there are associative array libraries for D, but I have not 
> tried them in this program).
> 
> Bye,
> bearophile

I think main problem is in calling delegates (x.byKey.front and
x.byValue.front;). If I change x.byValue.front with x[j], It makes
program a lot faster