Performance problem in reverse algorithm

Fool via digitalmars-d-ldc digitalmars-d-ldc at puremagic.com
Sun Aug 24 09:45:10 PDT 2014


Below is a snippet of D code, followed by a close C++
translation.

clang++ seems to have trouble optimizing the C++ version unless
my_reverse is marked "noinline". With ldc, I was not able to get
a fast executable either with or without the annotation. g++ and
gdc both produce fast code; no annotation is required there.

The same effect is visible in the D version if one replaces 
my_reverse with reverse from std.algorithm. Can someone figure 
out what the problem is?

clang++ /   inline: 5.7s
     ldc /   inline: 5.7s
clang++ / noinline: 1.4s
     ldc / noinline: 5.8s
                g++: 1.2s
                gdc: 1.2s
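
One way to reproduce such timings without counting process start-up is to time the two loops in-process. The following is only a minimal sketch under stated assumptions: it uses std::chrono::steady_clock, my_reverse is copied from the reverse.cpp listing in this post, and time_reversals is a hypothetical helper name that does not appear in the original files.

```cpp
#include <cassert>
#include <chrono>
#include <utility>
#include <vector>

// Same algorithm as my_reverse in reverse.cpp.
void my_reverse(int* b, int* e)
{
    auto steps = (e - b) / 2;
    if (steps) {
        auto l = b;
        auto r = e - 1;
        do {
            std::swap(*l, *r);
            ++l;
            --r;
        } while (--steps);
    }
}

// Hypothetical helper (not in the original post): runs the benchmark
// loops over every prefix length of `a`, K + 1 times each, and returns
// the elapsed wall-clock time in seconds.
double time_reversals(std::vector<int>& a, int K)
{
    assert(K >= 0);
    const int N = static_cast<int>(a.size());
    const auto start = std::chrono::steady_clock::now();
    for (int n = 0; n <= N; ++n) {
        for (int k = 0; k <= K; ++k) {
            my_reverse(a.data(), a.data() + n);
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_reversals on a vector of 2000 ints with K = 10000 corresponds to the loops in the listings; printing an element of the vector afterwards keeps the result observable.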

Test environment: Arch Linux x86_64

clang version 3.4.2
LDC - the LLVM D compiler (0.14.0)
g++ (GCC) 4.9.1
gdc (GCC) 4.9.1

Compiler invocations (for the clang++ "inline" measurement, omit
-DNOINLINE):

clang++ -std=c++11 -march=native -O3 -DNOINLINE
ldc2 -O3 -mcpu=native -release -disable-boundscheck
g++ -std=c++11 -march=native -O3
gdc -O3 -march=native -frelease -fno-bounds-check

Moreover, if I do not specify '-disable-boundscheck', the ldc2
build fails at link time with the following errors:

/usr/lib/liblphobos2.a(curl.o): In function `_D3std3net4curl4HTTP18_sharedStaticCtor1FZv':
(.text._D3std3net4curl4HTTP18_sharedStaticCtor1FZv+0x10): undefined reference to `curl_version_info'
/usr/lib/liblphobos2.a(curl.o): In function `_D3std3net4curl4Curl18_sharedStaticDtor3FZv':
(.text._D3std3net4curl4Curl18_sharedStaticDtor3FZv+0x1): undefined reference to `curl_global_cleanup'
/usr/lib/liblphobos2.a(curl.o): In function `_D3std3net4curl13__shared_ctorZ':
(.text._D3std3net4curl13__shared_ctorZ+0x10): undefined reference to `curl_global_init'
collect2: error: ld returned 1 exit status
Error: /usr/bin/gcc failed with status: 1


////
// reverse.d

import std.algorithm, std.range;
import ldc.attribute;

// LDC-specific attribute; keeps my_reverse from being inlined.
@attribute("noinline")
void my_reverse(int* b, int* e)
{
    auto steps = (e - b) / 2;
    if (steps) {
       auto l = b;
       auto r = e - 1;
       do {
          swap(*l, *r);
          ++l;
          --r;
       } while (--steps);
    }
}

void main(string[] args)
{
    immutable N = 2000;
    immutable K = 10000;
    auto a = iota(N).array;
    for (auto n = 0; n <= N; ++n) {
       for (auto k = 0; k <= K; ++k) {
          my_reverse(&a[0], &a[0] + n);
       }
    }
}


////
// reverse.cpp

#include <numeric>
#include <vector>

// Build with -DNOINLINE to keep my_reverse from being inlined.
#ifdef NOINLINE
__inline__ __attribute__((noinline))
#endif
void my_reverse(int* b, int* e)
{
    auto steps = (e - b) / 2;
    if (steps) {
       auto l = b;
       auto r = e - 1;
       do {
          std::swap(*l, *r);
          ++l;
          --r;
       } while (--steps);
    }
}

int main()
{
    const auto N = 2000;
    const auto K = 10000;
    std::vector<int> a(N);
    auto b = std::begin(a);
    auto e = std::end(a);
    std::iota(b, e, 0);
    for (auto n = 0; n <= N; ++n) {
       for (auto k = 0; k <= K; ++k) {
          my_reverse(&a[0], &a[0] + n);
       }
    }
}
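
Before comparing timings, it can be worth confirming that the hand-written loop really does the same work as the library routine. The following is a minimal sketch, not part of the original files: it compares my_reverse (copied from reverse.cpp above) with std::reverse on one prefix of the sequence 0, 1, ..., size-1. The name agrees_with_std_reverse is a hypothetical helper introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Same algorithm as my_reverse in reverse.cpp.
void my_reverse(int* b, int* e)
{
    auto steps = (e - b) / 2;
    if (steps) {
        auto l = b;
        auto r = e - 1;
        do {
            std::swap(*l, *r);
            ++l;
            --r;
        } while (--steps);
    }
}

// Hypothetical helper (not in the original post): returns true if
// my_reverse and std::reverse give the same result when applied to
// the first n elements of 0, 1, ..., size-1.
bool agrees_with_std_reverse(int size, int n)
{
    assert(0 <= n && n <= size);
    std::vector<int> mine(size);
    std::iota(mine.begin(), mine.end(), 0);
    std::vector<int> expected = mine;

    my_reverse(mine.data(), mine.data() + n);
    std::reverse(expected.begin(), expected.begin() + n);
    return mine == expected;
}
```

Looping n from 0 to size inclusive covers the same prefix lengths that the benchmark loops use, including the empty and the full range.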

