Small part of a program : d and c versions performances diff.
bearophile via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Wed Jul 9 05:25:39 PDT 2014
Larry:
> Now the performance :
> D : 12 µs
> C : < 1µs
>
> Where does the diff comes from ? Is there a way to optimize the
> d version ?
>
> Again, I am absolutely new to D and those are my very first
> line of code with it.
Your C code is not equivalent to the D code: there are small
differences, and even the output is different. So I've cleaned up
your C and D code:
------------------------
// C code.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>
#include "jol.h"
int main() {
    struct timeval s, e;
    gettimeofday(&s, NULL);
    int pol = 5;
    tes(&pol);
    int arr[] = {9, 16, 458, 2, 68, 5452, 98, 32, 4, 565, 78,
                 985, 3215};
    int len = 13 - 1;
    int g = 0;
    for (int x = 36; x >= 0; --x) {
        for (int y = len; y >= 0; --y) {
            ++g;
            arr[y]++;
        }
    }
    gettimeofday(&e, NULL);
    // Elapsed microseconds. tv_sec is included so that a change of
    // second between the two calls does not corrupt the result.
    long us = (long)(e.tv_sec - s.tv_sec) * 1000000L
              + (long)(e.tv_usec - s.tv_usec);
    printf("C: %d %ld %d %d %d\n",
           g, us, arr[4], arr[9], pol);
    return 0;
}
------------------------
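The C listing includes "jol.h" and calls tes(), but neither jol.h nor jol.c is shown in this message. A minimal sketch of what they might contain, assuming they simply mirror the D module jol below (the file contents here are an assumption, not the original files):

```c
/* jol.h (assumed contents): declaration only. */
#ifndef JOL_H
#define JOL_H
void tes(int *a);
#endif

/* jol.c (assumed contents): C counterpart of the D function
   tes(ref int a), which sets its argument to 9. */
void tes(int *a) {
    *a = 9;
}
```

With this definition, pol is 9 after tes(&pol), which matches the last field of the printed output.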
D code ("final" functions don't have much meaning, but the D
compiler is very sloppy and doesn't complain):
module jol;
void tes(ref int a) {
    a = 9;
}
---------
module maind;
void main() {
    import std.stdio;
    import std.datetime;
    import jol;
    StopWatch sw;
    sw.start;
    int pol = 5;
    tes(pol);
    int[] arr = [9, 16, 458, 2, 68, 5452, 98, 32, 4, 565, 78,
                 985, 3215];
    int len = 13 - 1;
    int g = 0;
    for (int x = 36; x >= 0; --x) {
        // Some code here erased for the test.
        for (int y = len; y >= 0; --y) {
            // Some other code here.
            ++g;
            arr[y]++;
        }
    }
    sw.stop;
    writefln("D: %d %d %d %d %d",
             g, sw.peek.nsecs, arr[4], arr[9], pol);
}
----------------
That D code is not fully idiomatic. This is closer to idiomatic D
code:
module jol2;
void test(ref int x) pure nothrow @safe {
    x = 9;
}
---------
module maind;
void main() {
    import std.stdio, std.datetime;
    import jol2;
    StopWatch sw;
    sw.start;
    int pol = 5;
    test(pol);
    int[13] arr = [9, 16, 458, 2, 68, 5452, 98, 32, 4, 565, 78,
                   985, 3215];
    uint count = 0;
    foreach_reverse (immutable _; 0 .. 37) {
        foreach_reverse (ref ai; arr) {
            count++;
            ai++;
        }
    }
    sw.stop;
    writefln("D: %d %d %d %d %d",
             count, sw.peek.nsecs, arr[4], arr[9], pol);
}
----------------
In my benchmarks I haven't used the more idiomatic D code; I
have used the C-like code. But the run-time is essentially the
same.
I compile the C and D code with (on a 32 bit Windows):
gcc -march=native -std=c11 -O2 main.c jol.c -o main
ldmd2 -wi -O -release -inline -noboundscheck maind.d jol.d
strip maind.exe
For the D code I've used the latest ldc2 compiler (V. 0.13.0,
based on DMD v2.064 and LLVM 3.4.2), GCC is V.4.8.0
(rubenvb-4.8.0).
----------------
The C code gives as output:
C: 481 0 105 602 9
The D code gives as output:
D: 481 6076 105 602 9
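Apart from the timing field, both lines can be checked by hand: the outer loop runs 37 times (x from 36 down to 0), the inner loop 13 times (y from 12 down to 0), and every array element is incremented once per outer pass. A small sketch of that arithmetic (the function names are made up for illustration):

```c
enum { OUTER = 37, LEN = 13 };  /* x: 36..0, y: 12..0 */

/* Total number of inner-loop iterations: the first printed field. */
int total_iterations(void) {
    return OUTER * LEN;
}

/* Each element is incremented once per outer iteration. */
int final_value(int initial) {
    return initial + OUTER;
}
```

total_iterations() is 481, final_value(68) is 105 for arr[4], and final_value(565) is 602 for arr[9], so the two programs agree on everything except the measured time.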
----------------------
If I slow down the CPU to half speed, the C code runs in about
0.05 seconds and the D code runs in about 0.07 seconds.
Such run times are far too small for a sufficiently
meaningful comparison. You need a run-time of about 2 seconds to
get meaningful timings.
The difference between 0.05 and 0.07 is caused by initializing
the D runtime (like the D GC). It takes about 0.015 seconds on my
systems with the CPU at full speed to initialize the D runtime,
and it's a constant time.
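One way to reach a run-time of about 2 seconds is to repeat the same loops many times and divide the elapsed time by the number of repetitions. The following is only a rough sketch under stated assumptions: the names kernel, time_kernel and report_timing, the doubling strategy, and the use of POSIX clock_gettime are illustrative choices, not part of the original test. The array is unsigned here so that the many repeated increments wrap around instead of overflowing a signed int, and an optimizing compiler may still simplify the inner loops.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime */
#endif
#include <stdio.h>
#include <time.h>

enum { LEN = 13, OUTER = 37 };

/* Same work as the loops in the listings: OUTER passes over the array.
   Returns the number of inner iterations performed. */
unsigned kernel(unsigned *arr) {
    unsigned g = 0;
    for (int x = OUTER - 1; x >= 0; --x) {
        for (int y = LEN - 1; y >= 0; --y) {
            ++g;
            arr[y]++;
        }
    }
    return g;
}

/* Runs the kernel `reps` times. Returns the elapsed wall-clock seconds
   and stores the total number of inner iterations in *total. */
double time_kernel(unsigned long long reps, unsigned long long *total) {
    unsigned arr[LEN] = {9, 16, 458, 2, 68, 5452, 98, 32, 4, 565, 78,
                         985, 3215};
    volatile unsigned sink = 0;  /* forces the array to be updated each pass */
    struct timespec s, e;
    *total = 0;
    clock_gettime(CLOCK_MONOTONIC, &s);
    for (unsigned long long r = 0; r < reps; ++r) {
        *total += kernel(arr);
        sink = arr[r % LEN];
    }
    clock_gettime(CLOCK_MONOTONIC, &e);
    (void)sink;
    return (double)(e.tv_sec - s.tv_sec) + (e.tv_nsec - s.tv_nsec) / 1e9;
}

/* Doubles the repetition count until one measurement lasts >= 2 seconds,
   then prints the average time per kernel run. */
void report_timing(void) {
    unsigned long long reps = 1000, total = 0;
    double secs = time_kernel(reps, &total);
    while (secs < 2.0) {
        reps *= 2;
        secs = time_kernel(reps, &total);
    }
    printf("%llu reps in %.3f s, %.1f ns per kernel run\n",
           reps, secs, secs / (double)reps * 1e9);
}
```

Calling report_timing() from main prints an average per-run time. Because the clock is read inside the process, start-up costs such as runtime initialization are not part of the measured interval.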
Bye,
bearophile