Request for timings

Jerry jlquinn at optonline.net
Fri Nov 25 22:14:54 PST 2011


bearophile <bearophileHUGS at lycos.com> writes:

> This is the nbody benchmark of the Shootout site:
> http://shootout.alioth.debian.org/u32/performance.php?test=nbody
>
> The fastest version is the Fortran one, probably thanks to vector operations that allow better SIMD vectorization.
>
> This is the C++ version:
> http://shootout.alioth.debian.org/u32/program.php?test=nbody&lang=gpp&id=1
>
> C++ version compiled with:
> g++ -Ofast -fomit-frame-pointer -march=native -mfpmath=sse -msse3 --std=c++0x
> An input parameter is n = 3_000_000 (but use a larger number if the run
> time is too short, like 10 million or more).

All timings done with gdc 0.30 using dmd 2.055 and gcc 4.6.2.  I built
with both D and C++ enabled so the back end would be the same.

jlquinn at wyvern:~/d/tests$ ~/gcc/gdc/bin/g++ -O3 -fomit-frame-pointer -march=native -lm -mfpmath=sse -msse3 --std=c++0x nbody.cc -o nbody_c++
jlquinn at wyvern:~/d/tests$ time ./nbody_c++ 50000000
-0.169075164
-0.169059907

real	0m10.209s
user	0m10.180s
sys	0m0.010s


> First D2 version (serial):
> http://codepad.org/AdRSm2wP

~/gcc/gdc/bin/gdc -O3 -fomit-frame-pointer -march=native -mfpmath=sse -msse3 -frelease nbody.d -o nbody_d
jlquinn at wyvern:~/d/tests$ time ./nbody_d 50000000
-0.169075164
-0.169059907

real	0m9.830s
user	0m9.820s
sys	0m0.000s

jlquinn at wyvern:~/d/tests$ dmd -O -release nbody.d
jlquinn at wyvern:~/d/tests$ time ./nbody_d 50000000
-0.169075164
-0.169059907

real	0m9.828s
user	0m9.830s
sys	0m0.000s

> Second D2 version, three times slower because of vector ops, but more similar to the Fortran version:
> http://codepad.org/7O3mz9en

~/gcc/gdc/bin/gdc -O3 -fomit-frame-pointer -march=native -mfpmath=sse -msse3 -frelease nbody2.d -o nbody2_d
jlquinn at wyvern:~/d/tests$ time ./nbody2_d 50000000
-0.169075164
-0.169059907

real	0m26.805s
user	0m26.760s
sys	0m0.020s

jlquinn at wyvern:~/d/tests$ dmd -O -release nbody2.d
jlquinn at wyvern:~/d/tests$ time ./nbody2_d 50000000
-0.169075164
-0.169059907

real	0m26.777s
user	0m26.760s
sys	0m0.000s


> Is someone willing to take two (or more) timings using the LDC 2 compiler (I have only LDC1, besides DMD)? I'd like to know how much time it takes to run the first D version *compared* to the C++ version :-) If you time the second D2 version too, even better.
>
> Bye and thank you,
> bearophile

So the upshot seems to be that DMD and GDC generate similar code for
this test.  Both D compilers also generate slightly faster code than the
C++ version (roughly 9.8s against 10.2s), so either the D front end is
doing a slightly better optimization job, or your first version is
slightly more efficient code.

Jerry


More information about the Digitalmars-d-learn mailing list