[Issue 21560] md5 poor performance out of the box

d-bugmail at puremagic.com
Thu Jan 21 01:54:14 UTC 2021


https://issues.dlang.org/show_bug.cgi?id=21560

--- Comment #9 from Witold Baryluk <witold.baryluk+d at gmail.com> ---
In-memory tests, on 16384 byte blocks (should fit nicely in caches).

OpenSSL 1.1.1i (gcc-10.2.1 -fPIC -O2 -fstack-protector-strong ... -DOPENSSL_PIC
-DMD5_ASM ...):
833MB/s

standard/optimized (non-asm) C version with clang-11.0.0 -O3 -march=native
-flto -fomit-frame-pointer:
707MB/s

standard/optimized (non-asm) C version with gcc-10.2.1 -O3 -march=native -flto
-fomit-frame-pointer:
594MB/s

hand-optimized x86-64 assembly with gcc-10.2.1 ....:
716MB/s

hand-optimized x86-64 assembly with clang-11.0.0 ....:
717MB/s

md5sum-coreutils-8.32-4 on big files in tmpfs (uses 32KiB buffers, and also
makes syscalls), C + gcc-10:
763MB/s

md5sum-busybox-static-1.30.1-6 on big files in tmpfs (uses 32KiB buffers, and
also makes syscalls), C + gcc-10:
565MB/s

D / phobos:

gdc-10.2.1 -O3 -march=native -frelease -fno-weak  (using shared Phobos, which
uses -fPIC, from Debian testing)
569MB/s

dmd-2.095 -O -inline -release (precompiled Phobos from dlang.org binary
release, statically linked)
120MB/s

ldc2-1.24.0 -O3 -release (precompiled Phobos from Debian testing, dynamically
linked)
677MB/s

dmd-2.095 -O -inline -release -mcpu=avx2 -boundscheck=no + hand-compiled Phobos
built with the same options, statically linked:
544MB/s



All runs used the "performance" CPU frequency governor, with no other load on
the system. Each run lasted 10s or more, and results varied by only a few MB/s
between reruns.
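
The benchmark harness itself is not shown in this comment. As a rough
illustration of the method described (hash one in-memory 16384-byte block over
and over for 10 seconds or more, then report MB/s), here is a minimal sketch in
Python. It is not the code used for the numbers above: the function name and
parameters are hypothetical, MB is assumed to mean 10^6 bytes, and it measures
whatever MD5 backend Python's hashlib was built with, so its output is not
comparable to the D results.

```python
import hashlib
import time

BLOCK_SIZE = 16384  # block size used for the in-memory tests in this comment


def md5_throughput_mb_s(min_seconds=10.0, block_size=BLOCK_SIZE):
    """Hash one in-memory block repeatedly and return throughput in MB/s.

    Hypothetical helper for illustration; not the harness used in the post.
    """
    block = bytes(block_size)  # zero-filled buffer, small enough to stay in cache
    ctx = hashlib.md5()
    hashed = 0
    start = time.perf_counter()
    while True:
        # Batch several updates between clock reads to keep timing overhead low.
        for _ in range(256):
            ctx.update(block)
        hashed += 256 * block_size
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            break
    ctx.digest()  # finalize the hash so the last partial state is processed
    return hashed / elapsed / 1e6  # assumption: 1 MB = 10**6 bytes


if __name__ == "__main__":
    print(f"{md5_throughput_mb_s():.0f} MB/s")
```

Running it several times and comparing the results gives a feel for the
run-to-run variation mentioned above.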



So ldc2 actually does very well, approaching the performance of the pure-C
version compiled with clang.

gdc, despite poor codegen with -fPIC in MD5.transform (it missed a lot of
inlining opportunities for 1–2-instruction functions), is close to the pure-C
version compiled with gcc.

For dmd, it apparently depends on how Phobos is compiled. The precompiled
version distributed on dlang.org, and as built by default, performs horribly
(120MB/s). With Phobos rebuilt using the same optimization flags as the
benchmark, dmd actually doesn't do too badly (544MB/s).
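
To put the comparisons above into numbers, the following small Python sketch
computes the ratios from the throughput figures reported in this comment. The
dictionary keys and the `ratio` helper are labels made up for this
illustration; only the MB/s values come from the measurements.

```python
# Throughput figures in MB/s, copied from the measurements in this comment.
# The keys are informal labels chosen for this illustration.
results = {
    "C, clang": 707,
    "C, gcc": 594,
    "ldc2": 677,
    "gdc": 569,
    "dmd, precompiled Phobos": 120,
    "dmd, rebuilt Phobos": 544,
}


def ratio(a, b):
    """Return throughput of configuration a relative to configuration b."""
    return results[a] / results[b]


if __name__ == "__main__":
    print(f"ldc2 vs C/clang:        {ratio('ldc2', 'C, clang'):.2f}")
    print(f"gdc vs C/gcc:           {ratio('gdc', 'C, gcc'):.2f}")
    print(f"dmd rebuilt vs shipped: "
          f"{ratio('dmd, rebuilt Phobos', 'dmd, precompiled Phobos'):.2f}")
```

On these figures, ldc2 and gdc each reach roughly 96% of the matching C build,
and rebuilding Phobos makes dmd about 4.5 times faster than with the
precompiled library.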
