A benchmark, mostly GC

dsimcha dsimcha at yahoo.com
Sun Dec 11 07:23:25 PST 2011


On 12/11/2011 8:48 AM, bearophile wrote:
> This program used here as a benchmark is a bit reduced from a rosettacode task, it finds how many ways are there to make change for 100_000 euro (the argument 'amount' represents cents of euro) using the common coins.
>
> The result is:
> 992198221207406412424859964272600001
>
> The timings, best of 3, seconds:
>    DMD:    22.5
>    Python:  9.3
>    Java:    2.9
>
> DMD 2.057beta, -O -release -inline
> Java SE 1.7.0_01-b08 (used without -server)
> Python 2.6.6
> 32 bit Windows system.
>
> The D version is about 7.7 times slower than the Java client version. I have seen that most running time in the D code is spent in this line that performs the BigInt sum:
>
> table[pos] += table[pos - 1];
>
> With some tests I think most of the run time of the D version is used by the GC, so this is mostly a GC benchmark (so here the sum routines of D BigInts are probably good enough). I think in this benchmark the low D GC performance is not caused by the not precise nature of the D GC (because during the running the amount of memory used is constantly 7.7 MB), but mostly by the amount of garbage produced each second. So maybe it's the Young Generation of the Java GC that is helping a lot here. But even the reference count GC of Python is more efficient here.
>
> I have not timed the GC performance with the recent experiments done by dsimcha on the D GC discussed here, a timing value with those changes is welcome:
> https://github.com/dsimcha/druntime/wiki/GC-Optimizations-Round-2

My optimizations make very little difference on this benchmark, but for 
good reason:  It's not a very good GC benchmark.  I ran it with my GC 
profiling code enabled and it only spends around 10% of its execution 
time in GC.  We need to figure out why else this benchmark may be so slow.

Furthermore, because this benchmark uses so little GC time, my 
improvements are mostly buried in noise.  Here are some sample runs on 
2.057 beta with and without my GC improvements.  However, I make no 
claim that these results are statistically significant, as when I ran 
the benchmark a few times the variance was quite high and I'm too lazy 
to run it a zillion times and get a mean and variance for each stage, etc.

Without my improvements:
992198221207406412424859964272600001
         Total GC prep time:  62 milliseconds
         Total mark time:  456 milliseconds
         Total sweep time:  1357 milliseconds
         Total page recovery time:  439 milliseconds
         Grand total GC time:  2314 milliseconds

Process returned 0 (0x0)   execution time : 20.776 s

With my improvements:
992198221207406412424859964272600001
         Total GC prep time:  16 milliseconds
         Total mark time:  393 milliseconds
         Total sweep time:  1049 milliseconds
         Total page recovery time:  297 milliseconds
         Grand total GC time:  1755 milliseconds

Process returned 0 (0x0)   execution time : 19.287 s


More information about the Digitalmars-d mailing list