A benchmark, mostly GC
dsimcha
dsimcha at yahoo.com
Sun Dec 11 07:23:25 PST 2011
On 12/11/2011 8:48 AM, bearophile wrote:
> This program used here as a benchmark is a bit reduced from a rosettacode task, it finds how many ways are there to make change for 100_000 euro (the argument 'amount' represents cents of euro) using the common coins.
>
> The result is:
> 992198221207406412424859964272600001
>
> The timings, best of 3, seconds:
> DMD: 22.5
> Python: 9.3
> Java: 2.9
>
> DMD 2.057beta, -O -release -inline
> Java SE 1.7.0_01-b08 (used without -server)
> Python 2.6.6
> 32 bit Windows system.
>
> The D version is about 7.7 times slower than the Java client version. I have seen that most running time in the D code is spent in this line that performs the BigInt sum:
>
> table[pos] += table[pos - 1];
>
> With some tests I think most of the run time of the D version is used by the GC, so this is mostly a GC benchmark (so here the sum routines of D BigInts are probably good enough). I think in this benchmark the low D GC performance is not caused by the not precise nature of the D GC (because during the running the amount of memory used is constantly 7.7 MB), but mostly by the amount of garbage produced each second. So maybe it's the Young Generation of the Java GC that is helping a lot here. But even the reference count GC of Python is more efficient here.
>
> I have not timed the GC performance with the recent experiments done by dsimcha on the D GC discussed here, a timing value with those changes is welcome:
> https://github.com/dsimcha/druntime/wiki/GC-Optimizations-Round-2
My optimizations make very little difference on this benchmark, but for
good reason: It's not a very good GC benchmark. I ran it with my GC
profiling code enabled and it only spends around 10% of its execution
time in GC. We need to figure out why else this benchmark may be so slow.
Furthermore, because this benchmark uses so little GC time, my
improvements are mostly buried in noise. Here are some sample runs on
2.057 beta with and without my GC improvements. However, I make no
claim that these results are statistically significant, as when I ran
the benchmark a few times the variance was quite high and I'm too lazy
to run it a zillion times and get a mean and variance for each stage, etc.
Without my improvements:
992198221207406412424859964272600001
Total GC prep time: 62 milliseconds
Total mark time: 456 milliseconds
Total sweep time: 1357 milliseconds
Total page recovery time: 439 milliseconds
Grand total GC time: 2314 milliseconds
Process returned 0 (0x0) execution time : 20.776 s
With my improvements:
992198221207406412424859964272600001
Total GC prep time: 16 milliseconds
Total mark time: 393 milliseconds
Total sweep time: 1049 milliseconds
Total page recovery time: 297 milliseconds
Grand total GC time: 1755 milliseconds
Process returned 0 (0x0) execution time : 19.287 s
More information about the Digitalmars-d
mailing list