More embarrassing microbenchmarks for D's GC.

bearophile bearophileHUGS at lycos.com
Thu Sep 16 04:27:57 PDT 2010


Rounin:

> Thank you for that advice. I'm using GDC because it's available from Ubuntu
> Linux's package system, whereas DMD currently is not. (And the .deb posted on
> digitalmars.com is only for i386.) Hopefully, D will gain more popularity and more
> up-to-date tools will be made available.

On a 32-bit system I find that LDC produces efficient programs, so try it when possible (the GC is similar; in D2 it's essentially the same).


> By the way, today I re-compiled the program with a "std.gc.enable;" right before
> the final "return 0" statement, and it still runs in 0.68 seconds.

You could try disabling/enabling the GC in the Python code too (on 32-bit systems there's also the very good Psyco).

Your benchmark is not portable, so I can't help you find where the performance problem is. When you perform a benchmark it's better to give all source code and all data too, to allow others to reproduce the results and look for performance problems.

Keep in mind that D associative arrays are usually slower than Python dicts. You probably build data structures like associative arrays, and this slows down the GC. If you disable and re-enable the GC around that build phase, the program will probably be fast (so I suggest you narrow the disable/enable span as much as possible, so you can see where the GC problem is). If you put an exit(0) at the end of the program (to skip the final collection) the D program may save more time.
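
A minimal sketch of that idea, assuming D2 with druntime's core.memory (on D1/Phobos the std.gc module quoted above plays the same role). The function buildTable and its contents are hypothetical stand-ins for the real build phase of the benchmark, which was not posted:

```d
import core.memory : GC;
import core.stdc.stdlib : exit;
import std.conv : to;
import std.stdio : writeln;

// Hypothetical build phase: fills an associative array with many small
// entries, the kind of workload that triggers repeated collections.
int[string] buildTable(size_t n)
{
    GC.disable();              // suppress automatic collections (the runtime
                               // may still collect if it runs out of memory)
    scope (exit) GC.enable();  // re-enabled as soon as the function returns

    int[string] table;
    foreach (i; 0 .. n)
        table["key" ~ to!string(i)] = cast(int) i;
    return table;
}

void main()
{
    auto table = buildTable(100_000);
    writeln(table.length, " entries built");

    // Leaves the process without running the D runtime shutdown, so the
    // final collection is skipped. C stdio buffers are still flushed.
    exit(0);
}
```

Timing the program with and without the GC.disable() line, and moving that pair around smaller pieces of code, is one way to see which part of the program the collector is spending its time on.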

In Python 2.7 they have added a GC optimization that may be usable in D too:
http://bugs.python.org/issue4074
>The garbage collector now performs better for one common usage pattern: when many objects are being allocated without deallocating any of them. This would previously take quadratic time for garbage collection, but now the number of full garbage collections is reduced as the number of objects on the heap grows. The new logic only performs a full garbage collection pass when the middle generation has been collected 10 times and when the number of survivor objects from the middle generation exceeds 10% of the number of objects in the oldest generation. (Suggested by Martin von Löwis and implemented by Antoine Pitrou; issue 4074.)<
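
The rule described in that note can be written as a small predicate. This is only a sketch of the wording quoted above, not CPython's source and not anything present in D's GC. The function name and parameters are made up for illustration:

```d
// Hypothetical helper: decides whether a full collection should run,
// following the two conditions in the quoted Python 2.7 note.
bool wantsFullCollection(size_t middleCollections,
                         size_t survivors,
                         size_t oldestCount)
{
    // 1) the middle generation has been collected 10 times, and
    // 2) the survivors from it exceed 10% of the oldest generation.
    return middleCollections >= 10 && survivors > oldestCount / 10;
}

void main()
{
    // Too few middle-generation collections so far: no full pass.
    assert(!wantsFullCollection(3, 5_000, 10_000));
    // Enough collections, but survivors are only 5% of the old heap.
    assert(!wantsFullCollection(10, 500, 10_000));
    // Both conditions hold: a full pass is worth doing.
    assert(wantsFullCollection(10, 1_500, 10_000));
}
```

The effect is that full passes become rarer as the old generation grows, which is what removes the quadratic behaviour for allocate-only workloads.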


>while D's splitlines() wasn't quite working.<
>int space = std.regexp.find(line, r"\s");<

What do you mean? If there's a bug in splitlines() or split() it's better to add it to Bugzilla, possibly with the string to split inlined (no external file to read). splitlines() and split() are simple functions of a module, written in D, so if there's a problem it's usually not too hard to fix; they are not built-in methods written in C as in CPython.
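
A self-contained test case for such a report might look like the sketch below. It assumes a recent D2 Phobos, where the function is spelled splitLines (older releases, like the one discussed here, spell it splitlines). The input string is invented, so replace it with the line that actually misbehaves:

```d
import std.array : split;
import std.string : splitLines;

void main()
{
    // Inlined input, so nobody needs an external file to reproduce it.
    string text = "a b\nc d\r\ne f";

    auto lines = splitLines(text);
    assert(lines == ["a b", "c d", "e f"]);

    // split() with no separator splits on whitespace.
    assert(lines[0].split() == ["a", "b"]);
}
```

If one of the asserts fails on your compiler, the program itself is the bug report.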


if(!(path in oldpaths) && !(checksum in oldsums))
In D2 this may be written (unfortunately there is no "and" keyword, Walter doesn't get its usefulness yet):
if (path !in oldpaths && checksum !in oldsums)
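
A quick check that the two spellings agree, assuming D2. The bool[string] tables and the function names are hypothetical, since the real types of oldpaths and oldsums were not shown:

```d
// D1-style spelling: negate the pointer returned by "in".
bool isNewVerbose(string path, string checksum,
                  bool[string] oldpaths, bool[string] oldsums)
{
    return !(path in oldpaths) && !(checksum in oldsums);
}

// D2 spelling with the !in operator.
bool isNewCompact(string path, string checksum,
                  bool[string] oldpaths, bool[string] oldsums)
{
    return path !in oldpaths && checksum !in oldsums;
}

void main()
{
    bool[string] oldpaths = ["/tmp/a": true];
    bool[string] oldsums = ["abc123": true];

    // Neither key is known: both forms say "new".
    assert(isNewVerbose("/tmp/b", "ffff", oldpaths, oldsums));
    assert(isNewCompact("/tmp/b", "ffff", oldpaths, oldsums));

    // The path is already known: both forms say "not new".
    assert(!isNewVerbose("/tmp/a", "ffff", oldpaths, oldsums));
    assert(!isNewCompact("/tmp/a", "ffff", oldpaths, oldsums));
}
```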

Bye,
bearophile

