how to use GC as a leak detector? i.e. get some help info from GC?

dsimcha dsimcha at yahoo.com
Sun May 24 16:17:09 PDT 2009


== Quote from nobody (no at where.com)'s article
> Hi,
> I'm writing a data processing program in D, which deals with large amounts of
> small objects. One of the thing I found is that D's GC is horribly slow in
> such situation. I tried my program with gc enable & disabled (with some manual
> deletes). The GC disabled version (2 min) is ~100 times faster than the GC
> enabled version (4 hours)!
> But of course the GC disabled version still leak memory, it soon exceeds the
> machine memory limit when I try to process more data; while the GC enabled
> version don't have such problem.
> So my plan is to use the GC disabled version with manual deletes. But it was
> very hard to find all the memory leaks. I'm wondering: is there anyway to use
> GC as a leak detector? can the GC enabled version give me some help
> information on which objects get collected, so I can manually delete them in
> my GC disabled version?  Thanks!

I've dealt with a bunch of somewhat similar situations in code I've written.  Here
are some tips that others have not already mentioned and that might be less
drastic than going with fully manual memory management:

One thing you could try is disabling the GC (this really just disables automatic
running of the collector) and running it manually at points that you know make
sense.  For example, you could insert a GC.collect() call at the end of every
iteration of your main loop.
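
As a rough sketch of this pattern, assuming the D2 core.memory module:
processBatch and run are hypothetical names standing in for your own code,
and the batch size of 1_000 is arbitrary.

```d
import core.memory : GC;

// Hypothetical stand-in for one unit of work that allocates on the GC heap.
int[] processBatch(size_t n)
{
    auto result = new int[](n);
    foreach (i, ref x; result)
        x = cast(int) i;
    return result;
}

// Hypothetical main loop: automatic collections are suppressed and a
// collection is requested explicitly once per iteration.
long run(size_t batches)
{
    GC.disable();               // stop the collector from running on its own
    scope (exit) GC.enable();   // restore automatic collection on the way out

    long total = 0;
    foreach (b; 0 .. batches)
    {
        auto data = processBatch(1_000);
        foreach (x; data)
            total += x;
        GC.collect();           // collect at a point we picked ourselves
    }
    return total;
}
```

If one collection per iteration is still too expensive, collecting every N
iterations is the obvious variation; which N works is something to measure.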

Another thing to try is avoiding appending to arrays.  If you know the length in
advance, you can get pretty good speedups by pre-allocating the array instead of
appending using the ~= operator.
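
To make the difference concrete, here is a small sketch of the two styles.
The function names are made up for illustration; both return the same array,
and only the allocation pattern differs.

```d
// Appending: the array may be reallocated and copied as it grows.
int[] squaresAppend(size_t n)
{
    int[] a;
    foreach (i; 0 .. n)
        a ~= cast(int)(i * i);
    return a;
}

// Pre-allocating: one allocation up front, then plain indexed stores.
int[] squaresPrealloc(size_t n)
{
    auto a = new int[](n);
    foreach (i; 0 .. n)
        a[i] = cast(int)(i * i);
    return a;
}
```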

You can safely delete specific objects manually even when the GC is enabled.  For
very large objects with trivial lifetimes, this is probably worth doing.  First of
all, the GC will run less frequently.  Secondly, D's GC is partially conservative,
meaning that occasionally memory will not be freed when it should be, because some
unrelated value happens to look like a pointer into the block.  The probability of
this happening is proportional to the size of the memory block.
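
A minimal sketch of freeing a large, short-lived block by hand while the GC
stays enabled.  It assumes the D2 core.memory API (GC.malloc and GC.free)
rather than the delete statement, which has since been deprecated in D2.
The checksum function is a hypothetical example, not code from the original
program.

```d
import core.memory : GC;

// Hypothetical example: a large scratch buffer that is needed only
// inside this one call.
ulong checksum(size_t n)
{
    // NO_SCAN: the block holds no pointers, so the collector need not scan it.
    auto p = cast(ubyte*) GC.malloc(n, GC.BlkAttr.NO_SCAN);
    scope (exit) GC.free(p);    // release now instead of waiting for a collection

    auto buf = p[0 .. n];
    foreach (i, ref b; buf)
        b = cast(ubyte)(i & 0xFF);

    ulong sum = 0;
    foreach (b; buf)
        sum += b;
    return sum;
}
```

The buffer is freed with the exact pointer that GC.malloc returned, and no
reference to it escapes the function, so nothing can touch the block after
it has been released.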

Lastly, I've been working on a generic second stack/mark-release allocator for D2,
called TempAlloc.  It's useful for when you need to temporarily allocate memory in
a last in, first out order, but you can't use the call stack for whatever reason.
 I've also implemented a few basic data structures (hash tables and hash sets)
that are specifically designed for this allocator.  Right now, it's coevolving
with my dstats statistics lib, but if you want to try it or at least look at it
and give me some feedback, I'd like to eventually get it to the point where it can
be added to Phobos and/or Tango.  See
http://svn.dsource.org/projects/dstats/docs/alloc.html .
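
For readers unfamiliar with the idea, here is a toy sketch of mark-release
allocation in general.  This is not TempAlloc's API: the Region type and all
of its members are hypothetical, and a real implementation would need to deal
with growth, stricter alignment guarantees and thread safety.

```d
// Hypothetical mark-release ("second stack") allocator; not TempAlloc's API.
struct Region
{
    private ubyte[] buffer;
    private size_t top;

    this(size_t capacity) { buffer = new ubyte[](capacity); }

    // Remember the current top so later allocations can be released at once.
    size_t mark() const { return top; }

    // Bump-allocate n bytes; returns null if the region is full.
    ubyte[] alloc(size_t n)
    {
        if (n > buffer.length - top)
            return null;
        auto slice = buffer[top .. top + n];
        // Advance by n rounded up to a multiple of 16, clamped to the buffer end.
        immutable size_t rounded = (n + 15) & ~cast(size_t) 15;
        top = (rounded > buffer.length - top) ? buffer.length : top + rounded;
        return slice;
    }

    // Release everything allocated since the matching mark().
    void release(size_t m)
    {
        assert(m <= top);
        top = m;
    }
}
```

Typical use is to call mark() on entry to a function, allocate temporaries
with alloc(), and call release() with the saved mark on exit, which frees all
of them in last in, first out order without involving the GC.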


