Garbage collector memory leak "feature"?

Sean Kelly sean at f4.ca
Wed Oct 10 07:28:12 PDT 2007


Steven Schveighoffer wrote:
> Something has come to light in a few posts that has described memory leak 
> problems because the garbage collector finds ghost pointers in hash data. 
> My understanding of the problem is that the garbage collector marks data 
> allocations as having pointers or not having pointers, and then searches 
> through all the data to see if there are any valid pointers, finding data 
> that looks like a pointer, but is not. (BTW, this seems inefficient to me, 
> but I have no idea how to write a garbage collector)
> 
> The most disturbing thing I've seen in these posts is the assumption that 
> this is just the way the garbage collector is, and shame on me for writing 
> code that fools it.
> 
> So I have some questions, coming from someone who knows nothing about 
> writing garbage collectors, but loves the usage of them.
> 
> 1. Is this just the way all garbage collectors are?  Is there not a way to 
> solve this problem?

Not all garbage collectors are like this.  Java, for example, uses an 
exact GC.  The accuracy of a GC is really a combination of the GC design 
and of the reflection facilities provided by the language.  D's 
reflection facilities are somewhat limited and will likely never be 
perfect because it is a systems programming language and does not run in 
a VM.
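
To make the "ghost pointer" problem concrete, here is a minimal sketch in C 
of the test a conservative collector applies.  It is not the code of the 
Phobos or Tango GC; the names HeapBlock, looks_like_pointer and 
conservatively_reachable are made up for illustration.  The point is that 
any word whose value happens to fall inside an allocated block keeps that 
block alive, whether or not the word is really a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one GC-managed allocation. */
typedef struct {
    uintptr_t start;  /* address of the first byte */
    size_t    size;   /* length in bytes */
} HeapBlock;

/* Conservative test: a word "is a pointer" if its value lands in the block. */
static int looks_like_pointer(uintptr_t word, const HeapBlock *b)
{
    assert(b != NULL);
    return word >= b->start && word - b->start < b->size;
}

/* Scan n words of arbitrary data; return 1 if any of them would pin block b. */
static int conservatively_reachable(const uintptr_t *words, size_t n,
                                    const HeapBlock *b)
{
    for (size_t i = 0; i < n; i++) {
        if (looks_like_pointer(words[i], b))
            return 1;  /* may be a real reference or just a lucky integer */
    }
    return 0;
}
```

With a block at address 0x1000 of size 64, a table of hash values that 
happens to contain 0x1010 makes conservatively_reachable return 1, so the 
block is never collected even though nothing actually refers to it.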

> 2. Does Tango have this problem?  I know the GC's are different, and I use 
> tango, so maybe I could just say, too bad for phobos users and be on my way 
> :)

Tango's GC was more accurate than the Phobos GC when Tango was 
announced, but Phobos has since caught up.  As things stand now, the two 
are roughly equivalent, though there are a few places in the Tango 
runtime which provide a bit more accuracy than Phobos (such as the AA code).

> 3. If it is solvable, is anyone working on this?  If not, they should be. 
> Add this to the list of things that need to be fixed before D has widespread 
> adoption.  Memory leaks == bad bad bad.

One slightly weird or confusing thing about D now is that void[] arrays 
are treated as if they contain pointers.  This is likely a significant 
source of "leaks."  However, treating void[] arrays as if they contain 
pointers is reasonable, given the concept that void represents: the 
element type is unknown, so the GC cannot rule out pointers.

The accuracy of garbage collection in D could be further improved by 
increasing the amount of type information available to the GC.  For 
example, if the GC knew /where/ in a memory block the pointers were, it 
could ignore the other parts.  However, this would also increase the 
memory used by the GC because it would have to store a pointer mask or 
something comparable for each allocated block.  It could also slow down 
collection in well-behaved (i.e. lucky) apps because the collection 
algorithm would be a bit more complicated.
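
As a rough sketch of the mask idea, again in C and again with made-up 
names (Target, in_target, precise_reachable), assume each block carries a 
bitmask in which bit i is set when word i of the block holds a pointer.  
This is only one possible encoding, limited here to 64 words per block, 
and is not how any existing D runtime stores its type information.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address range of the allocation we are looking for. */
typedef struct {
    uintptr_t start;
    size_t    size;
} Target;

static int in_target(uintptr_t word, const Target *t)
{
    return word >= t->start && word - t->start < t->size;
}

/*
 * Scan a block of n words (n <= 64).  Bit i of mask is set when word i is
 * known to be a pointer; words whose bit is clear are skipped entirely.
 */
static int precise_reachable(const uintptr_t *words, size_t n,
                             uint64_t mask, const Target *t)
{
    assert(n <= 64);
    for (size_t i = 0; i < n; i++) {
        if (((mask >> i) & 1u) && in_target(words[i], t))
            return 1;
    }
    return 0;
}
```

If word 0 is a hash value that happens to equal an address inside the 
target and only bit 1 of the mask is set, the scan skips word 0 and the 
target is not pinned.  The cost is the extra mask per block plus the bit 
test per word, which is the memory and speed tradeoff described above.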

There are also other GC designs that could be used, but those would 
mostly affect the speed of garbage collection rather than its accuracy.


Sean


