State of and plans for the garbage collector

Michael Rynn michaelrynn at optusnet.com.au
Sun Jul 18 04:51:08 PDT 2010


On Thu, 15 Jul 2010 00:18:38 -0700, Jonathan M Davis wrote:

> Okay. I really don't know much about garbage collectors, how they work,
> or what makes one particularly good or bad

Neither do I.

But it's a central part of the D language, its design and runtime.
Even though D programmers can replace allocators and do manual memory 
management, I would like to know if anyone has tried to do this in a big 
way.

Without GC, programming with the DPL would be a different beast.

It would seem to me that type information, GC design, and memory 
allocation schemes necessarily have to be integrated to get efficient 
performance.  There must be loads of academic research on this already 
that would show what the trade-offs are.

The first thing I notice is that memory blocks allocated by the GC are 
either marked as needing to be scanned or are not scanned at all.  
As far as I know there is no further refining metadata, for instance a 
list of 32/64-bit offsets which can be excluded or included as pointers 
to data.  Such a list would be of the form

[offset-array length, exclusive/inclusive], offset1, offset2, ..., offsetN.
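
A rough sketch of such a per-type descriptor in D could look like the 
following (PointerOffsets and its fields are just names I made up for 
illustration, not anything in druntime):

struct PointerOffsets
{
    size_t   count;        // number of entries in offsets[]
    bool     isExclusive;  // true: listed offsets are NOT pointers;
                           // false: listed offsets ARE pointers
    size_t[] offsets;      // byte offsets into the allocated block
}

unittest
{
    // A block whose only pointer sits at byte offset 0 could be
    // described inclusively like this:
    PointerOffsets desc;
    desc.count = 1;
    desc.isExclusive = false;
    desc.offsets = [0];
    assert(desc.offsets.length == desc.count);
}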


It's slower to walk through such a list than the current GC's method, 
which, from what I remember of an earlier newsgroup post, just bit-ands 
every value to see if it looks like a memory-aligned address or not.
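
To make the contrast concrete, here is a rough sketch of the two 
approaches (not druntime's actual code; poolLow, poolHigh and 
markAddress are assumed helpers for illustration):

void scanConservative(void* block, size_t size,
                      void* poolLow, void* poolHigh,
                      void delegate(void*) markAddress)
{
    auto words = cast(size_t*) block;
    foreach (i; 0 .. size / size_t.sizeof)
    {
        auto p = cast(void*) words[i];
        // Anything aligned and inside the heap range is treated as a
        // possible pointer, so a random integer can pin memory.
        if ((cast(size_t) p & (size_t.sizeof - 1)) == 0
                && p >= poolLow && p < poolHigh)
            markAddress(p);
    }
}

void scanPrecise(void* block, const size_t[] pointerOffsets,
                 void delegate(void*) markAddress)
{
    // Only the words the type declares as pointers are ever followed.
    foreach (off; pointerOffsets)
        markAddress(*cast(void**)(cast(ubyte*) block + off));
}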

Having the option of a more careful check may resolve the residual cases 
of false pointers.

There is a trade-off between the amount of information available to the 
GC and how effective it can be.  More information also means the cost of 
attaching another pointer to that information for each memory block type, 
with implications for how the memory blocks might be pooled.
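
Purely to illustrate that space cost: if every heap block carried a 
pointer to its scan metadata, each allocation would grow by at least one 
word (BlockHeader and ScanInfo are hypothetical names, not druntime 
types):

struct ScanInfo
{
    immutable(size_t)[] pointerOffsets;
}

struct BlockHeader
{
    immutable(ScanInfo)* scanInfo;  // one extra word per block
    // the allocation's payload would follow this header in memory
}

static assert(BlockHeader.sizeof == (void*).sizeof);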

I read that the .NET garbage collector uses much more complete metadata 
stored in the assembly to know which parts of objects contain addresses.  
I am sure it's a state-of-the-art GC, but I know nothing of its guts.

So the GC depends on whether the D compiler can compile an accessible 
memory descriptor into each TypeInfo and type allocation pool, and 
integrate this into the runtime allocation design and memory block 
scanning.
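
As a thought experiment, D's compile-time introspection can already 
derive such a descriptor for a struct; the helper below is hypothetical, 
not something the compiler or TypeInfo provides today:

import std.traits : isPointer;

// Collect the byte offsets of all pointer-like fields of T, the kind
// of list a compiler could emit alongside each TypeInfo.
size_t[] pointerOffsetsOf(T)()
{
    size_t[] offs;
    foreach (i, FieldType; typeof(T.tupleof))
    {
        static if (isPointer!FieldType || is(FieldType == class))
            offs ~= T.tupleof[i].offsetof;
    }
    return offs;
}

struct Node
{
    int   value;  // never a pointer, never needs scanning
    Node* next;   // the only field the GC needs to follow
}

unittest
{
    assert(pointerOffsetsOf!Node() == [Node.next.offsetof]);
}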

This gives the GC the option of resolving the harder cases: false 
pointers that keep alive objects the conservative scan, which does not 
use type information, fails to release.

Without the type information, anyone experimenting with the GC and 
memory allocation is limited to a reduced set of models and algorithms, 
and so the possibility of choosing something better than the Boehm 
conservative collector (which I think can actually use some form of 
metadata) is restricted.
