Garbage Collection Idea

Daniel Keep daniel.keep.lists at gmail.com
Fri Jun 2 01:17:13 PDT 2006



Craig Black wrote:
> This is coming from someone who only understands garbage collection at an 
> elementary level, so anyone is welcome to correct me if I'm missing 
> something.
> 
> To mark valid pointers, the stack is scanned for root pointers.  The 
> allocation unit that each root pointer references is then scanned for 
> pointers, then the allocation units referenced by those pointers are 
> scanned, and the algorithm recurses.
> 
> IMO, this seems like unecessary memory scanning.  Couldn't we use a more 
> direct approach such that no scanning is required, and we just jump to the 
> location in memory where each pointer resides?
> 
> My first idea is in regard to root pointers, or pointers on the stack.  The 
> compiler knows when each pointer goes on the stack, so it could insert code 
> to keep track of all the pointers that are currently on the stack.  The 
> obvious data structure to use for this would be a stack.  When a pointer is 
> pushed on to the stack, the address of that pointer could be pushed on a 
> separate "pointer stack".  When that pointer goes out of scope, it's address 
> would be popped off of the pointer stack.  Then when the garbage collector 
> needs the root pointers, it would simply traverse the pointer stack, which 
> would no doubt be significantly smaller than the primary stack.
> 
> Then, instead of scanning each allocation unit for pointers, we could use a 
> more intelligent approach.  This approach requires preprocessing for each 
> class to generate a little reflection information.  Each class would include 
> an array of fields that are pointers.  Each element in the array would 
> simply be an address that would reference a pointer field that resides in 
> the class.  If the class doesn't contain any pointers, then this array will 
> be empty.
> 
> Once each class has an array of pointer fields as described above, we need 
> only determine the type of class for each pertinent allocation unit, and use 
> the pointer array for that class, rather than scanning the entire allocation 
> unit, which could be completely devoid of pointers.
> 
> If my thinking is correct, this approach would make garbage collection 
> significantly faster.
> 
> Thoughts?
> 
> -Craig 
> 

If what you propose could be done with no overheads, then yes it would
be faster.  The question is: what are those overheads?  Specifically,
how much larger will code be in order to maintain this pointer stack,
and how efficiently can it manage this stack?  The current
implementation has a head-start in that it requires no management code
(or at least, none that I'm aware of).

Another potential problem is this:

# class Foo { ... }
# Foo bar = new Foo;
# void* quxx = cast(void*)bar;

What happens now?  We can no longer tell what type of memory is pointed
to by quxx, but we *still* need to scan it.  The only way I can think of
to solve this would be to mark each segment of memory with the kind of
allocation (class, struct or otherwise), and what the type is.  That, or
you can just revert to blind scanning, but then it's no better than the
default GC (and more complex code-wise).

I'm not saying it's a bad idea; I assume the current implementation
exists as it does because it's a hell of a lot simpler to code, and it's
much easier to integrate.  I think the best thing to do would be to
actually code it; take a reasonably sized program, and convert it to use
a "smart" collector, and see what the difference is.

I wonder if .NET and Java use "smart" collectors.  I know that .NET uses
write barriers, which would suggest that it does.

I suppose that if nothing else, the advantage of this system would be
that it wouldn't mistake other fields for pointers, thus potentially
keeping objects alive when they're not longer actually referenced.

Anyway, it's something to think about.

	-- Daniel

-- 
Unlike Knuth, I have neither proven or tried the above; it may not even
make sense.

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/



More information about the Digitalmars-d mailing list