GC interpreting integer values as pointers

Ivo Kasiuk i.kasiuk at gmx.de
Thu Oct 14 10:35:13 PDT 2010


> On Sat, 09 Oct 2010 15:51:37 -0400, Ivo Kasiuk <i.kasiuk at gmx.de> wrote:
> 
> > Hi!
> >
> > In my D programs I am having problems with objects not getting finalised
> > although there is no reference anymore. It turned out that this is
> > caused by integers which happen to have values corresponding to pointers
> > into the heap. So I wrote a test program to check the GC behaviour
> > concerning integer values:
> >
> 
> [snip]
> 
> > So in most but not all situations the integer value keeps the object
> > from getting finalised. This observation corresponds to the effects I
> > saw in my programs.
> >
> > I find this rather unfortunate. Is this known, documented behaviour? In
> > a typical program there are such integer values all over the place. How
> > should such values be stored to avoid unwanted interaction with the GC?
> 
> Yes, D's garbage collector is a conservative garbage collector.  One which  
> doesn't have this problem is called a precise garbage collector.
> 
> There are two problems here.  First, D has unions, so it is impossible for  
> the GC to determine if a union contains an integer or a pointer.
> 
> Second problem is the granularity of scanning.  A memory block is scanned  
> as if every n bits (n being your architecture) is a pointer, or there are  
> no pointers.  This is determined by a bit associated with the block (the  
> NO_SCAN bit).
> 
> If you allocate a memory block that contains at least one pointer, then  
> all the words in the memory block are considered to be pointers by the  
> GC.  There is a (continually updated) patch which allows the GC to be  
> semi-precise.  That is, the type information of the memory block will be  
> linked to it.  This will allow precise scanning except for unions.  Once  
> this is integrated, the false pointer problem will be much less prevalent.
> 
> -Steve

Thanks! This absolutely makes sense. It is basically a trade-off between
precision and efficiency of the GC.
Slowly, I am learning all the little details of D's garbage collection.
It is more complicated than it seems at first, but understanding it
better helps a great deal in writing programs that manage memory well.

There is one case though that I am still not sure about: associative
arrays. It seems that keys as well as values in AAs are scanned for
pointers even if both are integer types. How can I tell the GC that I do
not want them to be scanned? I know about the NO_SCAN flag but what
memory region should it be applied to in this case?
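
For reference, this is roughly how I understand NO_SCAN to work for
memory I allocate myself. It is only a sketch and does not answer the AA
question, because the AA allocates its internal storage on its own. The
helper isNoScan is a made-up name for illustration, not part of druntime;
GC.malloc, GC.addrOf, GC.getAttr and GC.BlkAttr.NO_SCAN are from
core.memory.

```d
import core.memory : GC;

// Hypothetical helper (not part of druntime): true if the GC block
// containing p carries the NO_SCAN attribute.
bool isNoScan(const(void)* p)
{
    // Look up the start of the containing block; null if p is not GC memory.
    auto base = GC.addrOf(cast(void*) p);
    return base !is null
        && (GC.getAttr(base) & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    // A raw block explicitly flagged as containing no pointers:
    // integers stored here are never treated as references.
    auto raw = cast(size_t*) GC.malloc(16 * size_t.sizeof, GC.BlkAttr.NO_SCAN);
    raw[0] = 0xDEADBEEF;
    assert(isNoScan(raw));

    // An array of a pointer-free element type gets NO_SCAN automatically.
    auto ints = new int[](16);
    assert(isNoScan(ints.ptr));

    // An array of pointers must be scanned, so the flag is not set.
    auto ptrs = new void*[](16);
    assert(!isNoScan(ptrs.ptr));
}
```

So for plain arrays of integers nothing needs to be done, and for
hand-allocated blocks the flag can be passed to GC.malloc. What remains
unclear to me is which block to flag in the AA case.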

BTW: considering the "conservative" scanning, the implementation of
Object.toHash() is somewhat interesting:

hash_t toHash()
{
  // BUG: this prevents a compacting GC from working, needs to be fixed
  return cast(hash_t)cast(void*)this;
}

So an object's hash value will keep the GC from freeing the object if
that value is stored anywhere the GC scans. But as the comment indicates,
this implementation needs to be changed anyway (I am eager to see the
result). A compacting GC will probably bring a whole new set of problems.
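
A small sketch of what I mean. The class Widget and the function
addressAsInteger are made up for illustration; the function repeats the
cast from the toHash() quoted above as a free function, so the sketch
does not depend on what Object.toHash() does in any particular druntime
version.

```d
import core.memory : GC;

class Widget {}

// Hypothetical helper: the same cast as in the quoted toHash(),
// turning the object reference into a plain integer.
size_t addressAsInteger(Object o)
{
    return cast(size_t) cast(void*) o;
}

void main()
{
    auto w = new Widget;
    size_t h = addressAsInteger(w);

    // The integer has exactly the bit pattern of a pointer to the object...
    assert(cast(void*) h is cast(void*) w);

    // ...and the GC resolves that bit pattern to a block on its heap,
    // so a conservative scan that sees h would keep w alive.
    assert(GC.addrOf(cast(void*) h) !is null);
}
```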

Ivo




More information about the Digitalmars-d-learn mailing list