(Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

Robert Jacques sandford at jhu.edu
Mon Apr 13 18:18:21 PDT 2009


On Mon, 13 Apr 2009 14:54:57 -0400, Frits van Bommel  
<fvbommel at remwovexcapss.nl> wrote:

> Sean Kelly wrote:
>> Leandro Lucarella wrote:
>>> But right now gc_malloc() doesn't take any TypeInfo argument. I can't  
>>> see
>>> where I can get the TypeInfo in the first place =/
>>  The call would have to be modified.  Right now the best you can do is  
>> pass BlkAttr.NO_SCAN.  And storing a pointer per block could add a good  
>> bit of bookkeeping overhead for small objects, of course.  Perhaps the  
>> TypeInfo array could be converted to a bitmap or some such.
>
> Let's see, you'd need 2 bits per pointer-sized block of bytes, to encode  
> these possibilities:
> a) Yeah, this is a pointer
> b) Nope, not a pointer
> c) Maybe a pointer (union, void[])
> c2) (optional) A (somehow) explicitly pinned pointer (treated identical  
> to (c) for GC purposes; needs to be followed during marking, but data  
> pointed to can't be moved)
> d) (optional, since we have a value left) This is a weak pointer
>
> I'd split these up as such: One bit to indicate that it can be read as a  
> pointer (and should thus be followed when marking, for instance) and one  
> to indicate it can be written as a pointer (so it can be moved for (a)  
> or nulled for (d)). That gives us these values for the two-bit field:
> enum PtrBits {
>      // Actual values
>      JustData      = 0b00,
>      MaybePointer  = 0b01,
>      PinnedPointer = 0b01,
>      WeakPointer   = 0b10,
>      Pointer       = 0b11,
>
>      // For '&' tests
>      ReadableFlag  = 0b01,
>      WritableFlag  = 0b10,
> }
>
> Like I said, this would cost 2 bits per pointer-sized chunk, so 1/16 of  
> size for 32-bit systems and 1/32th of the memory block size for 64-bit  
> systems. It'd have to be rounded up to a whole number of bytes of  
> course, and possibly T.alignof if stored at the start of the block.  
> (Storing it at the end of the block would avoid that)
>
> This could be bounded to one pointer worth of memory per block if the GC  
> treats blocks > 16*4 = 64 bytes (on 32-bit systems) or > 32*8 = 256  
> bytes (on 64-bit systems) specially by just storing the raw TypeInfo  
> reference instead of the bitfield for the memory block.
> (Implementer's choice on what to do for  
> (size_t.sizeof-1)*4*size_t.sizeof to size_t.sizeof^2 * 4 bytes, where  
> the bit-encoded data takes up the same number of bytes as a pointer  
> would)

An alternative to this is to encode the information in ClassInfo and use  
it instead. (You'd have to create a fake ClassInfo for structs and  
arrays.) Then the GC only has to track the start of each object (i.e. the  
beginning of a block in the current GC). The advantage is that this has 0  
storage requirements for objects and on average < 4 bytes for structs and  
arrays (thanks to the coarse block sizes of the current GC).



More information about the Digitalmars-d mailing list