(Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

Frits van Bommel fvbommel at REMwOVExCAPSs.nl
Mon Apr 13 11:54:57 PDT 2009


Sean Kelly wrote:
> Leandro Lucarella wrote:
>> But right now gc_malloc() doesn't take any TypeInfo argument. I can't see
>> where I can get the TypeInfo in the first place =/
> 
> The call would have to be modified.  Right now the best you can do is 
> pass BlkAttr.NO_SCAN.  And storing a pointer per block could add a good 
> bit of bookkeeping overhead for small objects, of course.  Perhaps the 
> TypeInfo array could be converted to a bitmap or some such.

Let's see, you'd need 2 bits per pointer-sized block of bytes, to encode these 
possibilities:
a) Yeah, this is a pointer
b) Nope, not a pointer
c) Maybe a pointer (union, void[])
c2) (optional) A (somehow) explicitly pinned pointer (treated identical to (c) 
for GC purposes; needs to be followed during marking, but data pointed to can't 
be moved)
d) (optional, since we have a value left) This is a weak pointer

I'd split these up as such: One bit to indicate that it can be read as a pointer 
(and should thus be followed when marking, for instance) and one to indicate it 
can be written as a pointer (so it can be moved for (a) or nulled for (d)). That 
gives us these values for the two-bit field:
enum PtrBits {
     // Actual values
     JustData      = 0b00,
     MaybePointer  = 0b01,
     PinnedPointer = 0b01,
     WeakPointer   = 0b10,
     Pointer       = 0b11,

     // For '&' tests
     ReadableFlag  = 0b01,
     WritableFlag  = 0b10,
}

Like I said, this would cost 2 bits per pointer-sized chunk, so 1/16 of size for 
32-bit systems and 1/32th of the memory block size for 64-bit systems. It'd have 
to be rounded up to a whole number of bytes of course, and possibly T.alignof if 
stored at the start of the block. (Storing it at the end of the block would 
avoid that)

This could be bounded to one pointer worth of memory per block if the GC treats 
blocks > 16*4 = 64 bytes (on 32-bit systems) or > 32*8 = 256 bytes (on 64-bit 
systems) specially by just storing the raw TypeInfo reference instead of the 
bitfield for the memory block.
(Implementer's choice on what to do for (size_t.sizeof-1)*4*size_t.sizeof to 
size_t.sizeof^2 * 4 bytes, where the bit-encoded data takes up the same number 
of bytes as a pointer would)



More information about the Digitalmars-d mailing list