[D-runtime] Precise garbage collection

Mon Aug 12 11:04:14 PDT 2013

On Aug 9, 2013, at 2:55 AM, Leandro Lucarella <leandro.lucarella at sociomantic.com> wrote:

> On Fri, Aug 09, 2013 at 10:47:33AM +0200, Rainer Schuetze wrote:
>> 
>> 
>> Thanks for the links, I think I already saw them long ago but reread
>> them now. I'm not sure we can blame the precise scanning for
>> increasing the size of an allocation dramatically by adding one
>> pointer, causing the allocations to use the next bin of twice the
>> size. This only happens for allocations that happen to match the bin
>> sizes (or are designed with these GC internals in mind), for others
>> the overhead for adding the pointer is zero.
> 
> Yes, I know, but the values in the first post shows, at least for those
> cases, it makes a difference. And I think for small sizes is very likely
> that you have objects with the size as a bin (specially for 16 and 32).
> When objects are larger, the chances of matching an exact bin size are
> smaller.
> Maybe an hybrid model would be the best approach.

There's a related issue where array length is stored within the allocated space for a chunk.  And since programmers tend to allocate buffers with sizes that are powers of two, we're probably doing them a disservice here too.  But I'm not sure how best to handle this situation.

Regarding precise scanning, it seems like we really want TypeInfo present for a number of different reasons.  The bit array, a finalizer, and so on.  What if we had a TypeInfo registry inside the GC code?  Then we could limit the size to store a TypeInfo reference for a block to some number of bytes for an index value.  I can't imagine needing more than 4 bytes for this, and maybe 2 would be sufficient?

Given what I said above, I think we have 4 cases to consider anyway:

1. The block is untyped.
2. The block is a class or struct.
3. The block is an array.
4. The block is an array of structs.

Case 1 doesn't need any metadata, cases 2 needs to store precise scanning information and possibly a finalizer pointer, case 3 needs to store length information, and case 4 needs to store data from both case 2 and 3.  Assuming that array length storage is better integrated into the GC (and I think an argument could be made for expecting the GC to remember the allocation length requested by the caller), how could we store these different combinations of data most efficiently?

> And then, having to read the memory block could have bad cache effects,
> specially for object that are NO_SCAN, which otherwise would be never
> read from memory. Cache-wise I think bitmaps should behave better.

As much as the idea of a virtual function inside TypeInfo for scanning a block sounds appealing from a code perspective, I imagine that the proper approach is to have a bitmap and a few flags in TypeInfo and to let the GC implement the scanning code itself.  Reducing things further so that we could get to the bitmap with a single pointer dereference sounds like it could be tricky though, if we're trying to save space per block.  And further still, we could try and do things in a way that avoided cache misses, though at that point I imagine we'd be tweaking runtime or compiler code.