[D-runtime] Metadata for memory

Fri Jul 30 06:53:07 PDT 2010

People are working diligently on incorporating precise scanning into the GC.

I have added safe array appending.

All of these require metadata associated with the memory block to assist the 
runtime in correctly performing it's tasks.  So far, the solution has been to 
eat a small piece of the data block allocated.  For precise scanning, it's a 
bitmask or pointer-to-bitmask to indicate which fields are pointers.  For safe 
array appending, it's a length field indicating the "used" length of the block.

One of the problems the developers working on precise scanning have pointed out 
is that we are both eating up the memory block, and there is the potential that 
we are both trying to use the same part of the memory block, unaware of the 
other's usage.  Also, in large blocks I store the array length at the beginning 
of the block, which can screw up the precise scanning, since the length can 
offset the alignment of the bitmask.

I think it's probably time to think about how we can allocate metadata in an 
orderly fashion.

The requirements are that:

1. the overhead is as low as possible.  Array appending uses 1 byte for all 
blocks <= 256 bytes.  A bitmask for a small block (16 bytes) only needs 3-4 
bits, but may consume 4 bytes if a pointer to a bitmask is required. 

2. alignment of data is preserved.  This is done mostly via putting the data at 
the end.  But storing outside the block is also a possibility.
3. Individual Metadata does not conflict with others.

I think the GC is going to have to be aware of the array appending, since it 
must be aware of the length field (if stored int he block), and it could 
potentially save many cycles only scanning 'used' data.

On storing the array append length at the beginning of large blocks:  A large 
block can be extended in place, meaning that the block size (not just the used 
size) can increase.  This means the end of the block can move.  With my current 
array append code, for shared array appends, you would have to lock the entire 
append function to ensure this didn't happen in the middle of the append.  I 
query for the block size once, and if it changed after I do the query, the 
length would be screwed up.  This might be a necessary evil but who is appending 
large amounts of data to shared blocks anyways?  So biting the shared bullet and 
locking the entire shared append function might take care of the issues of 
storing at the beginning (scanning skipping the length, bitmask offset by n 
bytes).  The GC is going to have to be aware of moving the bitmask anyways.

I think the most funky situation is really the small blocks (16 or 32 bytes) 
where you want overhead to be small, but you need to store enough info.  With a 
16-byte block, I think we can squeeze both the bitmask and length into one 
byte.  Essentially 4 bits for length, and 3 bits for a bitmask (the 4th word is 
incomplete, so only 3 bits are needed).  For a 32 byte block, we would need 2 
bytes, one for the 7-bits bitmask, and one for the length.  At 64 bytes, we can 
have 16 bits for bitmask, and 1 byte for length.  At 128 bytes and above, a 
bitmask would consist of 32 bits anyways, so might as well start using a pointer 
(may be different for 64-bit CPUs), and still only a byte is needed for length.

The order of storage should always be arraylength, bitmask to keep alignment 
proper, and allow the most usage of space.  And depending on the bits set on the 
block (APPENDABLE, NO_SCAN), one or both of these may be omitted.

We also should get bugzilla 4487 addressed, allocating an "array" whenever a 
single struct is desired is wasteful.

Anyone have any other ideas?  Any other points to make?

-Steve