Type Aware GC Questions
Kevin Bealer
kevinbealer at gmail.com
Mon Jan 29 23:12:48 PST 2007
Sean Kelly wrote:
> Frank's suggestion seems reasonable. There's no way to pass a 'mask' to
> the GC for a specific block at the moment.
>
I think some GC designs have exactly that. I've thought about this and
it seems like two cases should cover most of it:
1. Short mask that repeats (to cover larger blocks).
2. Number that describes how much of the class contains pointers.
Item #1 would be a mask of each pointer-sized slice of memory. The
repeat part means that (for example) an array of struct can be
represented by one mask, overlaying the size of the struct.
All arrays of non-pointer primitives could be represented as '0' sz=1.
Arrays of arrays could be represented as '01' or '10', sz=2, depending
on whether the pointer comes first.
Small struct items (maybe under 16 words) could share the masks to
reduce duplication.
Item #2 allows allocations like this, which are common in C:
#define NUM 100
struct Foo {
int a;
int * b;
char * d;
char buf[NUM];
};
The "size" lets the 'buf' section not be represented in the mask,
however because there may be arrays of Foo, it would be good to combine
the mask-size and repeat-size elements to use short masks for long
objects and still repeat at long intervals.
A corollary of #2 is that the definitions of most classes could be
'sorted' to move pointy stuff toward the top. This would be 'soft' ie
some things cant be rearranged (internal struct members, arrays) even in
a class. It would have two nice effects:
1. Sorted classes will tend to use far fewer mask combinations.
2. GC runs can find all class pointers in fewer L1/L2 cache lines,
because the pointers are packed.
Both of these are a strict 'extension' of the current 'might have' bit,
since "might_have" represents '1', sz=1, and 'might_not' represents '0',
sz=1.
Automatic (and free) memory consistency checking (maybe):
Another neat idea --- you can introduce a 'must-have-pointer' marker for
the field mask, which means this field *must* be a valid pointer or null
(in debug mode only, or maybe only if enabled by the user?).
This would provide across the board pointer checking (a la valgrind) for
essentially zero costs (but only at GC time), because a conservative GC
must already be able to test if a pointer is within the allocated
heapspace in order to do its mark/sweep, right?
Either the pointer points to a valid object, or you get an assert. But
this would need to special case around deleted objects in case the
pointer aliasing is known-but-harmless.
Kevin
More information about the Digitalmars-d
mailing list