[D-runtime] Precise garbage collection
Rainer Schuetze
r.sagitario at gmx.de
Fri Jun 21 11:36:23 PDT 2013
Hi,
I want to make precise garbage collection as presented at the D
conference ready for inclusion into druntime. I have recently updated
the branch at https://github.com/rainers/druntime/tree/gcx_precise to
merge with Leandros changes to the GC module layout.
Before creating a pull request, I'd like to hear opinions on whether
this should be included, if other choices would be better and where it
should be improved.
Precise garbage collection must be able to work with different memory
areas, namely the heap, global/thread data in the binary and
stack/registers per thread. I think we do not have a feasible solution
for the latter pair, so let's focus on the former two.
1. Heap
In a nutshell, the implementation uses the RTInfo template to generate a
bitmap of booleans that indicate whether the corresponding field in a
struct/class might be a pointer or not. For built-in types, this
information is predefined in object.d and the typeinfo.ti_* modules.
When memory is allocated, the TypeInfo for the allocated type is passed
to the garbage collector, which copies the bitmap into memory alongside
the pages allocated by the pool (another GCBits member with one bit per
word). During a collection, scanning the heap can then look up this
information to detect false pointers and avoid keeping garbage alive.
There are a number of issues that should be discussed:
a. the compiler sometimes does not generate the RTInfo for a struct, but
instead generates 0/1 into the respective m_RTInfo field, depending on
whether this struct contains references or not. (As far as I can tell
this happens if it is only the backend that needs the TypeInfo, e.g.
when generating an array concatenation call.) That's why there are
rtinfoNoPointers/rtinfoHasPointers enums also used by the TypeInfos for
the builtin types.
I consider this behaviour a bug that should be fixed, as it also
disallows other usages of the RTInfo.
There is also an issue with not creating RTInfo for associative array
types, but there is an "easy" workaround.
b. there are already other application of the RTInfo template, so there
should probably be some way to combine multiple "generators". My idea is
to let RTInfo!T point to some immutable(RTInfoData) struct that can then
have multiple members for different generators. It'd be nice to change
the return type of TypeInfo.rtInfo() from void* to
immutable(RTInfoData)* to avoid casting.
c. The GC interface has to be extended to pass type information to
the GC. I guess just passing the respective TypeInfo pointer is obvious
and correct.
d. The alternative to using a pointer bitmap alongside the pool memory
is to store the TypeInfo pointer instead. It's major advantage is that
it adds minimal performance overhead to the allocation.
d1. This needs more memory for small allocations, but less for larger.
If it is stored in the same memory as the allocation itself, it should
be at the end to avoid alignment issues (at the beginning it always adds
16 bytes). This would mostly reuse unused memory due to the alignment of
allocations to a power of 2. The worst effect would be for allocations
of just below or equal to a power of 2. We could mitigate that effect by
allowing other sizes of allocations aswell, or by not reserving space
for the TypeInfo pointer if the block is allocated NOSCAN.
d2. Both dynamic and associative arrays currently allocate memory chunks
and use them in a "non-standard" way that cannot be described by a
simple TypeInfo. For example, dynamic arrays keep track of the allocated
size of the array by placing it at the very end of allocations < 4k, but
at the beginning for allocations >= 4k, moving the data to an offset of
16 in the latter case. Associative arrays combine hash-list-node, key
and value into a single allocation, sometimes even with the value type
unavailable.
My implementation solves these issues by "emplacing" the appropriate
type information at the given address, assuming pointers if no type
information is available (using the new gc_emplace function).
If only a TypeInfo pointer is available, I'm not sure how this can be
solved without _allocating_ a new TypeInfo. My best guess is that
it could be done if a generic array scan function could be called.
d3. These leads to the idea to generate a scanning function for each
type instead of the pointer bitmap. Depending on the sparseness of
pointers this can be shorter or can create a lot of code bloat. As a
compromise a generic version might use the pointer bitmap, but it can be
overloaded to implement arrays or even unions. On the downside, it makes
using std.emplace much more complicated if you want precise scanning.
e. Currently, there is only one TypeInfo_Class for both a class instance
and a class reference. There are currently assumptions made which one is
meant depending on context (if used as a "root" it is usually an
instance, when following TypeInfo.next it is assumed a reference). This
does no longer work reliably when combined with "emplace". I think we
need to add TypeInfo_Reference to describe the field that is a pointer
to a class instance. To be honest I have no idea how much code might
break by adding this indirection when traversing type information.
f. Currently, the precise GC can be versioned in/out, but I think making
it a runtime option is preferable. A problem here is that the user has
no way to change the default configuration before the GC is initialized.
I could imagine always starting with the precise GC, but the user can
opt out anytime. The GC could then release any additional memory it has
allocated for precise scanning. If the user opts back in, the data
structures are rebuilt assuming everything allocated so far as void[].
Another option might be to make it a link time option, but that would
mean the "standard" gc could not be part of the runtime library to be
exchangeable.
This text is already too long, so I postpone discussing the global/tls
data section until later.
Summing up the questions raised:
1. Assuming RTInfo generation is fixed, should we go for adding an
indiretion through RTInfoData to allow multiple "generators"?
2. Are you ok with extending the gc_* function with TypeInfo parameters
where appropriate?
3. Do you think the pointer bitmap aside the pool memory and adding
gc_emplace is ok? Or should we investigate other alternatives first?
4. Would you object adding TypeInfo_Reference?
5. How do you prefer enabling/disabling precise garbage collection?
Versioning/linking/runtime?
Best,
Rainer
More information about the D-runtime
mailing list