Transitioning to a type aware Garbage Collector
Frits van Bommel
fvbommel at REMwOVExCAPSs.nl
Tue Jan 23 12:33:41 PST 2007
Sean Kelly wrote:
> Paul Findlay wrote:
>> Walter Bright wrote:
>>> To improve GC performance, we need to transition to a GC that is
>>> aware of the types of what it is allocating.
>> Does this mean the D garbage collector will be an exact (or precise)
>> garbage collector?
>
> If a block contains pointers then the entire block will still be scanned
> instead of just the pointers in that block, so It will still be a
> conservative collector but with fewer false positives.
I thought that by "a GC that is aware of the types of what it is
allocating" Walter meant a (mostly) precise GC.
In most cases, the locations of pointers can be deduced from the type,
as long as the compiler generates appropriate RTTI. The exceptions (that
I'm aware of) are unions with overlapping pointers and non-pointers, and
void[]s (deliberately).
Even types of variables in stack frames can be determined I think, as
long as you format them in the standard way (return eip & previous ebp
at the start). Then it's basically a linked list:
-----
struct StackFrame
{
// local variables are *before* this data (stack grows down)
StackFrame* ebp; // pointer to next StackFrame
size_t eip; // return address
}
-----
If you do a lookup on eip you should be able to determine what variables
(or at least their "pointerness") the stack frame contains, with some
help from compiler-generated metadata. Unknown stack frames (and data
between the defined ranges of two consecutive stackframes) should be
considered ambiguous.
Again, ambiguous union members and (static) void arrays are exceptions.
An alternative to eip-based lookup would be to push an additional value
at the beginning of each function call, but that would
(a) break with non-D functions
and
(b) cost performance outside of the GC (even if the GC is disabled)
> I think this
> could be changed so that just the pointers were scanned, but there would
> likely be a noticeable performance hit for doing so.
I'm not so sure about that "noticeable performance hit" you speak of. I
think that would depend on the format of the RTTI and the relative
amounts of pointers & non-pointers. (the latter is application-specific,
by the way...)
To be sure about this we'd need to run some benchmarks and see how it
fares in practice...
> If false positives
> are still a concern (and I'm not convinced they will be), a better
> approach may be to pursue a different GC design such as the CBGC
> mentioned yesterday.
False positives are a concern for some people; look at the thread "The
problem with the D GC" in digitalmars.D[1]. I'm pretty sure you've seen
it already. For one thing, it contains about 10 of your posts ;).
Note: I haven't read that article on CBGC yet, but I plan to.
[1]: digitalmars.D:46407,
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=46407
More information about the Digitalmars-d-announce
mailing list