More radical ideas about gc and reference counting
Rainer Schuetze via Digitalmars-d
digitalmars-d at puremagic.com
Sun May 11 11:18:38 PDT 2014
On 11.05.2014 10:22, Benjamin Thaut wrote:
> On 10.05.2014 19:54, Andrei Alexandrescu wrote:
>>
>>> The next sentence goes on to list the advantages of RC (issues we have
>>> wrestled with, like destructors), and then goes on to say the recent
>>> awesome RC is within 10% of "the fastest tracing collectors".
>>> Are you suggesting that D's GC is among 'the fastest tracing
>>> collectors'? Is such a GC possible in D?
>>
>> I believe it is.
>>
>
> While it might be possible to implement a good GC in D, it would require
> major changes in the language and its libraries. In my opinion it would
> be way more work to implement a proper GC than to implement ARC.
>
> Every state-of-the-art GC requires precise knowledge of _all_ pointers.
> And that's exactly what we currently don't have in D.

I think most garbage collectors can work with a number of false
pointers. The referenced memory area has to be treated as pinned and
cannot be moved. Limiting the false pointers to the stack and registers
seems like a reasonable compromise, though most of the stack could even
be annotated with precise pointer information. Code for this already
exists in the debug info generation, though I suspect stack tracing
could be unreliable.
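
To illustrate the pinning idea (Block and findBlock are invented
placeholders here, not druntime APIs), a conservative scan over
ambiguous stack words might look roughly like this:

struct Block { bool pinned; }

// Hypothetical: map an arbitrary word to the managed block containing it, or null.
Block* findBlock(void* candidate) { return null; }

// Any stack word that happens to look like a pointer pins the block it hits,
// so a moving collector can still relocate everything that is not pinned.
void scanStackConservatively(void** top, void** bottom)
{
    for (void** p = top; p < bottom; ++p)
        if (auto block = findBlock(*p))
            block.pinned = true;
}
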
Here's my current stance on the GC discussions:

I agree that the current GC is pretty lame, even if it were precise.
"Stop-the-world" collection with complete tracing does not work for any
interactive application that uses more than a few hundred MB of
garbage-collected memory (with or without soft real-time requirements).
Other applications with larger allocation requirements are easily
dominated by collection time. Proposing to use manual memory management
instead amounts to admitting failure, in my opinion.

For a reasonable GC I currently see two possible directions:

1. Use a scheme that takes a snapshot of the heap, stack and registers
at the moment of collection and does the actual collection in another
thread/process while the application continues to run. This is the way
Leandro Lucarella's concurrent GC works
(http://dconf.org/2013/talks/lucarella.html), but it relies on "fork",
which doesn't exist on every OS/architecture. A manual copy of the
memory won't scale to very large heaps, though it could be compressed
down to just the possible pointers. In the worst case it will need twice
as much memory as the current heap.

It would be very interesting to see how far we can push this model on
the platforms that support it.
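
As a minimal sketch of the fork-based direction on POSIX (not
Lucarella's actual implementation; markAndSweepSnapshot is a made-up
placeholder, and pausing the mutator threads around fork and reporting
results back to the parent are omitted):

import core.sys.posix.unistd : fork, _exit;
import core.sys.posix.sys.wait : waitpid, WNOHANG;

// Placeholder for tracing the copy-on-write snapshot in the child.
void markAndSweepSnapshot() {}

void collectInBackground()
{
    auto pid = fork();              // COW snapshot of heap, stack and registers
    if (pid == 0)
    {
        markAndSweepSnapshot();     // child traces the frozen snapshot
        _exit(0);                   // results would be sent to the parent first
    }
    // Parent: the application keeps running; the child is reaped later,
    // e.g. by polling with WNOHANG at the next collection trigger.
    int status;
    waitpid(pid, &status, WNOHANG);
}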

2. Change the compiler to emit (library-defined) write barriers for
modifications of (possible) pointers. This would allow experimenting
with more sophisticated GC algorithms (if the write barrier is written
in D, we might also need pointers without barriers to implement it). I
know Walter is against this, and I am also not sure whether it adds
acceptable overhead, but we don't have proof to the contrary either.
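
To make the idea concrete, the hook could take roughly this shape
(gcWriteBarrier and onPointerStore are invented names, not a proposed
druntime API); the compiler would rewrite every store through a possible
pointer into such a call:

// Policy hook, swappable to experiment with different collectors:
// remembered set, card marking, RC buffering, ...
void onPointerStore(void** slot, void* newValue) {}

void gcWriteBarrier(void** slot, void* newValue)
{
    onPointerStore(slot, newValue);  // bookkeeping defined by the GC library
    *slot = newValue;                // the actual store; must not be instrumented
                                     // again, hence the need for barrier-free pointers
}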

As we all know, the usual eager reference counting with atomic
operations is not memory-safe, so my current favorite is "concurrent
buffered reference counting" (see chapters 18.2 and 18.3 of "The Garbage
Collection Handbook" by Richard Jones et al.): reference count
modifications are not performed directly by the write barrier; it just
logs the operation into a thread-local buffer. This is then processed by
a collector thread, which also detects cycles (only on candidates that
had their reference count decreased during the last collection cycle).
Except for very long reference chains, this scales with the number of
executed pointer modifications, not with the heap size.
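
A stripped-down sketch of how that could look in D (all names are
invented, and the epoch/handshake machinery that makes it correct is
left out):

struct RCOp { void* incTarget; void* decTarget; }

RCOp[] rcLog;   // module-level variables are thread-local in D: no atomics needed

// Mutator side: the write barrier only appends to the thread-local log.
void rcWriteBarrier(void** slot, void* newValue)
{
    rcLog ~= RCOp(newValue, *slot);   // "increment new target, decrement old one"
    *slot = newValue;
}

// Collector side: applied after the per-thread logs have been handed over.
void processLog(RCOp[] log, ref int[void*] refCount, ref bool[void*] candidates)
{
    foreach (op; log)
    {
        if (op.incTarget)
            refCount[op.incTarget] = refCount.get(op.incTarget, 0) + 1;
        if (op.decTarget)
        {
            refCount[op.decTarget] = refCount.get(op.decTarget, 0) - 1;
            candidates[op.decTarget] = true;  // only these are checked for cycles
        }
    }
}

The open question from above remains, of course, whether the log traffic
of such a barrier is cheap enough in practice.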