draft proposal for ref counting in D

deadalnix deadalnix at gmail.com
Tue Oct 15 11:32:01 PDT 2013


On Tuesday, 15 October 2013 at 11:03:01 UTC, Michel Fortin wrote:
> On 2013-10-15 02:20:49 +0000, "deadalnix" <deadalnix at gmail.com> 
> said:
>
>> It will indeed cause trouble for code that mutates a large 
>> number of shared pointers. I'd say that such code is probably 
>> asking for trouble in the first place, but as always, no 
>> silver bullet. I still think this solution is the one that 
>> fits D best.
>
> I think there's a small mistake in your phrasing, but it makes 
> a difference.
>
> When the collector is running, it needs to know about any 
> mutation for pointers to its shared memory pool, including 
> pointers that are themselves thread-local but point to shared 
> memory. So COW will be trouble for code that mutates a large 
> number of **pages containing pointers to shared memory**. And 
> this includes **pointers to immutable data** because 
> immutable is implicitly shared. And this includes **pointers to 
> const data** since those pointers might point to immutable 
> (thus shared) memory.
>

No, that is the beauty of it :D

Consider that you have pointers TL -> shared -> immutable and TL 
-> immutable (TL being thread-local).

I'm not covering TL collection here (it seems obvious that it 
doesn't require stopping the world). So the starting point is 
that we have the roots in all TL heaps/stacks, and we want to 
collect shared/immutable without stopping the world.

A TL heap may gain new pointers into the shared heap, but they can 
only come from the shared heap itself or from new allocations. At 
this point, you treat every new allocation as live.

Reading a pointer from the shared heap and copying it to the TL 
heap isn't problematic in itself, but we have a problem if that 
pointer is then overwritten in the shared heap, as the GC may 
never scan it.
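
To make the problem concrete, here is a toy model of the race in Python (not D, just because it is short and runnable). Everything in it is hypothetical: `Obj`, `mark_incrementally` and `run_race` are names made up for the illustration, and the "collector" is only a marker that pauses once after marking the first root so that a mutator can run in between.

```python
class Obj:
    """A toy shared-heap object with a single pointer field."""
    def __init__(self, name):
        self.name = name
        self.next = None


def mark_incrementally(roots):
    """Mark reachable objects, pausing after each object is marked
    but before its field is scanned, so a mutator can interleave."""
    marked = set()
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if obj is None or obj in marked:
            continue
        marked.add(obj)
        yield marked              # mutator may run here
        pending.append(obj.next)  # field is read only after the pause
    yield marked


def run_race():
    a, b = Obj("a"), Obj("b")
    a.next = b                    # shared heap: a -> b
    marker = mark_incrementally([a])  # root snapshot taken from TL
    next(marker)                  # a is marked, a.next not scanned yet

    tl_copy = a.next              # mutator copies the pointer into TL
    a.next = None                 # ... then overwrites the shared slot

    marked = set()
    for marked in marker:         # let the marker run to completion
        pass
    return tl_copy, marked


tl_copy, marked = run_race()
print(tl_copy.name, tl_copy in marked)
```

In this model `b` is still referenced from the thread-local side, yet the marker never sees it, because the only shared slot that pointed to it was overwritten before being scanned.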

This is why you need to track pointer writes to the shared heap. 
The written value itself isn't important: it comes either from a 
new allocation, which is live, or from somewhere else in the 
shared heap (so it will be scanned, as we track writes).
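
A rough sketch of the idea, again as a Python model rather than real D runtime code. The names (`Obj`, `SharedCollector`, `run_with_tracking`) are invented for the example. It also assumes one particular way of "tracking writes": the barrier hands the overwritten pointer to the collector, which is roughly what a COW'd page gives you by preserving the old contents. Other schemes are possible.

```python
class Obj:
    """A toy shared-heap object with named pointer slots."""
    def __init__(self, name):
        self.name = name
        self.slots = {}


class SharedCollector:
    """Toy marker for the shared heap (hypothetical, for illustration)."""
    def __init__(self):
        self.marking = False
        self.marked = set()
        self.pending = []

    def shade(self, obj):
        # Mark obj and queue it so its slots get scanned later.
        if obj is not None and obj not in self.marked:
            self.marked.add(obj)
            self.pending.append(obj)

    def start(self, tl_roots):
        # Roots are the pointers found in the TL heaps/stacks.
        self.marking = True
        self.marked = set()
        self.pending = []
        for root in tl_roots:
            self.shade(root)

    def allocate(self, name):
        obj = Obj(name)
        if self.marking:
            self.marked.add(obj)  # new allocations are considered live
        return obj

    def write(self, holder, slot, value):
        # Tracked write: the new value is ignored by the collector,
        # only the pointer being overwritten is kept for scanning.
        if self.marking:
            self.shade(holder.slots.get(slot))
        holder.slots[slot] = value

    def finish(self):
        while self.pending:
            obj = self.pending.pop()
            for child in obj.slots.values():
                self.shade(child)
        self.marking = False
        return self.marked


def run_with_tracking():
    gc = SharedCollector()
    a = gc.allocate("a")
    b = gc.allocate("b")
    gc.write(a, "next", b)        # shared heap: a -> b

    gc.start([a])                 # collection begins, a not scanned yet
    tl_copy = a.slots["next"]     # mutator copies b into TL
    fresh = gc.allocate("fresh")  # allocated during collection
    gc.write(a, "next", fresh)    # overwrites the only shared pointer to b

    marked = gc.finish()
    return tl_copy, {obj.name for obj in marked}


tl_copy, live = run_with_tracking()
print(tl_copy.name, sorted(live))
```

Here `b` survives even though its only shared reference was overwritten before the marker reached it, and `fresh` survives simply because it was allocated during the collection. Writes to TL memory go through no barrier at all in this model.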

> So any memory page that may contain pointers to shared 
> memory would need to use COW during collection. Which means all 
> the threads' stacks, and also all objects with a pointer to 
> shared, immutable, and const data. At this point I think it is 
> fair to approximate this to almost all memory that could 
> contain pointers.

No, only the shared one, that is the beauty of the technique. Note 
that I'm not making this up myself: it is how the GC worked in 
the Caml family for a while, and it has proven really efficient 
(in the Caml family, most data is either immutable or 
thread-local, and the shared heap is typically small).