More radical ideas about gc and reference counting

Tue May 13 07:13:31 PDT 2014

On Tue, 13 May 2014 02:12:44 -0400, Rainer Schuetze <r.sagitario at gmx.de>  
wrote:

>
>
> On 13.05.2014 00:15, Martin Nowak wrote:
>> On 05/11/2014 08:18 PM, Rainer Schuetze wrote:
>>>
>>> 1. Use a scheme that takes a snapshot of the heap, stack and registers
>>> at the moment of collection and do the actual collection in another
>>> thread/process while the application can continue to run. This is the
>>> way Leandro Lucarella's concurrent GC works
>>> (http://dconf.org/2013/talks/lucarella.html), but it relies on "fork"
>>> that doesn't exist on every OS/architecture. A manual copy of the memory
>>> won't scale to very large memory, though it might be compressed to
>>> possible pointers. Worst case it will need twice as much memory as the
>>> current heap.
>>
>> There is a problem with this scheme: copy-on-write is extremely
>> expensive when a mutation happens. That's one page fault (context
>> switch) + copying a whole page + mapping the new page.
>
> I agree that this might be critical, but it is a one-time cost per page.
> It seems unrealistic to do this with user mode exceptions, but the OS  
> should have this optimized pretty well.
>
>  > It's much worse
>  > with huge pages (2MB page size).
>
> How common are huge pages nowadays?

I know this is coming from a position of extreme ignorance, but why do we
have to do copy-on-write? What about pause-on-write? In other words, if a
thread tries to write to a page that the collector is using, the thread
is paused until the GC is done with that page.

This doesn't fix everything, but it's at least as good as today's GC,  
which preemptively pauses threads.

The ignorance part: I have no idea whether this is even possible, nor
how it would affect performance.
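
To make the pause-on-write idea a bit more concrete, here is a toy
simulation in Python. It is only a sketch under loud assumptions: a real
collector would need the OS to trap the write (page protection plus a
fault handler, presumably), and nothing here shows that this is feasible
or fast. The Page class, begin_scan/end_scan, and demo are made-up names
for illustration, not anything from druntime.

```python
import threading
import time


class Page:
    """Toy stand-in for a memory page (hypothetical, illustration only)."""

    def __init__(self, size=4):
        self.slots = [0] * size
        self._cond = threading.Condition()
        self._scanning = False  # True while the "collector" uses this page

    def begin_scan(self):
        # Collector claims the page; writers will pause from now on.
        with self._cond:
            self._scanning = True

    def end_scan(self):
        # Collector is done with the page; wake any paused writers.
        with self._cond:
            self._scanning = False
            self._cond.notify_all()

    def write(self, index, value):
        # Mutator write: pauses only while the collector uses this page.
        with self._cond:
            while self._scanning:
                self._cond.wait()
            self.slots[index] = value


def demo():
    page = Page()
    events = []
    page.begin_scan()  # collector starts on this page before the write

    def mutator():
        page.write(0, 42)  # blocks until end_scan() is called
        events.append("write done")

    t = threading.Thread(target=mutator)
    t.start()
    time.sleep(0.05)  # pretend the collector is scanning the page
    events.append("scan done")
    page.end_scan()
    t.join()
    return page.slots[0], events
```

In this toy version a thread that only touches pages the collector is
not using never pauses at all, which is the whole point. A thread that
writes to a page mid-scan waits for that one page rather than for the
entire collection.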

-Steve