Short list with things to finish for D2

Thu Nov 19 09:30:52 PST 2009

Steven Schveighoffer wrote:
> On Thu, 19 Nov 2009 12:01:25 -0500, dsimcha <dsimcha at yahoo.com> wrote:
> 
>> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s 
>> article
>>> dsimcha wrote:
>>> > == Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s 
>>> article
>>> >> Yes, it will be because the book has a few failing unittests. In 
>>> fact, I
>>> >> was hoping I could talk you or David into doing it :o).
>>> >> Andrei
>>> >
>>> > Unfortunately, I've come to hate the MRU idea because it would fail 
>>> miserably for
>>> > large arrays.  I've explained this before, but not particularly 
>>> thoroughly, so
>>> > I'll try to explain it more thoroughly here.  Let's say you have an 
>>> array that
>>> > takes up more than half of the total memory you are using.  You try 
>>> to append to
>>> > it and:
>>> >
>>> > 1.  The GC runs.  The MRU cache is therefore cleared.
>>> >
>>> > 2.  Your append succeeds, but the array is reallocated.
>>> >
>>> > 3.  You try to append again.  Now, because you have a huge piece of 
>>> garbage that
>>> > you just created by reallocating on the last append, the GC needs 
>>> to run again.
>>> > The MRU cache is cleared again.
>>> >
>>> > 4.  Goto 2.
>>> This is not a matter of principles, but one of implementation. When you
>>> GC, you can adjust the cache instead of clearing it.
>>
>> Technically true, but what is a matter of principles is whether the 
>> implementation
>> of arrays should be very tightly coupled to the implementation of the 
>> GC.  Fixing
>> this issue would have massive ripple effects throughout the already 
>> spaghetti
>> code-like GC, and might affect GC performance.  For every single 
>> object the GC
>> freed, it would have to look through the MRU cache and remove it from 
>> there if
>> present, too.
> 
> You perform the lookup via MRU cache (after mark, before sweep).  I see 
> it as a single function call at the right place in the GC.
> 
>> The point is that this **can** be done, but we probably don't **want** to
>> introduce this kind of coupling, especially if we want our GC model to 
>> be sane
>> enough that people might actually come along and write us a better GC 
>> one day.
> 
> What about implementing it as a hook "do this between mark and sweep"?  
> Then it becomes decoupled from the GC.
> 
> -Steve

I think these are great ideas, but you'd need to transport certain 
information to the cache so it can adjust its pointers. Anyhow, I 
believe this is worth exploring because it can help with a great many 
other things such as weak pointers and similar checks and adjustments 
(there was a paper on GC assertions that I don't have time to dig right 
now. Aw what the heck, found it:

http://www.eecs.tufts.edu/~eaftan/gcassertions-mspc-2008.pdf

Andrei