Short list with things to finish for D2

Thu Nov 19 11:19:01 PST 2009

dsimcha wrote:
> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s article
>> Steven Schveighoffer wrote:
>>> On Thu, 19 Nov 2009 12:01:25 -0500, dsimcha <dsimcha at yahoo.com> wrote:
>>>
>>>> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s
>>>> article
>>>>> dsimcha wrote:
>>>>>> == Quote from Andrei Alexandrescu (SeeWebsiteForEmail at erdani.org)'s
>>>>> article
>>>>>>> Yes, it will be because the book has a few failing unittests. In
>>>>> fact, I
>>>>>>> was hoping I could talk you or David into doing it :o).
>>>>>>> Andrei
>>>>>> Unfortunately, I've come to hate the MRU idea because it would fail
>>>>> miserably for
>>>>>> large arrays.  I've explained this before, but not particularly
>>>>> thoroughly, so
>>>>>> I'll try to explain it more thoroughly here.  Let's say you have an
>>>>> array that
>>>>>> takes up more than half of the total memory you are using.  You try
>>>>> to append to
>>>>>> it and:
>>>>>>
>>>>>> 1.  The GC runs.  The MRU cache is therefore cleared.
>>>>>>
>>>>>> 2.  Your append succeeds, but the array is reallocated.
>>>>>>
>>>>>> 3.  You try to append again.  Now, because you have a huge piece of
>>>>> garbage that
>>>>>> you just created by reallocating on the last append, the GC needs
>>>>> to run again.
>>>>>> The MRU cache is cleared again.
>>>>>>
>>>>>> 4.  Goto 2.
>>>>> This is not a matter of principles, but one of implementation. When you
>>>>> GC, you can adjust the cache instead of clearing it.
>>>> Technically true, but what is a matter of principles is whether the
>>>> implementation
>>>> of arrays should be very tightly coupled to the implementation of the
>>>> GC.  Fixing
>>>> this issue would have massive ripple effects throughout the already
>>>> spaghetti
>>>> code-like GC, and might affect GC performance.  For every single
>>>> object the GC
>>>> freed, it would have to look through the MRU cache and remove it from
>>>> there if
>>>> present, too.
>>> You perform the lookup via MRU cache (after mark, before sweep).  I see
>>> it as a single function call at the right place in the GC.
>>>
>>>> The point is that this **can** be done, but we probably don't **want** to
>>>> introduce this kind of coupling, especially if we want our GC model to
>>>> be sane
>>>> enough that people might actually come along and write us a better GC
>>>> one day.
>>> What about implementing it as a hook "do this between mark and sweep"?
>>> Then it becomes decoupled from the GC.
>>>
>>> -Steve
>> I think these are great ideas, but you'd need to transport certain
>> information to the cache so it can adjust its pointers. Anyhow, I
>> believe this is worth exploring because it can help with a great many
>> other things such as weak pointers and similar checks and adjustments
>> (there was a paper on GC assertions that I don't have time to dig right
>> now. Aw what the heck, found it:
>> http://www.eecs.tufts.edu/~eaftan/gcassertions-mspc-2008.pdf
>> Andrei
> 
> The hook doesn't sound like a bad idea, but it raises a lot of issues with the
> implementation details.  These are things I could figure out given plenty of time.
>  I'd like weak refs, too.  However, I don't think this makes the short list for D2
> because:
> 
> 1.  Doing it at all properly requires a lot of thought about what a good design
> for such an API should be and how to implement it efficiently.
> 
> 2.  I think we still need an ArrayBuilder or something because, while the MRU
> would be reasonably efficient, it still wouldn't be as efficient as an
> ArrayBuilder, and would do nothing to solve the uniqueness problem.  Therefore, I
> think fleshing out ArrayBuilder is a higher priority.  I was thinking of a design
> something like this:
> 
> abstract class Array {
>     // A bunch of final methods for .length, opIndex, etc.
>     // No .ptr or opSlice.
> }
> 
> class UniqueArray : Array {
>    // Still no .ptr or opSlice.  Has .toImmutable, which allows
>    // for conversion to immutability iff the elements are either
>    // pure value types or themselves immutable.
>    //
>    // Also, can deterministically delete old arrays on reallocation,
>    // since it owns a unique reference, leading to more GC-efficient
>    // appending.
> }
> 
> class ArrayBuilder : Array {
>    // Add opSlice and .ptr.  Appending doesn't deterministically
>    // delete old arrays, even if the GC supports this.  No guarantees
>    // about uniqueness.
> }

What does .toImmutable return? As far as I can tell making UniqueArray a 
class can't work because by definition you give up controlling how many 
references to the array there could be in a program.

Andrei