[D-runtime] Precise garbage collection

Rainer Schuetze r.sagitario at gmx.de
Mon Jun 24 23:18:06 PDT 2013


On 24.06.2013 20:59, Steven Schveighoffer wrote:
>
> On Jun 24, 2013, at 2:28 PM, Steven Schveighoffer
> <schveiguy at yahoo.com> wrote:
>
>> On Jun 24, 2013, at 2:04 PM, Rainer Schuetze <r.sagitario at gmx.de>
>> wrote:
>>
>>> On 24.06.2013 16:40, Steven Schveighoffer wrote:
>>>>
>>>> Here is a possibility for rectifying: any time you share
>>>> immutable data by reference, the compiler generates a call to
>>>> some 'makeImmutableArrayShared' function with the pointer, and
>>>> that removes the block's APPENDABLE bit (if part of the GC).
>>>> That would make any immutable shared arrays truly immutable,
>>>> and avoid this problem.
>>>>
>>>> Thoughts?  Is it possible to intercept the moment immutable
>>>> data becomes shared?
>>>
>>> I think you can make the length update lock- and race-free with a
>>> single CAS operation on the length, that pretty much does what is
>>> currently done in __setArrayAllocLength. You don't even have to
>>> care for problems like ABA, as the size always increases.
>>>
>>
>> This doesn't solve the problem that we cannot use thread-local
>> caching of blockinfos for immutable and const array types (the main
>> source of the speedup).
>
> I should say, we can cache block info lookups for everything that's
> under 1 PAGE, including shared, since those will never change size (I
> actually never thought of this before).  But we can't cache the
> lookup for >= 1PAGE because the block's length may change from call
> to call.
>
> Thinking about this some more, it may be I'm being too conservative.
>
> Consider that:
>
> 1. a block will only grow in size.  That is, a memory block (not the
> allocated space) will never be SMALLER than the cached size.  I don't
> think the append runtime will ever "shrink" a block's allocated
> length.
> 2. Because I store the allocated size at the front of larger
> blocks, I am guaranteed that no matter the block size, the location
> of the "stored data" is constant.
>
> So, I would like to do the following in order:
>
> 1. verify with memory gurus, that CAS in this case is safe to do.
> I'm super-worried about subtle memory issues (your ABA comment scares
> me)

I don't know if I qualify, but I think it is the job of the operations 
in core.atomic to handle the memory issues of the architecture, so the 
CAS operation should be safe to use. Regarding ABA when shrinking: you 
are in trouble anyway if you shrink the array and append to it at the 
same time from different threads. I recommend throwing an 
InvalidMemoryError whenever appending to an array whose actual 
allocation size is smaller than the array.
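To make the growth-only CAS concrete, here is a minimal C++ sketch 
(std::atomic stands in for core.atomic; BlockHeader and tryGrow are 
hypothetical illustrations, not druntime code):

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical block header: the used length stored at the front of a
// large allocation, as discussed above.
struct BlockHeader {
    std::atomic<std::size_t> used{0}; // currently used bytes of the block
};

// Try to extend the used length from `expected` to `newLen`.
// Returns true on success; returns false if another thread grew the
// block first, in which case the caller must re-read and retry.
bool tryGrow(BlockHeader& h, std::size_t expected, std::size_t newLen)
{
    assert(newLen >= expected); // the length only ever increases
    return h.used.compare_exchange_strong(expected, newLen,
                                          std::memory_order_acq_rel,
                                          std::memory_order_acquire);
}
```

Because the length is monotonically increasing, a failed CAS can only 
mean another append won the race, never that the same value was seen 
twice with different meanings; that is why ABA does not bite here.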

> 2. Submit pull request which ignores whether type is shared, and
> uses CAS to ensure integrity.  All blockinfo lookups, including ones
> to shared data, will go through the cache.

Sounds good. I think we should also look into making gc_query faster; 
for example, I'm not sure it needs the global GC lock.
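As a rough C++ sketch of the thread-local caching idea for small blocks 
(slowQuery, cachedQuery, BlkInfo and the 4 KiB page size are illustrative 
stand-ins, not the actual gc_query implementation):

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-in for the runtime's block-info record.
struct BlkInfo {
    void*       base = nullptr;
    std::size_t size = 0;
};

std::mutex gcLock; // stands in for the global GC lock

// Dummy full lookup under the lock; the real one would walk pools.
BlkInfo slowQuery(void* p)
{
    return BlkInfo{p, 64}; // pretend: a 64-byte block based at p
}

// Per-thread cache: safe without atomics because each thread only
// touches its own copy. Only blocks below one page are cached, since
// their size can never change (the point made above).
BlkInfo cachedQuery(void* p)
{
    thread_local std::unordered_map<void*, BlkInfo> cache;
    auto it = cache.find(p);
    if (it != cache.end())
        return it->second; // hit: no lock taken at all

    std::lock_guard<std::mutex> g(gcLock);
    BlkInfo b = slowQuery(p);
    if (b.size < 4096) // assume 4 KiB pages
        cache[p] = b;
    return b;
}
```

The second lookup for the same pointer never touches the lock, which is 
where the speedup mentioned earlier comes from.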

> 3. In general, I would
> like the runtime to be generated instead of defined by the compiler.
> To that end, I think it would be good to have the compiler change
> array append calls from runtime-only calls with TypeInfo as a
> parameter to runtime template calls that can possibly optimize.

That's what I was hoping for, too. But I guess it will have to wait 
until the transformation of the associative arrays succeeds.
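To illustrate the difference between a type-erased runtime call and a 
templated one, a hedged C++ analogy (appendErased and appendTyped are 
hypothetical stand-ins for TypeInfo-based vs. templated runtime hooks):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Type-erased style: the element is just a pointer plus a size known
// only at run time, so the copy cannot be specialized per type
// (analogous to passing a TypeInfo to the runtime).
void appendErased(std::vector<char>& buf, const void* elem,
                  std::size_t elemSize)
{
    const char* p = static_cast<const char*>(elem);
    buf.insert(buf.end(), p, p + elemSize);
}

// Template style: the element type is known at compile time, so the
// compiler can inline and optimize each instantiation -- the idea
// behind moving array append to runtime template calls.
template <typename T>
void appendTyped(std::vector<char>& buf, const T& elem)
{
    const char* p = reinterpret_cast<const char*>(&elem);
    buf.insert(buf.end(), p, p + sizeof(T));
}
```

Both produce the same bytes, but only the templated form gives the 
compiler something it can optimize per element type.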
