[D-runtime] Precise garbage collection
Steven Schveighoffer
schveiguy at yahoo.com
Mon Jun 24 11:59:41 PDT 2013
On Jun 24, 2013, at 2:28 PM, Steven Schveighoffer <schveiguy at yahoo.com> wrote:
> On Jun 24, 2013, at 2:04 PM, Rainer Schuetze <r.sagitario at gmx.de> wrote:
>
>> On 24.06.2013 16:40, Steven Schveighoffer wrote:
>>>
>>> Here is a possibility for rectifying: any time you share immutable
>>> data by reference, the compiler generates a call to some
>>> 'makeImmutableArrayShared' function with the pointer, and that
>>> removes the block's APPENDABLE bit (if part of the GC). That would
>>> make any immutable shared arrays truly immutable, and avoid this
>>> problem.
>>>
>>> Thoughts? Is it possible to intercept the moment immutable data
>>> becomes shared?
>>
>> I think you can make the length update lock- and race-free with a single CAS operation on the length, that pretty much does what is currently done in __setArrayAllocLength. You don't even have to care for problems like ABA, as the size always increases.
>>
>
> This doesn't solve the problem that we cannot use thread-local caching of blockinfos for immutable and const array types (the main source of the speedup).
I should say, we can cache block info lookups for everything that's under 1 PAGE, including shared, since those will never change size (I actually never thought of this before). But we can't cache the lookup for >= 1PAGE because the block's length may change from call to call.
Thinking about this some more, it may be I'm being too conservative.
Consider that:
1. a block will only grow in size. That is, a memory block (not the allocated space) will never be SMALLER than the cached size. I don't think the append runtime will ever "shrink" a block's allocated length.
2. Because I store the allocated size at the front of larger blocks, I am guaranteed that no matter the block size, the location of the "stored data" is constant.
So, I would like to do the following in order:
1. verify with memory gurus, that CAS in this case is safe to do. I'm super-worried about subtle memory issues (your ABA comment scares me)
2. Submit pull request which ignores whether type is shared, and uses CAS to ensure integrity. All blockinfo lookups, including ones to shared data, will go through the cache.
3. In general, I would like the runtime to be generated instead of defined by the compiler. To that end, I think it would be good to have the compiler change array append calls from runtime-only calls with TypeInfo as a parameter to runtime template calls that can possibly optimize.
4. Add special "unshared" appending calls that are called by the template when the type says it's mutable. Call those instead of the CAS version when the slice is mutable.
Can someone answer 1?
Also, does this all make sense to everyone?
-Steve
More information about the D-runtime
mailing list