[D-runtime] Precise garbage collection

Steven Schveighoffer schveiguy at yahoo.com
Sat Jun 22 15:48:46 PDT 2013


On Jun 22, 2013, at 2:49 AM, Rainer Schuetze <r.sagitario at gmx.de> wrote:
> 
> 
>> 
>> This can be a problem if you have two threads looking at the block at
>> the same time.  One can get the block info, release the GC lock, then
>> the other could extend the block.  By the time the first thread comes
>> back to look at the "end" of the block (which is checked while
>> holding a different lock), the block info is no longer valid, and it
>> would look at the wrong place.  I think for unshared blocks, there
>> would be no problem, but block can become shared/unshared via a cast,
>> and that would cause problems.
> 
> I was a bit surprised to find the special casing for shared/non-shared arrays in the array appending code, given the fact that the absence of "shared" does not guarantee it is not shared. I'm not sure whether we have to still guarantee memory safety in the case of undeclared sharing, but I'd be a bit more comfortable if we could (if it doesn't involve a global lock). I don't have a better solution, though.

The only way the absence of "shared" allows shared data is with __gshared fields or casting.  This is a "you better know what you are doing" storage class, and so is not handled properly in the runtime.  Note that the reason appending is palatable is due to the thread-local storage cache for the last 8 appended-to arrays.  If this wasn't present, the update to the runtime that prevented stomping would be less performant than before.

The shared version attempts to do the best it can, and sacrifices performance for correctness (the obvious choice).

However, __gshared is not detectable, so you should never try to append to a __gshared array from more than one thread.  If you did, the threads would use their thread-local caches, and have possibly different cached typeinfo values.  Not to mention the race condition for writing the 'used' size :)

If you cast away shared, but don't ensure the data becomes actually unshared, you are in bad territory too.

On Jun 22, 2013, at 5:22 AM, Rainer Schuetze <r.sagitario at gmx.de> wrote:

> On 22.06.2013 09:11, Walter Bright wrote:
>> I think it should be stored separately. Storing it with the allocation
>> is inefficient, as (for example) only 2 bits are needed for a 16 byte
>> alloc. Also, storing it with the allocation precludes "extending" an
>> allocation in place if the next chunk is free.
> 
> I did not mean the bitmap here, but the alternative TypeInfo pointer. But you are right, extending becomes more complicated. Currently only page-sized allocation or larger are extendable, so if these allocations have the TypeInfo pointer at the beginning, that should work and would blend pretty well with the current array implementation (storing the length there aswell).

I agree, if you store simply a pointer to a typeinfo, you can use the first 16 bytes, just like the array appending does.  In fact, that makes things more uniform.

Note, you can have a mix of solutions -- for up to 16*sizeof(void*) bytes allocations, store bitmaps.   For everything else, use a typeinfo.  Both solutions could try off-page or in-block.  You can use the same philosophy as array append code -- for non-extendable GC blocks (i.e. < PAGE size) place at the end, for PAGE size and above, place at the front.

I would encourage an implementation of all options (if you are so inclined), and see which one, or which mix, performs better.  It may be an easy call at that point.

-Steve


More information about the D-runtime mailing list