[D-runtime] Precise garbage collection

Rainer Schuetze r.sagitario at gmx.de
Sat Jun 22 02:22:32 PDT 2013


On 22.06.2013 09:11, Walter Bright wrote:
>
> On 6/21/2013 11:36 AM, Rainer Schuetze wrote:
>> In a nutshell, the implementation uses the RTInfo template to generate
>> a bitmap of booleans that indicate whether the corresponding field in
>> a struct/class might be a pointer or not. For built-in types, this
>> information is predefined in object.d and the typeinfo.ti_* modules.
>> When memory is allocated, the TypeInfo for the allocated type is
>> passed to the garbage collector, which copies the bitmap into memory
>> alongside the pages allocated by the pool (another GCBits member with
>> one bit per word).
>
> I assume you mean 1 bit per (void*).sizeof bytes.

Yes, I meant the "natural machine word" aka size_t.

>> I consider this behaviour a bug that should be fixed, as it also
>> disallows other usages of the RTInfo.
>
> I'm pretty sure it's a bug, too.

I have filed http://d.puremagic.com/issues/show_bug.cgi?id=10442

>
>>
>> There is also an issue with not creating RTInfo for associative array
>> types, but there is an "easy" workaround.

This seems to happen only with my patch to add more debug info and might 
be related to the bug above.

>>
>> b. there are already other application of the RTInfo template, so
>> there should probably be some way to combine multiple "generators". My
>> idea is to let RTInfo!T point to some immutable(RTInfoData) struct
>> that can then have multiple members for different generators. It'd be
>> nice to change the return type of TypeInfo.rtInfo() from void* to
>> immutable(RTInfoData)* to avoid casting.
>
> I don't think the casting is a problem, and I like the self-documenting
> aspect of void* making clear that the compiler doesn't know or care what
> that data actually is.

ok.

>[...]
>>
>> d. The alternative to using a pointer bitmap alongside the pool memory
>> is to store the TypeInfo pointer instead. It's major advantage is that
>> it adds minimal performance overhead to the allocation.
>
> But it'll subtract performance from the scanner. Which is better can
> only be determined with testing.
>
>>
>> d1. This needs more memory for small allocations, but less for larger.
>
> Storing the bitmap is a fixed 128/64 bytes for a page for 32/64 bit
> pointers. Notably for 64 bit pointers, it's only one pointer size. I
> think copying the bitmap is a net win, at least on size.

Don't want to split hairs, but it is the size of 8 64-bit-pointers for a 
page. Still it fits into a cache-line on x86 processors. I think there 
is also quite a bit of optimization potential regarding scanning this 
bitmap in combination with the mark and scan bitmaps. All of these are 
mostly evaluated bit per bit.

>
>
>> If it is stored in the same memory as the allocation itself, it should
>> be at the end to avoid alignment issues (at the beginning it always
>> adds 16 bytes). This would mostly reuse unused memory due to the
>> alignment of allocations to a power of 2. The worst effect would be
>> for allocations of just below or equal to a power of 2. We could
>> mitigate that effect by allowing other sizes of allocations aswell, or
>> by not reserving space for the TypeInfo pointer if the block is
>> allocated NOSCAN.
>
> I think it should be stored separately. Storing it with the allocation
> is inefficient, as (for example) only 2 bits are needed for a 16 byte
> alloc. Also, storing it with the allocation precludes "extending" an
> allocation in place if the next chunk is free.

I did not mean the bitmap here, but the alternative TypeInfo pointer. 
But you are right, extending becomes more complicated. Currently only 
page-sized allocation or larger are extendable, so if these allocations 
have the TypeInfo pointer at the beginning, that should work and would 
blend pretty well with the current array implementation (storing the 
length there aswell).

[...]
>> d3. These leads to the idea to generate a scanning function for each
>> type instead of the pointer bitmap. Depending on the sparseness of
>> pointers this can be shorter or can create a lot of code bloat. As a
>> compromise a generic version might use the pointer bitmap, but it can
>> be overloaded to implement arrays or even unions. On the downside, it
>> makes using std.emplace much more complicated if you want precise
>> scanning.
>
> I spent some time thinking about this a while back. While it is a very
> attractive idea, I suspected the performance would be terrible as it
> would require two or more indirect jumps per chunk. The scanning code
> all needs to be present, in the cache, and predictable for high speed
> scanning.

For data with sparse pointers the generated code might be more 
efficient, but I agree, in the general case, a double indirection for 
every small allocation could be expensive.


[...]
>> Summing up the questions raised:
>>
>> 1. Assuming RTInfo generation is fixed, should we go for adding an
>> indiretion through RTInfoData to allow multiple "generators"?
>
> No. Indirection makes it slower.

When using the copy-bitmap approach, this indirection is done only once 
when allocating, not during scanning. I think this is reasonable to 
allow extendibility. The default implementation might even make 
RTInfoData identical to the bitmap through some aliasing.

>
>>
>> 2. Are you ok with extending the gc_* function with TypeInfo
>> parameters where appropriate?
>
> Yes.
>
>>
>> 3. Do you think the pointer bitmap aside the pool memory and adding
>> gc_emplace is ok? Or should we investigate other alternatives first?
>>
>> 4. Would you object adding TypeInfo_Reference?
>
> Don't have a good answer for that.
>
>>
>> 5. How do you prefer enabling/disabling precise garbage collection?
>> Versioning/linking/runtime?
>
> We should pick precise collection and commit to it. And then add
> Leandro's concurrent collector on top!

Thanks for the feedback.



More information about the D-runtime mailing list