[D-runtime] Precise garbage collection

Leandro Lucarella leandro.lucarella at sociomantic.com
Fri Aug 9 02:55:41 PDT 2013


On Fri, Aug 09, 2013 at 10:47:33AM +0200, Rainer Schuetze wrote:
> On 08.08.2013 16:06, Leandro Lucarella wrote:
> >On Wed, Aug 07, 2013 at 10:42:25PM +0200, Rainer Schuetze wrote:
> >>It currently even fails to compile the druntime unittests because of
> >>the reported problems with the RTInfo generation. Maybe it is better
> >>to let it fail in public ;-)
> >
> >:O
> >I didn't even try to compile it, to be honest, I just looked at the
> >code. Are you saying it's not really working as it is now? I'm sorry,
> >I'm not familiar with all the RTInfo stuff.
> 
> Unfortunately, for non-trivial programs it does not work out of the
> box, because you easily get link errors, mostly with respect to
> associative arrays. You can usually work around these by adding
> aliases to the corresponding AssociativeArray(Key,Value) template
> somewhere. I recently managed to reduce it to something manageable:
> 
> http://d.puremagic.com/issues/show_bug.cgi?id=10786
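
For anyone else running into this, the workaround is roughly a one-liner
somewhere in the program. A sketch, assuming the missing instantiation is
uint[uint]; use whatever Key/Value types the linker complains about:

    // force the compiler to emit the AA template instantiation
    // that the linker reports as missing (Key/Value are examples)
    alias AssociativeArray!(uint, uint) _workaroundUintUintAA;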
> 
> The worse issue is the silent non-generation of the RTInfo pointer,
> though the compiler replaces it with 0/1, which can at least be used
> for conservative scanning.
> 
> http://d.puremagic.com/issues/show_bug.cgi?id=10442

Oh! Bummer, I was under the impression that you had this working at some
point. 
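
For what it's worth, I assume the scanner then has to special-case those
0/1 values anyway. Something like this sketch (scanConservatively and
scanPrecisely are made-up names):

    // fall back gracefully when RTInfo was not generated:
    // 0 = no pointers at all, 1 = layout unknown, scan everything
    void scanBlock(void* base, size_t size, immutable(void)* rtInfo)
    {
        if (rtInfo is cast(immutable(void)*) 0)
            return;                         // NO_SCAN, nothing to do
        else if (rtInfo is cast(immutable(void)*) 1)
            scanConservatively(base, size); // every word is a potential pointer
        else
            scanPrecisely(base, size, cast(immutable(size_t)*) rtInfo);
    }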

> >>I also wanted to implement a different version for comparison that
> >>stores the TypeInfo pointer in the allocation. I have a first
> >>attempt, but it is quite a bit slower, as it has to do some dynamic
> >>casting during scanning because of the way arrays are implemented.
> >>Some optimizations are needed to make an unbiased comparison, but
> >>other stuff distracted me from doing this.
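
If I understood correctly, that variant has to do something like this for
every scanned block (just a sketch of what I gathered; typeOfBlock is a
made-up name, and the dynamic cast is the expensive part you mention):

    // recover the (element) type from a TypeInfo stored in the
    // allocation itself; arrays need unwrapping via a dynamic cast
    const(TypeInfo) typeOfBlock(void* base, size_t blockSize)
    {
        auto ti = *cast(TypeInfo*)(cast(ubyte*) base + blockSize
                                   - (void*).sizeof);
        if (auto tia = cast(TypeInfo_Array) ti) // dynamic cast, not free
            return tia.next;                    // the element type
        return ti;
    }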
> >
> >The old precise scanning patch I used with the concurrent collector did
> >that, and the overhead was considerable:
> >http://www.llucax.com.ar/blog/blog/post/098ceae8
> >(and http://www.llucax.com.ar/blog/blog/post/1490c03e also analyzes the
> >performance, but it is a little confusing; I'm not sure what happened
> >there because what I said in the text is not really reflected in the
> >graphs :P).
> 
> Thanks for the links, I think I already saw them long ago but reread
> them now. I'm not sure we can blame the precise scanning for
> increasing the size of an allocation dramatically by adding one
> pointer, causing the allocations to use the next bin of twice the
> size. This only happens for allocations that happen to match the bin
> sizes (or are designed with these GC internals in mind), for others
> the overhead for adding the pointer is zero.

Yes, I know, but the values in the first post show that, at least for
those cases, it makes a difference. And I think for small sizes it is
very likely that you have objects whose size exactly matches a bin
(especially 16 and 32). When objects are larger, the chances of matching
an exact bin size are smaller.
Maybe a hybrid model would be the best approach.
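
To put numbers on it (a sketch using the current power-of-2 bins, which
start at 16 bytes):

    // smallest bin that fits n bytes: 16, 32, 64, ...
    size_t binSize(size_t n)
    {
        size_t b = 16;
        while (b < n) b *= 2;
        return b;
    }
    // 32-bit examples:
    //   16-byte object + 4-byte TypeInfo* -> binSize(20) = 32 (2x waste)
    //   24-byte object + 4-byte TypeInfo* -> binSize(28) = 32 (no change)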

And then, having to read the memory block could have bad cache effects,
especially for objects that are NO_SCAN, which would otherwise never be
read from memory at all. Cache-wise, I think bitmaps should behave
better.
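
Just to be explicit about what I mean by bitmaps (a sketch: one bit per
word, stored out of band with the rest of the pool metadata):

    // out-of-band pointer bitmap for a pool; scanning consults only
    // this side table, so a NO_SCAN object's own cache lines are
    // never touched
    struct PointerMap
    {
        size_t[] bits; // bit i set => word i may hold a pointer

        bool mayBePointer(size_t wordIdx) const
        {
            enum bitsPerWord = size_t.sizeof * 8;
            return ((bits[wordIdx / bitsPerWord]
                     >> (wordIdx % bitsPerWord)) & 1) != 0;
        }
    }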

> In general we are already wasting 25% on average with these bin
> sizes that are a power of 2; we could mitigate the problem by adding
> some intermediate bin sizes.

True, but that's a different issue and not a justification to waste even
more space :P
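
The 25% figure itself checks out, by the way: assuming request sizes
uniform in (b/2, b], the expected slack is about b/4 of each bin. A quick
sanity check:

    // average waste of a power-of-2 bin for uniformly distributed
    // request sizes in (b/2, b]
    import std.stdio;
    void main()
    {
        enum b = 64;
        double slack = 0;
        foreach (s; b / 2 + 1 .. b + 1)
            slack += b - s;
        writefln("average slack: %.1f%% of the bin",
                 100.0 * slack / (b / 2) / b); // prints ~24%
    }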

> One test I've run to measure the performance of the GC is testgc3 from
> the test suite, which just adds entries to a uint[uint] associative
> array.
> A node in the hash list is 16 bytes for 32-bit processes, but 36
> bytes for 64-bit processes due to pessimistic alignment assumptions.
> This causes 64-byte allocations by the GC, with horrible results for
> both memory and performance for the 64-bit version in comparison
> with the 32-bit version.

:S
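
For context, the node in question looks roughly like this (field names
from memory; the padding is the "pessimistic alignment" you mention):

    // sketch of an aaA.d bucket entry for uint[uint]; key and value
    // are stored inline after the header
    struct Entry
    {
        Entry* next; // 4 bytes on 32-bit, 8 on 64-bit
        size_t hash; // 4 / 8
        // + uint key and uint value, padded pessimistically:
        //   16 bytes total on 32-bit (a perfect fit for the 16-byte bin),
        //   36 bytes on 64-bit, which spills into the 64-byte bin
    }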

-- 
Leandro Lucarella
Senior R&D Developer
Sociomantic Labs GmbH <http://www.sociomantic.com>

