Quantifying the Performance of Garbage Collection vs. Explicit Memory Management

Frits van Bommel fvbommel at REMwOVExCAPSs.nl
Wed Mar 19 14:41:02 PDT 2008


Sean Kelly wrote:
> == Quote from Frits van Bommel (fvbommel at REMwOVExCAPSs.nl)'s article
>> [1]: IIRC: When the OS indicates to their GC that it's about to swap
>> some of their pages out, it starts a collection and attempts to move the
>> live objects from multiple partially-used pages into fewer pages so the
>> now-free pages can be released to the OS to prevent the swapping from
>> taking place (or swap out the now-empty page they'll then avoid using,
>> I'm not sure exactly).
>> The cool thing is that they apparently managed to get that working
>> pretty well while mostly avoiding pages being swapped back in because of
>> collection activity.
> 
> Ah nice.  What I remember about the papers I read was that they simply scanned through pages being
> swapped out to bookmark any data being referenced.  However, it seems like doing things this way
> would require every pointer to be evaluated before being followed to avoid hitting an inactive page.

Yes, they'd need to locate the metadata for the memory block a pointer 
refers to. I remember it being stored in the first page of the 
'superpage' containing the block (a superpage is an aligned block of 2^N 
pages for some constant N, essentially a pool of memory blocks of a 
certain size with some metadata[1] in front of it).
They set it up so that those first pages were never allowed to be 
swapped out. Since superpages are aligned, a pointer to a GC-ed object 
can be converted into a superpage pointer by simply masking out the 
lower bits, but they still need to verify that it is in fact a valid 
superpage (i.e. that it *is* a GC-ed object), which requires some sort 
of lookup (a hashtable?) on that pointer. IIRC they used 16-page 
superpages, requiring one lookup table entry per 64 KiB allocated 
(minus superpage metadata).


[1]: IIRC they considered complete metadata (i.e. storing for each 
object the number of references from swapped-out pages) to be too 
costly, so they instead implemented a partial scheme: a single 
reference count per page (or superpage, I'm not sure) and a per-object 
bit recording "was this ever referenced from swap", only clearing 
those bits when the reference count for the containing (super?)page 
reached zero.
They then only allowed the GC to move objects whose bit was clear. 
(Objects could still be moved just before their bit was set, though, 
so they could be grouped into pages that already contained such 
objects, in an attempt to limit the number of such pages. This gives 
the GC more choice when it comes time to decide which pages to empty 
when another swap-out is imminent.)



More information about the Digitalmars-d mailing list