Introducing Sampling to the GC

Etienne Cimon via Digitalmars-d digitalmars-d at puremagic.com
Fri May 23 14:14:45 PDT 2014


I've run some benchmarks and found that each (costly) GC collection 
frees only about 0.7% of an application's used memory, measured as the 
contents of the GC page bins.

I made some tools to come up with those statistics, available with a 
patched druntime:

https://github.com/D-Programming-Language/druntime/pull/803

My proposal is to implement pointer sampling in the GC (using hypothesis 
testing with hypergeometric or Poisson distributions) to tune this 
collection efficiency. The idea is to be able to specify what 
percentage we'd like the GC to sweep on average in each cycle, so that 
these cycles run less frequently.
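
To make the hypothesis-testing part concrete, here is a minimal sketch in 
Python (not D, and not code from the druntime patch). It assumes the GC 
could sample some blocks without replacement and count how many of them 
are dead; how it would learn that cheaply is left open. The names 
hypergeom_sf and should_collect, the alpha threshold, and all the numbers 
are hypothetical, chosen only for illustration.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items without replacement from a
    population of N items that contains K 'dead' ones."""
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible combinations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def should_collect(total_blocks, sampled, dead_in_sample,
                   target_fraction, alpha=0.05):
    """One-sided test of H0: 'the dead fraction is at most target_fraction'.

    Returns True when the sample contains so many dead blocks that H0 is
    rejected at significance level alpha, i.e. a full collection is
    likely to free more than the target.
    """
    # Number of dead blocks at the boundary of H0 (hypothetical rounding).
    dead_under_h0 = int(target_fraction * total_blocks)
    p_value = hypergeom_sf(dead_in_sample, total_blocks,
                           dead_under_h0, sampled)
    return p_value < alpha


if __name__ == "__main__":
    # Hypothetical heap: 1000 blocks, 100 sampled, 5% target.
    print(should_collect(1000, 100, 30, 0.05))  # many dead blocks in sample
    print(should_collect(1000, 100, 1, 0.05))   # almost none dead
```

With a 5% target, a sample of 100 blocks containing 30 dead ones rejects 
H0 and would trigger a collection, while a sample containing a single 
dead block would not. Whether such a sample can be taken without doing 
most of a mark phase is exactly the open question.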

I'm still looking to challenge this idea with someone who is 
knowledgeable in probability and statistics and/or quality assurance. 
Does anyone think my time would be wasted if I added it? Would this 
conflict with a semi-precise GC?

