"pause" garbage collection in parallel code

Stephan Schiffels via Digitalmars-d digitalmars-d at puremagic.com
Mon Dec 15 03:12:53 PST 2014


Dear all,

I have a parallel program, using std.parallelism (awesome!), but 
I recently noticed that I achieve very poor performance on many 
CPUs, and I identified the garbage collector as the main cause 
of this. Because my memory usage is quite heavy, the garbage 
collector runs often, and it stops all threads while it runs, 
which increases the total runtime of my program dramatically. 
This is so bad that I actually achieve poorer performance running 
on 20 cores than on 4 cores.

I see several ways to improve my code:
1.) Is there a way to tell the GC the maximum heap size allowed 
before it initiates a collection cycle? Cranking that up would 
cause fewer collection cycles and hence let the program spend 
more time in my multithreaded code.
2.) Is there a way to "pause" GC collection for the parallel 
part of my program, deliberately accepting higher memory usage?
3.) Most of the memory is used by one huge array, so perhaps I 
should simply use malloc and free for that particular array to 
keep the GC from running so often.
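
For option 2, a minimal sketch of what "pausing" could look like, 
using GC.disable and GC.enable from core.memory. The function 
names work and runParallel are hypothetical stand-ins, not taken 
from my actual program. Note that GC.disable only suppresses 
automatic collections; the runtime may still collect if it would 
otherwise run out of memory.

```d
import core.memory : GC;
import std.parallelism : parallel;
import std.range : iota;

// Hypothetical stand-in for the real per-item work.
// It allocates a temporary array on the GC heap.
double work(size_t i)
{
    auto tmp = new double[](1024);
    tmp[] = cast(double) i;
    double sum = 0;
    foreach (x; tmp)
        sum += x;
    return sum;
}

// Runs the parallel section with automatic collections suppressed.
double[] runParallel(size_t n)
{
    auto results = new double[](n);

    GC.disable();        // no automatic collections from here on
    scope (exit)
    {
        GC.enable();     // re-enable even if the loop throws
        GC.collect();    // optional: reclaim what piled up meanwhile
    }

    // Each iteration writes only to its own slot of results.
    foreach (i; parallel(iota(n)))
        results[i] = work(i);

    return results;
}

void main()
{
    auto r = runParallel(1000);
    assert(r.length == 1000);
}
```

Calls to GC.disable may be nested, but each one has to be paired 
with a GC.enable, which is why the sketch uses scope (exit).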

Certainly, options 1 and 2 are "noninvasive", so preferred. Are 
there other ways?
I am a bit surprised that there is no command line option for dmd 
to control the GC maximum heap size. What determines how often 
the GC runs? For example, with GHC in Haskell I can simply set 
the size of the allocation area to 10 MB using the RTS option 
-A10m, which I used in the past to help with exactly the same 
problem and dramatically reduce the frequency of GC collection 
cycles.
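
For option 3 above, a rough sketch of keeping one large array 
outside the GC heap with malloc and free from core.stdc.stdlib. 
The helper names allocBuffer and freeBuffer are hypothetical, and 
the sketch assumes n > 0 and an element type without pointers 
into GC memory (otherwise the block would have to be registered 
with GC.addRange). Whether this actually reduces the number of 
collections in a given program is something to measure, not a 
given.

```d
import core.stdc.stdlib : malloc, free;

// Allocates n doubles outside the GC heap (assumes n > 0).
// The caller owns the memory and must call freeBuffer.
double[] allocBuffer(size_t n)
{
    auto p = cast(double*) malloc(n * double.sizeof);
    if (p is null)
        throw new Exception("malloc failed");

    auto buf = p[0 .. n];   // slice over the raw block, no copy
    buf[] = 0;              // malloc returns uninitialized memory
    return buf;
}

// Releases a buffer obtained from allocBuffer.
void freeBuffer(double[] buf)
{
    free(buf.ptr);
}

void main()
{
    auto big = allocBuffer(1_000_000);
    scope (exit) freeBuffer(big);

    big[42] = 3.5;
    assert(big.length == 1_000_000);
}
```

The slice must not be appended to or resized with ~= or .length, 
since that would move the data onto the GC heap.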

Thanks for help!
Stephan


