More recent work on GC

H. S. Teoh via Digitalmars-d digitalmars-d at puremagic.com
Thu Jan 15 10:54:07 PST 2015


On Wed, Jan 14, 2015 at 08:07:37PM +0000, deadalnix via Digitalmars-d wrote:
> On Wednesday, 14 January 2015 at 18:01:22 UTC, H. S. Teoh via Digitalmars-d
> wrote:
> >Recently in one of my projects I found that I can gain a huge
> >performance improvement just by calling GC.disable() at the beginning
> >of the program and never calling GC.enable() again, but instead
> >manually calling GC.collect() at strategic points in the code.
> >Obviously, YMMV, but I managed to get a 40% performance improvement,
> >which is pretty big for such a relatively simple change.
> >
> 
> Interesting that you need to disable to get the effect. That means our
> heuristic for when GC collection kicks in sucks quite badly.

Well, I'm not sure what the real cause is, but what happened was that I
was working on optimizing performance, and gprof indicated that a lot of
time was being spent in the GC collection cycle. That led me to a lot of
needless GC allocations which, once I eliminated them, netted me a huge
performance boost. However, I noticed that a lot of time was still being
spent in the GC collection cycle -- less than before, but still a big
chunk of my running time. So as an experiment I decided to turn off the
GC completely to see what would happen -- and found that running times
improved by 40-50%, which is pretty huge!

Of course, that also meant I was leaking memory and the program was
soaking up too much RAM, so my second thought was to still run GC
collection cycles, but at a much reduced frequency. This is specific to
my program's memory usage pattern (an ever-increasing number of
allocations that remain live until the end of the program, plus a
comparatively much smaller number of temporary allocations that need to
be cleaned up every now and then to keep total memory use under
control); I'm not sure how generally applicable it is. In my particular
case, one of the major factors in poor GC performance was the growing
bulk of allocations that are known to remain live until the end of the
program, which the GC must nevertheless scan on every collection cycle
because it doesn't know that most of them will stay live for a long
time. Consequently, collection cycles become slower and slower as the
program progresses, with most of the work being unnecessary since the
growing bulk of allocations isn't going away anytime soon.
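
For anyone curious, the pattern is roughly the following. This is just a
minimal sketch, not my actual code: the every-N-iterations threshold and
the doWork() helper are made-up placeholders, and the right collection
interval obviously depends on the program's allocation rate.

    import core.memory : GC;

    void doWork(int i)
    {
        // Stand-in for the real per-iteration work that performs the
        // GC allocations.
        auto tmp = new int[](16);
        tmp[] = i;
    }

    void main()
    {
        GC.disable();            // no automatic collections from here on
        scope(exit) GC.enable(); // restore normal behaviour on exit

        enum collectEvery = 100_000; // made-up interval; tune to the workload

        foreach (i; 0 .. 1_000_000)
        {
            doWork(i);

            // Collect only at strategic points instead of letting the GC
            // decide, so the ever-growing set of live allocations isn't
            // rescanned every time the heap grows a little.
            if (i != 0 && i % collectEvery == 0)
                GC.collect();
        }
    }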

This problem would be instantly solved by a generational GC, since after
a few cycles most of the long-lived allocations would get promoted to
the oldest generation, and the young-generation collection cycles
wouldn't be bogged down scanning them unnecessarily.

I'm not holding my breath for D to get a generational GC, though. :-P

Alternatively, since I already know exactly which allocations are going
to persist until the end, I could just use malloc instead. However, this
is a bit annoying to implement, since these allocations come from a
(very large) AA that I keep adding stuff to (nothing is ever removed).
But since I'm already working on replacing this AA with something else
with better cache-friendliness (and also disk-cacheability, to transcend
current memory limitations), there's no point trying to improve AA
performance at this time.
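
In case anyone wants the general idea of the malloc route, it would look
something like the following. Record and makeRecord are made up purely
for illustration; they're not from my program.

    import core.memory : GC;
    import core.stdc.stdlib : malloc;

    struct Record
    {
        ulong key;
        double[4] data;   // plain values only: no GC pointers inside
    }

    Record* makeRecord(ulong key)
    {
        // Place the long-lived record outside the GC heap so collection
        // cycles never have to scan it.
        auto p = cast(Record*) malloc(Record.sizeof);
        assert(p !is null);
        p.key = key;
        p.data[] = 0.0;

        // If Record ever held references into the GC heap (class refs,
        // slices, etc.), the block would have to be registered so the GC
        // keeps those targets alive:
        //     GC.addRange(p, Record.sizeof);

        return p;   // never freed: intentionally live until program exit
    }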


T

-- 
Truth, Sir, is a cow which will give [skeptics] no more milk, and so they are gone to milk the bull. -- Sam. Johnson

