Garbage collection, and practical strategies to avoid allocation

Manu turkeyman at gmail.com
Fri May 31 19:02:53 PDT 2013


So let's talk about garbage collection, and practical strategies to avoid
allocation.

GC-related discussions come up basically every day, perhaps multiple times
a day on IRC, and the recent reddit 2.063 release thread is dominated by
C++ programmers who are keenly interested in D, but are scared by the GC.
I can say with confidence, as someone who has gone out on a limb and
actually invested a lot of time and energy in D, I'm as nervous (or more
so) as they are, and feel their turmoil deeply!

So where are we?

I can only speak from my industry's perspective. As I see it, we are here:
 - Stopping the world is unacceptable
 - Scanning the whole heap is costly
 - It also seems extremely wasteful to scan the whole heap when the overall
temporal stability of a realtime heap is extremely high (just a few temps
allocated frame-to-frame on average, and mostly on the stack!)
 - GC runs at unpredictable moments
 - GC collection cycles become more and more frequent the less free memory
headroom you have. Hint: video games usually run within a few kB of the
system's available memory. How often will full heap-scanning collections be
issued to collect a couple of transient/temporary allocations when there
are only a few kB of free memory? Conceivably, more than once per frame...
 - In a low-free-memory environment, what is the cumulative effect of
fragmentation? Can this be measured, or will it be a nasty surprise 2
months from shipping a 20-million dollar project? (Hint: a discovery of
this sort could very well ruin a project and destroy a company)
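
To put a rough number on the "more than once per frame" worry, here is a
back-of-envelope sketch in C++. It assumes a deliberately simple model that
is NOT a description of D's actual collector: a collection is triggered
every time the free headroom is exhausted, and each collection recovers all
of the transient garbage. The function name and the figures are
hypothetical.

```cpp
#include <cassert>

// Simplified model (an assumption, not D's real GC policy): a collection
// fires each time allocations consume the free headroom, and each one
// reclaims all transient garbage. Under that model the number of
// full-heap collections per frame is just the ratio of the two sizes.
double collectionsPerFrame(double transientBytesPerFrame, double freeBytes) {
    assert(freeBytes > 0.0);
    return transientBytesPerFrame / freeBytes;
}
```

Under this model, 64 kB of temporaries per frame against 16 kB of headroom
means 4 full-heap scans per frame, while the same temporaries against
plentiful headroom round down to almost none, which is why the problem
stays hidden on a PC.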

Basically nobody will have experienced these issues to their fullest extent
on a PC with plentiful memory. But they must be considered if the audience
still stuck in C++ (who I predict are D's greatest potential user-base) is
to take D seriously.

While I do think a sufficiently advanced GC might satisfy the realtime
environment, the more I think about it, the more I am thinking a GC is not
applicable to the embedded (or memory limited) environment.

So what options exist?

I'm thinking more and more that I like the idea of a ref-counting GC.
People argue that managing ref-counts may be slower, and perhaps that's
true: it requires a small amount of localised overhead. But if allocation
frequency is low, it shouldn't be much.
I think the key advantages though are:
 - determinism, memory will be immediately freed (or perhaps deferred by a
small but predictable amount of time, let's say, 1 frame)
 - elimination of full-heap scans, which take the most time
 - the refcount table is self-contained, and won't thrash the dcache the
way a heap scan does
 - less tendency to fragment, since transient allocations can come into and
leave existence before something else allocates beside them
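
As a rough illustration of the "deferred by 1 frame" idea, here is a
minimal sketch in C++ (the language the target audience already knows). It
is not an existing D runtime facility; RefCounted, DeferredFreeList and
endFrame are hypothetical names, and it is single-threaded with no cycle
detection.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical base class: each object carries its own reference count.
struct RefCounted {
    int refs = 0;
    bool queued = false;  // true while the object sits in the dead list
    virtual ~RefCounted() = default;
};

// Hypothetical per-frame queue of objects whose count has dropped to zero.
class DeferredFreeList {
public:
    void retain(RefCounted* obj) { ++obj->refs; }

    // Dropping the last reference queues the object instead of freeing it
    // on the spot, so the cost lands at a predictable point in the frame.
    void release(RefCounted* obj) {
        assert(obj->refs > 0);
        if (--obj->refs == 0 && !obj->queued) {
            obj->queued = true;
            dead_.push_back(obj);
        }
    }

    // Call once per frame. Frees everything still at zero references and
    // returns how many objects were destroyed.
    std::size_t endFrame() {
        std::size_t freed = 0;
        // Index loop: a destructor may release other objects and grow dead_.
        for (std::size_t i = 0; i < dead_.size(); ++i) {
            RefCounted* obj = dead_[i];
            if (obj->refs == 0) {
                delete obj;
                ++freed;
            } else {
                obj->queued = false;  // re-retained since it was queued
            }
        }
        dead_.clear();
        return freed;
    }

private:
    std::vector<RefCounted*> dead_;
};
```

The work done in endFrame is proportional to the number of objects that
died this frame, not to the size of the heap, which is the whole point.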

But this is only part of the problem.
Naturally, the fastest allocation is the one that never happened.
So an equally, or even more important aspect of the puzzle is offering
clearly documented advice, and convenient syntax/patterns on how to avoid
allocation in general.
It should be made EASY to avoid allocation. This way people will tend to do
it by habit, further alleviating the problem.
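
One pattern that makes avoiding heap allocation easy for per-frame
temporaries is a bump arena that is reset once per frame. The sketch below
is C++ and purely illustrative; FrameArena is a hypothetical name, not
something D or Phobos provides, and it runs no destructors, so it only
suits plain data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-frame scratch arena: one up-front allocation, then
// every temporary is a pointer bump, and the whole lot is released by
// resetting a single counter at the end of the frame.
class FrameArena {
public:
    explicit FrameArena(std::size_t capacity) : buffer_(capacity), used_(0) {}

    // align must be a power of two. Returns nullptr when the arena is full.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        std::uintptr_t base = reinterpret_cast<std::uintptr_t>(buffer_.data());
        // Round the next free address up to the requested alignment.
        std::uintptr_t start =
            (base + used_ + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
        std::size_t offset = static_cast<std::size_t>(start - base);
        if (offset > buffer_.size() || size > buffer_.size() - offset)
            return nullptr;
        used_ = offset + size;
        return buffer_.data() + offset;
    }

    // Frees every temporary at once; call at the end of each frame.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t used_;
};
```

Because the arena never returns memory piecemeal, it cannot fragment, and
its worst-case footprint is fixed when it is constructed.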

I've made the case that libraries should avoid surprise allocations at all
costs. Maybe this leads back to @nogc conversations (a concept I'm not
personally sold on), but something needs to be done, and best
practices/conventions need to be defined.
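
One candidate convention, sketched here in C++ as an assumption rather than
anything agreed on: library routines write into storage the caller supplies
and report the size they need, instead of returning freshly allocated
results. toUpperInto is a hypothetical function invented for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical library routine: upper-cases src into caller-owned storage.
// Returns the number of bytes required, including the terminator. If that
// exceeds cap, nothing is written and the caller can retry with a bigger
// buffer. The routine itself never touches the heap.
std::size_t toUpperInto(const char* src, char* out, std::size_t cap) {
    assert(src != nullptr && out != nullptr);
    std::size_t needed = std::strlen(src) + 1;
    if (needed > cap) return needed;
    for (std::size_t i = 0; i + 1 < needed; ++i) {
        out[i] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(src[i])));
    }
    out[needed - 1] = '\0';
    return needed;
}
```

With this shape the caller decides where the memory comes from (stack,
arena, or heap), so there is no surprise allocation hiding inside the
library.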

So I'd go so far as to say, perhaps these 2 points should be considered as
key goals for 2.064?
  * find a solution for deterministic embedded garbage collection
  * decide some realistic best practices/conventions for reliably (and
conveniently!) avoiding allocations

Showing progress on these will show real progress on D for ex-C++ users. In
my time using D, there has been exactly ZERO progress on the GC issue,
which is discouraging to say the least, perhaps even kinda scary.
(fortunately, people are thinking about it, and there were 2 great talks at
DConf, but I don't think either addresses the specific issues I raise)

Discuss... (or perhaps, "destroooy")