GCs in the news

w0rp via Digitalmars-d digitalmars-d at puremagic.com
Thu Jul 17 05:37:09 PDT 2014


The key to making D's GC acceptable lies in two factors I believe.

1. Improve the implementation enough so that you will only be 
impacted by GC in extermely low memory or real time environments.
2. Defer allocation more and more by using ranges and algorithms 
more, and trust that compiler optimisations will make these fast.

The big, big offender I believe for extra allocations is 
functions which return objects, rather than functions which write 
to output ranges. The single most common occurence of this is 
something like this is toString. Instead of writing this...

string toString() {
     // Allocations the user of the library has no control over.
     return foo.toString() ~ bar.toString() ~ " something else";
}

I believe you should always, always instead write this.

// I left out the part with different character types.
void writeString(OutputRange)(OutputRange outputRange)
if (isOutputRange!(OutputRange, char)) {
     // Allocations controlle by the user of the library,
     // this template could appear in a @nogc function.
     foo.writeString(outputRange);
     bar.writeString(outputRange);

     "something else".copy(outputRange);
}

It's perhaps strange at first because you're pre-programmed from 
other languages, except maybe C++ which uses output streams, to 
always be allocating temporary objects everywhere, even if all 
you are doing is writing them to an object.

For improving the GC to an acceptable level, I believe collection 
only needs to execute fast enough such that it will fit within a 
frame comfortably. So for something rendering at 60FPS you have 1 
second / 60 frames ~= 16.6 milliseconds of computation you can do 
without resulting in a single dropped frame. That means you need 
to get collection down to something in the 1ms to 2ms region. At 
which point collection time will only impact something which is 
really pushing the hardware, which would exclude most mobile 
video games, which are about the complexity of Angry Birds.

I firmly believe there's no silver bullet for automatic memory 
management. Reference counting solutions, including automatic 
reference counting, will consume less memory than a garbage 
collector and offer more predictable collection times, but do so 
at the expense of memory safety and simplicity. You need fatter 
pointers to manage the reference counts, and you need to 
carefully deal with reference cycles.

In addition, you cannot easily share slices of memory with 
reference counting, which is an advantage of garbage collection. 
With GC, you can allocate a string, slice a part of it, hand over 
the slice to some other object, and you know that the slice will 
stay around for as long as it's needed. With reference counting, 
you have to either retain the slice and the whole segment in the 
same way and allow for the possibility of hidden cycles, or 
disallow slicing and create copies instead. Slicing in GC is 
important, because you can create much more efficient programs 
which take slices based on regex, which we do right now.

For the environments which cannot tolerate collection whatsoever, 
like Sociomantic's real time bidding operations, then control of 
allocation will have to be left to the user. This is where the 
zero allocation idea behind ranges and algorithms comes into 
play, because then the code which doesn't allocate, which could 
potentially be all of std.algorithm, can still be used in those 
environments, rather than being rendered unusable.

There's my thoughts on it anyway. I probably rambled on too long.


More information about the Digitalmars-d mailing list