Current State of the GC?

Jonathan M Davis via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Thu Oct 13 04:55:50 PDT 2016


On Monday, October 10, 2016 21:12:42 Martin Lundgren via Digitalmars-d-learn 
wrote:
> I've been reading up a bit on the D garbage collector. Seen
> mostly negative things about it. I've also seen a lot of
> proposals and what not, but not much about the current state of
> things.
>
> The latest page I can find about it is 2015H1. It mentions
> improving the GC and making libraries less reliant on it.
> However, I can't find *any* information about what GC
> improvements have been made. No up to date performance
> comparisons, etc.
>
> So what's been happening in memory management land lately? Bad GC
> seems like one of Dlang's weak points, so showing improvements
> here could definitely bring more people in.

The GC has had various improvements made to it over the last couple of
years, but the folks doing it haven't really been advertising what they've
been up to, so without digging through the commit logs and figuring out what
they did, I can't tell you what the improvements are. Martin Nowak _was_
going to do a talk on some of that at DConf 2015, but he missed his flight,
and the talk never happened.

Work on marking things @nogc where appropriate in druntime and Phobos is
slowly coming along, but there's still plenty to do there. There's also been
a fair bit of work on taking functions that return strings and creating
alternate versions that return lazy ranges so that they don't have to
allocate. std.experimental.allocator is in place now, paving the way for a
lot of stuff not using the GC. There are all kinds of small things being
done, incrementally moving towards not using the GC when it's not actually
required for what the function is doing. But some classes of things are
always going to use the GC. And some stuff will need language improvements
in order to not need the GC. Exceptions are the main example: as it stands,
they pretty much require the GC. It's possible to use them without the GC,
but it's incredibly unsafe, because there is no standard mechanism in place
for handling their memory other than the GC. The result is that pretty much
anything using exceptions right now can't be @nogc, even if it doesn't use
the GC for anything but exceptions.
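
To make the lazy-range point concrete, here is a minimal sketch. The
function name sumOfEvenSquares is made up for illustration and is not from
Phobos. Because the pipeline is lazy and never builds an intermediate array,
the compiler should accept the @nogc attribute on it:

```d
import std.algorithm.iteration : filter, map, sum;
import std.range : iota;

// Hypothetical helper: sums the squares of the even numbers in 1 .. n.
// The range pipeline is lazy, so no intermediate array is allocated and
// the function can be marked @nogc.
int sumOfEvenSquares(int n) @nogc
{
    return iota(1, n + 1)
        .filter!(x => x % 2 == 0)
        .map!(x => x * x)
        .sum;
}

void main()
{
    // 4 + 16 + 36 + 64 + 100
    assert(sumOfEvenSquares(10) == 220);
}
```

If you stuck a call to std.array.array in the middle of that pipeline to get
a real array back, it would allocate on the GC heap, and the compiler would
reject the @nogc attribute.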

Because it was determined that stuff like std.typecons.RefCounted can't
actually be done in an @safe manner, Walter has done some work towards
adding @safe reference counting to the language for the cases where that
makes more sense than the GC (and that may or may not help fix the problem
with requiring the GC for exceptions). But in order to do that, he's been
doing a lot of work towards improving @safe in general, and who knows when
the ref-counting stuff will actually arrive.

So, various improvements have been made and continue to be made which
improve the GC, or reduce the need for the GC (or simply reduce the need for
heap allocation in general), or which provide alternatives to using the GC.
But there's still plenty to be done.

It's also possible to completely disable use of the GC in D, but you lose
out on a few features (and while the std lib doesn't use the GC heavily, it
does use it, so if you remove the GC from the runtime, you can't use
Phobos), so it's not particularly advisable. But you can get a _long_ way
just by being smart about your GC use. A number of D programmers have
managed to use D with full use of the GC in high performance code simply by
making sure that a collection cycle doesn't kick in during hot spots in the
program (e.g. by calling GC.disable when entering the hot spot and then
GC.enable when leaving). And for programs that need to do real-time stuff
and can't afford to have a particular thread be stopped by a GC collection,
you just use a thread that's not managed by the GC for that critical thread,
and it's able to keep going even if the rest of the program is temporarily
stopped by a collection.
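
A rough sketch of the GC.disable / GC.enable pattern, assuming a made-up
function hotSpot that stands in for whatever your performance-critical code
is. GC.disable, GC.enable and GC.collect are the real core.memory calls;
everything else is illustrative. Note that the runtime may still collect
while disabled if it decides it has to, e.g. when it runs out of memory:

```d
import core.memory : GC;

// Hypothetical hot loop; the name and the work it does are illustrative.
long hotSpot(int iterations)
{
    GC.disable();              // suppress automatic collections from here on
    scope (exit) GC.enable();  // re-enable on the way out, even on a throw

    long total = 0;
    foreach (i; 0 .. iterations)
    {
        auto buf = new int[](16); // still allocates from the GC heap
        buf[0] = i;
        total += buf[0];
    }
    return total;
}

void main()
{
    assert(hotSpot(1000) == 499_500);
    GC.collect(); // optionally collect where a pause is acceptable
}
```

Allocation still works inside the hot spot; the garbage just piles up until
collections are allowed again, so this only makes sense for sections that
don't allocate without bound.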

The reality of the matter though is that for the most part, the problem with
the GC and D is primarily a PR issue and not a practical one. A lot of folks
from C/C++ land freak out when they see that a GC is being used and just
assume that there are major efficiency problems. It _is_ true that if you
allocate stuff on the heap heavily and churn through objects such that you
keep getting the garbage collector to kick in, it's going to hurt the
performance of your program, but so is lots of allocating and deallocating
of heap objects in general, even if a GC isn't involved at all. But
idiomatic D doesn't use the heap anywhere near as much as many languages
tend to (e.g. structs on the stack are used much more heavily than classes
on the heap and lazy ranges are a huge win at avoiding a lot of heap
allocations; and while dynamic arrays do normally use the GC heap, the fact
that they can be sliced instead of having to be copied is a huge performance
win). So, while you _can_ get yourself in trouble with the GC, the vast
majority of programs really have no problem with it at all. And certain
classes of programs are actually faster using a GC than something like
reference counting (especially if a collection cycle is never actually
required).
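
As a small illustration of the slicing and struct points (Point is a
made-up type for the example), a slice is just a pointer and a length into
the existing array, so taking one doesn't allocate or copy anything:

```d
// A value type; declared locally, it lives on the stack, not the GC heap.
struct Point { int x, y; }

void main()
{
    int[] data = [10, 20, 30, 40, 50]; // one GC allocation for the array
    int[] middle = data[1 .. 4];       // slice: a view, no copy made

    assert(middle.length == 3);
    assert(middle.ptr == data.ptr + 1); // same memory as the original

    middle[0] = 99;
    assert(data[1] == 99);              // writes show through the slice

    auto p = Point(1, 2);               // no heap allocation
    assert(p.x + p.y == 3);
}
```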

So, while work is being done to make sure that the std lib doesn't use the
GC when it doesn't need to and to better enable idioms that don't require
the GC, the current situation actually works quite well. But some folks
just don't like the very idea of a GC and assume that the performance must
be terrible. Or they hear about cases where it was terrible and think that
that's the norm, when it isn't. Some code really needs to care about the GC
and avoid it as much as possible, and some code allocates so much stuff that
it totally shoots itself in the foot, but most code works just fine with the
GC.

- Jonathan M Davis


