What's the go with the GC these days?

H. S. Teoh hsteoh at quickfur.ath.cx
Sun Jan 6 04:34:30 UTC 2019


On Sat, Jan 05, 2019 at 11:12:52PM +0000, Neia Neutuladh via Digitalmars-d wrote:
> On Sat, 05 Jan 2019 14:05:19 -0800, Manu wrote:
> > I'm somewhere between a light GC user and a @nogc user, and I don't
> > really know much about where we're at, or much about
> > start-of-the-art GC in general.
> 
> I use the GC unabashedly and only try to make sure I reuse memory when
> it's reasonably convenient. I've also looked into GC a bit.

I also use the GC freely, and only bother with GC optimization when my
profiler shows that there's an actual problem.

As I've said numerous times before, unless you're working on extremely
time-sensitive code like real-time applications or 3D game engines, D's
GC usually does not make a noticeable difference. The exceptions are
extreme patterns like allocating tens of millions of small strings (or
other small objects) per second, or allocating huge objects rapidly
while expecting unreferenced memory to be reused quickly.

And when the GC does start becoming a problem, there are often easy
first-stab solutions that suffice for most cases. The most obvious is
to call GC.disable and then schedule GC.collect yourself, at times more
convenient than whenever the runtime would otherwise trigger a
collection. This requires the least code change, and IME often yields
quite good improvements.
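A minimal sketch of that approach (the function name, iteration counts,
and collection interval are illustrative, not from the post):

```d
import core.memory : GC;

// Run `work` many times with automatic collections suppressed,
// collecting only at an interval we choose ourselves.
void runWithScheduledGC(void delegate() work, size_t iterations,
                        size_t interval)
{
    GC.disable();               // no implicit collections on allocation
    scope (exit) GC.enable();   // restore normal behavior even on throw

    foreach (i; 0 .. iterations)
    {
        work();
        // Explicit GC.collect() still works while automatic collection
        // is disabled -- disable() only suppresses implicit runs.
        if ((i + 1) % interval == 0)
            GC.collect();
    }
}
```

The scope(exit) guard ensures the GC is re-enabled even if the work
delegate throws.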

Past that, the next most obvious step is to reduce GC pressure by
reusing frequently-allocated objects -- standard advice for improving GC
performance, y'know.  This requires a bit more work: identify code
hotspots with a profiler and examine which objects are allocated most
frequently, then use stdx.allocator to allocate from a pool instead, or
just retain the object in a cache and reuse it the next time round,
etc..
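One simple form of the retain-and-reuse idea, sketched with hypothetical
names (a scratch buffer that a hot loop hands out repeatedly instead of
allocating a fresh array per call):

```d
// A reusable scratch buffer: allocation happens only when a request
// exceeds the largest size seen so far.
struct Scratch
{
    private ubyte[] buf;

    ubyte[] acquire(size_t n)
    {
        if (buf.length < n)
            buf.length = n;     // grows (allocates) only on a new maximum
        return buf[0 .. n];     // otherwise reuses the existing memory
    }
}
```

After the first call large enough for the workload, subsequent acquire()
calls allocate nothing at all.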

If the GC is still an issue after this, look into preallocating objects
outside your main loop, so that you can control exactly when GC pauses
happen.  Again, standard advice for GC optimization.

Past this point, if GC remains a big issue, you could start pulling out
@nogc and using malloc/free, etc.. (Though I might do malloc/free in the
previous step if I know there are big objects I'm gonna need, and they
have straightforward lifetimes that are easy to track -- this reduces
the size of the GC heap, and thereby also improves collection
performance.)
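A minimal sketch of taking a large, easy-to-track buffer off the GC heap
with malloc/free (function names are illustrative):

```d
import core.stdc.stdlib : malloc, free;

// A slice over C-heap memory: the GC neither scans nor frees it, so it
// adds nothing to the GC heap size or to collection times.
@nogc nothrow
double[] allocDoubles(size_t n)
{
    auto p = cast(double*) malloc(n * double.sizeof);
    return p is null ? null : p[0 .. n];
}

@nogc nothrow
void freeDoubles(ref double[] buf)
{
    free(buf.ptr);
    buf = null;     // drop the now-dangling slice
}
```

Note that if such a buffer ever stores pointers into the GC heap, it has
to be registered with GC.addRange, as the quoted post mentions.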

//

As far as improving the GC itself is concerned, it will surely be nice,
and an overall win for D, and certainly we shouldn't delay on doing
this. But I don't think it's a life-or-death problem that we must fix
Right Here And Now Or Else(tm).


> > How much truth is in here?
> 
> D uses a simple GC. Simple means easy to reason about. It's also got
> better tools to stress the GC less.

Yeah, I find that my recent, more idiomatic D code uses ranges with lazy
evaluation a lot more than plain arrays. What I'd normally allocate as
arrays in equivalent C/C++ code often involves no allocation (or
significantly less allocation) in D, thanks to lazy evaluation.  The one
place where I still tend to allocate more is string manipulation, but
even there, using slices instead of copying substrings as in C/C++
reduces the actual number of allocations.
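Both points can be sketched briefly (function names are illustrative):

```d
import std.algorithm : filter, map, sum;
import std.range : iota;

// filter and map are lazy: no intermediate arrays are built, each
// element flows through the pipeline only when sum() pulls it.
long sumOfEvenSquares(int limit)
{
    return iota(1, limit)
        .filter!(n => n % 2 == 0)
        .map!(n => cast(long) n * n)
        .sum;
}

// Slicing likewise returns a view into the original string rather than
// a copy, so extracting a substring allocates nothing.
string keyOf(string line)
{
    import std.string : indexOf;
    auto i = line.indexOf('=');
    return i < 0 ? line : line[0 .. i];
}
```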

The situation is significantly different from GC-heavy languages like
Java, where basically every non-trivial type requires an allocation,
with few or no ways of bypassing it. The absence of by-value aggregates
in Java causes average code to allocate far more than the equivalent D
code, and the heavy OO emphasis often leads to many indirections --
interfaces, vtables -- which frequently entail allocating adaptor
objects, wrappers, etc..  So in Java, GC performance plays a much larger
role in overall performance, whereas in D, the prevalence of by-value
types like ranges, and of passing small parcels of information as
structs rather than classes, keeps GC pressure much lower, so GC
performance is not as significant.

(Java implementations can optimize away some allocations when object
lifetimes can be statically determined -- HotSpot's JIT does this via
escape analysis and scalar replacement. Still, Java code does tend to be
more allocation-heavy than typical D code.)
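The by-value point can be sketched in two lines (names illustrative): a
struct like this is passed and returned as a few bytes on the stack, so
no GC allocation occurs anywhere -- exactly the kind of type Java lacks.

```d
struct Point { double x, y; }   // by-value aggregate: no heap, no GC

// Copies in, copies out; zero allocations.
Point midpoint(Point a, Point b)
{
    return Point((a.x + b.x) / 2, (a.y + b.y) / 2);
}
```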


> But one thing it could get that would be interesting is a thread-local
> GC.

Yes, yes, and yes!  This would allow per-thread segregation of the GC
heap, which would allow much better control of GC pauses.


> (You'd have a stop-the-world phase that happened infrequently.
> Casting something to shared would pin it in the thread it was
> allocated from.)

How does this solve the problem of shared, though?  The last time I
checked, casting to/from shared is the main showstopper for a
thread-local GC.
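The problematic pattern, sketched (illustrative):

```d
// Memory allocated on one thread's heap escapes to other threads via a
// cast. A per-thread collector would have to pin the block in (or
// migrate it out of) the owning thread's heap -- and it cannot, in
// general, see the cast happen.
size_t escapeToShared()
{
    int[] local = new int[](100);    // would live in this thread's heap

    // Now legally visible to any thread:
    shared(int[]) escaped = cast(shared) local;

    return escaped.length;
}
```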


[...]
> > Is progress possible, or is the hard reality that the language is
> > just designed such to be resistant to a quality GC, while the
> > ecosystem sadly tends to rely on it?
> 
> Three things about D make it harder to make a good GC for it:
> * unaligned pointers
> * unions
> * externally allocated memory (malloc and friends)
> 
> We've pretty much addressed malloc by telling people to manually add
> and remove malloced memory from what the GC scans. A union is pretty
> much just a pointer that might not be valid. Unaligned pointers just
> kind of suck.

Unaligned pointers are generally just a bad idea IMO.  I'm tempted to
say they should be defined as UB, along with obscured pointers (like the
XOR trick for storing a doubly-linked list with only one pointer field
per node). Maybe with a compiler / runtime switch to enable a more
conservative GC.

Now that I think of it, we could deal with pointers in unions the same
way -- if the compiler detects it, then trigger conservative mode in the
GC.

With these two out of the way, a generational GC for D seems closer to
the realm of possibility.


T

-- 
"How are you doing?" "Doing what?"
