Some questions about GC

Jonathan M Davis newsgroup.d at jmdavisprog.com
Sat Oct 19 05:02:26 UTC 2019


On Friday, October 18, 2019 10:54:55 AM MDT Roland Hadinger via Digitalmars-
d-learn wrote:
> These questions probably need some context: I'm working on an
> interpreter that will manage memory via reference counted struct
> types. To deal with the problem of strong reference cycles
> retaining memory indefinitely, weak references or recursive
> teardowns have to be used where appropriate.
>
> To help detect memory leaks from within the interpreter, I'd also
> like to employ core.memory.GC in the following fashion:
>
> * keep core.memory.GC off (disabled) by default, but nonetheless
> allocate objects from GC memory
> * provide a function that can find (and reclaim) retained
> unreachable object graphs that contain strong reference cycles
> * the main purpose of this function is to find and report such
> instances, not to reclaim memory. Retained graphs should be
> reported as warnings on stderr, so that the program can be fixed
> manually, e.g. by weakening some refs in the proper places
> * the function will rely on GC.collect to find unreachable objects
> * the function will *always* be called implicitly when a program
> terminates
> * the function should also be explicitly callable from any point
> within a program.
>
> Now my questions:
>
> Is it safe to assume that a call to GC.collect will be handled
> synchronously (and won't return early)?

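D's GC is a stop-the-world GC. Every thread managed by the GC is stopped
while a thread runs a collection, and the collection is done inside the call
that triggered it, so GC.collect shouldn't return before it has finished.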

> Is there a way to ensure that GC.collect will never run unless
> when called explicitly (even in out of memory situations)?

The GC only runs a collection either when you explicitly tell it to or when
you try to allocate memory using the GC, and it determines that it should
run a collection. Disabling the GC normally prevents a collection from
running, though per the documentation, it sounds like it may still run if
the GC actually runs out of memory. I had thought that it prevented
collections completely, but that's not what the documentation says. I don't
know what the current implementation does.
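
A minimal sketch of how one could check this on a given druntime, assuming
a version that provides GC.profileStats (the helper name collectionsDuring
is made up for illustration). It counts collections while allocating with
the GC disabled and then during an explicit GC.collect. What it prints says
nothing about the out-of-memory case, which is the one the documentation
leaves open.

```d
import core.memory : GC;
import core.stdc.stdio : printf;

// Hypothetical helper: number of collections the GC ran while `work` executed.
size_t collectionsDuring(scope void delegate() work)
{
    const before = GC.profileStats().numCollections;
    work();
    return GC.profileStats().numCollections - before;
}

void main()
{
    GC.disable();             // suppress automatic collections
    scope (exit) GC.enable(); // every disable should be paired with an enable

    // Allocating while disabled: per the documentation a collection may
    // still happen if memory runs out, so this count is not guaranteed to be 0.
    const duringAlloc = collectionsDuring(delegate() {
        foreach (i; 0 .. 10_000)
        {
            auto chunk = new ubyte[](1024);
        }
    });

    // An explicit request is still honored while the GC is disabled.
    const duringCollect = collectionsDuring(delegate() { GC.collect(); });

    printf("collections during allocation: %zu, during GC.collect: %zu\n",
           duringAlloc, duringCollect);
}
```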

> Is it possible and is it OK to print to stderr while the GC is
> collecting (e.g. from @nogc code, using functions from
> core.stdc.stdio)?

No code in any thread managed by the GC is run while a collection is running
unless it's code that's triggered by the collection itself (e.g. a finalizer
being called on an object that's being collected - and even that isn't
supposed to access GC-allocated objects, because the GC might have already
destroyed them - e.g. in the case of a cycle). If you want code to run at the
same time as a GC collection, it's going to have to be in a thread that is
not attached to the GC, and at that point, you shouldn't be accessing
_anything_ that's managed by the GC unless you have a guarantee that what
you're accessing won't be collected. And even then, you shouldn't be
mutating any of it.

Also, @nogc doesn't say anything about whether the code accesses
GC-allocated objects. It just means that it's not allowed to access most GC
functions, which usually just means that it doesn't allocate anything using
the GC and that it doesn't risk running a collection. So, just because a
function is @nogc doesn't necessarily mean that it's safe to run it from a
thread that isn't managed by the GC while a collection is running.
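
To illustrate the point about @nogc, here is a small sketch (the function
name sumAndReport is hypothetical). The function is @nogc and prints through
core.stdc.stdio, yet it reads memory that the GC owns, so the attribute on
its own says nothing about whether it's safe to run during a collection.

```d
import core.stdc.stdio : fprintf, stderr;

// @nogc forbids GC allocation and calls to non-@nogc functions.
// It does not forbid reading or writing memory that the GC allocated.
@nogc nothrow size_t sumAndReport(const(int)[] data)
{
    size_t total = 0;
    foreach (x; data)
        total += x;
    fprintf(stderr, "sum = %zu\n", total);
    return total;
}

void main()
{
    auto gcOwned = new int[](4); // allocated by the GC
    gcOwned[] = 1;
    sumAndReport(gcOwned);       // compiles fine: @nogc code touching GC memory
}
```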

> Could I implement my function by introducing a shared global flag
> which is set prior to calling GC.collect and reset afterwards, so
> that any destructor can determine whether it has been invoked by a
> "flagged" call to GC.collect and act accordingly?

You should be able to do that, but then the destructor can't be pure (though
as I understand it, there's currently a compiler bug with pure destructors
anyway which causes them to not be called), and when a destructor is run as
a finalizer, it shouldn't be accessing any other GC-allocated objects,
because the GC might have actually destroyed them already at that point.
Finalizers really aren't supposed to do much of anything other than
managing what lives in an object directly or managing non-GC-allocated
resources. Regardless, anything that really should be operating as a
destructor rather than a finalizer has to live on the stack, since
finalizers won't be run until a collection occurs. If you're explicitly
running them yourself via your own reference counting, then you don't have
that problem, but if there's any chance that a destructor is going to be run
as a finalizer by the GC, then you have to write your destructors /
finalizers with the idea that that could happen.
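
A rough sketch of the flag idea under those constraints. The names
leakCheckRunning, Node and reportLeaks are made up for illustration and this
is not a vetted design. The finalizer only reads the flag and a field stored
in the object itself, allocates nothing, and doesn't follow references to
other GC-allocated objects. It can't be pure, because it reads a global.

```d
import core.atomic : atomicLoad, atomicStore;
import core.memory : GC;
import core.stdc.stdio : fprintf, stderr;

// Hypothetical flag: true only while the leak check's GC.collect is running.
shared bool leakCheckRunning;

class Node
{
    size_t id;      // lives directly in the object, so the finalizer may read it
    Node next;      // GC-allocated: must NOT be touched from the finalizer

    this(size_t id) { this.id = id; }

    // Not pure (reads a global). No allocation and no access to `next`.
    ~this() @nogc nothrow
    {
        if (atomicLoad(leakCheckRunning))
            fprintf(stderr, "warning: node %zu was only reclaimed by the leak check\n", id);
    }
}

// Hypothetical entry point for the leak report.
void reportLeaks()
{
    atomicStore(leakCheckRunning, true);
    scope (exit) atomicStore(leakCheckRunning, false);
    GC.collect();
}

void main()
{
    reportLeaks();
}
```

Whether a given unreachable object is actually finalized by a particular
collection isn't guaranteed (the GC is conservative), so a report like this
can miss leaks.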

> Alternatively: do I need to implement such a flag, or is there
> already a way in which a destructor can determine whether it has
> been invoked by the GC?
>
> Thanks for any help!

Honestly, the way things are set up, destructors aren't supposed to know or
care about whether they're being run by the GC as a finalizer. So, the GC
isn't going to provide that kind of functionality. What you're looking to do
is pretty much a giant hack from the perspective of the GC and likely to be
pretty dangerous to attempt. I suspect that what would make a lot more sense
would be to create a custom build of druntime to run which specifically
printed out what wasn't freed when the program shut down rather than trying
to hack around how the GC works. Alternatively, you could just ditch the GC
entirely and then use valgrind to see what didn't get freed to catch cycles
(or other screw-ups that resulted in memory not being freed).

Having the GC take care of cycles for you isn't necessarily a problem, but
having the GC report on what's alive or not is tricky business, particularly
since it's supposed to keep anything that the program still has access to
alive.

Another thing to consider is that some language features outright require
the GC (e.g. closures and anything with dynamic arrays involving
allocation), and if you truly don't want to use the GC for that stuff, it's
probably going to be easier to require that your program not use the GC at
all than to try to have it just manage cycles.
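
One way to find out which parts of a program depend on the GC is to mark
code @nogc and let the compiler reject the allocating operations. A small
sketch (sumFixed is a hypothetical name), using __traits(compiles) so that
the file itself still builds:

```d
// Operations that allocate with the GC are rejected in @nogc code at compile time.
static assert(!__traits(compiles, () @nogc { int[] a; a ~= 1; }));        // appending
static assert(!__traits(compiles, () @nogc { auto a = new int[](4); })); // new

// A fixed-size array lives on the stack, so this is fine in @nogc code.
@nogc nothrow int sumFixed()
{
    int[4] buf = [1, 2, 3, 4];
    int total = 0;
    foreach (x; buf)
        total += x;
    return total;
}

void main()
{
    assert(sumFixed() == 10);
}
```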

Regardless, if you really want to go forward with something like you're
proposing here, you'll probably need to get answers from one of the few GC
experts around here.

- Jonathan M Davis




