Idiomatic D using GC as a library writer
Adam D Ruppe
destructionator at gmail.com
Sun Dec 4 23:25:34 UTC 2022
On Sunday, 4 December 2022 at 22:46:52 UTC, Ali Çehreli wrote:
> That's way beyond my pay grade. Explain please. :)
The reason that the GC stops threads right now is to ensure that
something doesn't change in the middle of its analysis.
Consider, for example: the GC scans addresses 0-1000 and finds
nothing. Then a running thread moves a reference from memory
address 2200 down to address 800 while the GC is scanning
1000-2000.
Then the GC scans 2000-3000, where the object used to be, but it
isn't there anymore... and the GC has no clue it needs to scan
address 800 again. It, never having seen the object, thinks the
object is just dead and frees it.
Then the thread tries to use the object, leading to a crash.
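To make the race concrete, here is a minimal sketch of that sequence as a toy model in Python. It is not druntime code: the dict standing in for memory, the `scan` helper and the object name are all made up for illustration, and the "thread" is just a statement placed between two scan steps.

```python
# Toy model: "memory" maps an address to the object stored there.
# All names are hypothetical; this is not how druntime is implemented.

def scan(memory, start, end, marked):
    """Mark every object referenced from an address in [start, end)."""
    for addr, obj in memory.items():
        if start <= addr < end:
            marked.add(obj)

def collect_with_racing_mutator():
    memory = {2200: "obj"}            # the only reference to "obj" is at 2200
    marked = set()

    scan(memory, 0, 1000, marked)     # GC: finds nothing in 0-1000

    # A running thread acts while the GC is busy with 1000-2000:
    memory[800] = memory.pop(2200)    # the reference moves from 2200 to 800

    scan(memory, 1000, 2000, marked)  # GC: nothing here either
    scan(memory, 2000, 3000, marked)  # GC: the reference has already left

    still_referenced = set(memory.values())
    wrongly_freed = still_referenced - marked
    return marked, wrongly_freed

marked, wrongly_freed = collect_with_racing_mutator()
print("marked:", marked)
print("wrongly freed:", wrongly_freed)
```

The object is still referenced from address 800, but the scan never marked it, so a collector working this way would free it.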
The current implementation prevents this by stopping all threads.
If nothing is running, nothing can move objects around while the
GC is trying to find them.
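In the same toy style, stopping the world can be sketched by queuing whatever the paused threads wanted to do and running it only once the scan is complete. The queue of pending mutations is an assumption of this model, standing in for suspended threads; the real runtime suspends threads rather than queuing callbacks.

```python
# Toy model of stop-the-world: mutations from "paused threads" are held
# in a list and only run after the whole scan has finished.
# All names are hypothetical.

def scan(memory, start, end, marked):
    """Mark every object referenced from an address in [start, end)."""
    for addr, obj in memory.items():
        if start <= addr < end:
            marked.add(obj)

def collect_stop_the_world(memory, pending_mutations):
    marked = set()
    for start in (0, 1000, 2000):
        # Nothing can run between these scans: the world is stopped.
        scan(memory, start, start + 1000, marked)
    for mutate in pending_mutations:
        mutate(memory)                # threads resume after the collection
    return marked

def move_reference(memory):
    """What the paused thread wanted to do: move the reference 2200 -> 800."""
    memory[800] = memory.pop(2200)

memory = {2200: "obj"}
marked = collect_stop_the_world(memory, [move_reference])
print("marked:", marked)
print("memory after resume:", memory)
```

Here the object is found at 2200 during the scan, and the move happens only afterwards, so nothing is lost.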
But actually stopping everything 1) requires that the GC know which
threads exist and have a way to stop them, and 2) is overkill!
All it really needs to do is prevent certain operations that
might change the GC's analysis while it is running, like what
happened in the example. It isn't important to stop numeric work,
since that won't change what the GC sees. It isn't important to stop
pointer reads either (well, not in D's GC anyway; some other
collectors do need to stop them).
Since what the GC cares about is where pointers are stored, it is
possible to hook pointer writes specifically. These hooks are called
write barriers; they either block pointer writes or at least notify
the GC about them. (And btw, not all pointer writes need to be blocked
either, just the ones that would make the pointer refer to a different
memory block. So things like slice iteration can also be allowed to
continue. More on my blog:
http://dpldocs.info/this-week-in-d/Blog.Posted_2022_10_31.html#thoughts-on-pointer-barriers )
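The "different memory block" test could look roughly like the sketch below. The block table, the addresses and the names `block_of` and `needs_barrier` are assumptions made up for this example; a real collector would consult its own pool metadata, and the blog post may describe the check differently.

```python
import bisect

# Hypothetical block table: (start address, size) of each allocated
# block, sorted by start address.
BLOCKS = [(0x1000, 64), (0x2000, 256)]
STARTS = [start for start, _ in BLOCKS]

def block_of(ptr):
    """Return the index of the block containing ptr, or None if there is none."""
    i = bisect.bisect_right(STARTS, ptr) - 1
    if i >= 0:
        start, size = BLOCKS[i]
        if start <= ptr < start + size:
            return i
    return None

def needs_barrier(old_ptr, new_ptr):
    """A pointer write matters to the GC only if the block it refers to changes."""
    return block_of(old_ptr) != block_of(new_ptr)

# Advancing a pointer inside one block, as slice iteration does:
print(needs_barrier(0x2000, 0x2008))   # same block, the write can proceed
# Retargeting the pointer at another block:
print(needs_barrier(0x2000, 0x1000))   # different block, the barrier applies
```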
So what happens then:
GC scans address 0 - 1000 and finds nothing.
Then a running thread moves a reference from memory address 2200
down to address 800... which would trigger the write barrier. The
thread isn't allowed to complete this operation until the GC is
done. Notice that the GC didn't have to know about this thread
ahead of time, since the running thread is responsible for
communicating its intentions to the GC as it happens.
(Essentially, the GC holds a mutex and all pointer writes in
generated D code are synchronized on it, though there are various
ways to implement this.)
Then the GC scans 2000-3000, and the object is still there since
the write is paused! It doesn't free it.
The GC finishes its work and releases the barriers. The thread
now resumes and finishes the move, with the object still alive
and well. No crash.
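The mutex variant of this walkthrough can be sketched with real threads. Everything here is a made-up model: `gc_lock` stands in for the GC mutex, `barrier_write` for the check a compiler would emit around pointer writes, and the two events exist only to force the interleaving from the example so the demo is repeatable.

```python
import threading

def run_demo():
    memory = {2200: "obj"}                 # toy memory: address -> object
    marked = set()
    log = []
    gc_lock = threading.Lock()             # hypothetical GC mutex
    first_chunk_done = threading.Event()   # demo-only sequencing
    write_attempted = threading.Event()    # demo-only sequencing

    def barrier_write(dst, src):
        write_attempted.set()              # the thread has reached its pointer write
        with gc_lock:                      # blocks while a collection is running
            memory[dst] = memory.pop(src)
            log.append("write done")

    def mutator():
        first_chunk_done.wait()            # start once the GC has scanned 0-1000
        barrier_write(800, 2200)

    def scan(start, end):
        for addr, obj in list(memory.items()):
            if start <= addr < end:
                marked.add(obj)

    thread = threading.Thread(target=mutator)
    thread.start()
    with gc_lock:                          # collection begins
        scan(0, 1000)
        first_chunk_done.set()
        write_attempted.wait()             # the write is now pending on the lock
        scan(1000, 2000)
        scan(2000, 3000)                   # the object is still at 2200
        log.append("gc done")
    thread.join()                          # the write completes after the GC
    return marked, memory, log

marked, memory, log = run_demo()
print(marked, memory, log)
```

Note that the collector never looks up or suspends the other thread; the thread runs into the lock by itself when it tries to write a pointer.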
This would be a concurrent GC, not stopping threads that are
doing self-contained work, but it would also be more compatible
with external threads, since any thread, no matter where it came
from, would go through that GC mutex barrier.