Leave GC collection to the user of the D library?
Ali Çehreli
acehreli at yahoo.com
Sun May 9 03:25:06 UTC 2021
tl;dr I am scared of non-D programs calling my D library functions from
foreign threads. So, I am planning on asking the user to trigger
collection themselves by calling a collection function of my library. Crazy?
I've had serious issues bringing up a D library in a foreign
environment: Python modules loading a C++ library, which in turn uses
our D library. There were segmentation faults when loading this .so.
One workaround was to start the GC in disabled state with the following
global variable defined in the library.
extern(C) __gshared string[] rt_options = [ "gcopt=disable:1" ];
After that, the GC is enabled inside the library's initialization
function with the following command.
GC.enable();
That workaround seemed to be sufficient to load the library
successfully. Unfortunately, that was not enough to weed out all issues
related to libraries because this library itself loads other D
libraries. All of this caused sporadic issues. (My brain is too fried to
even remember what was a cause, what was a usable workaround, etc.
Sometimes I wasted days chasing a solution while using a test, which had
nothing to do with the solution. I would change the code, test, no go;
repeat, no go. It turns out, my test was unrelated. Argh!)
So, we came up with a drastic solution: Since all this code works just
fine in a pure D environment, make the library as thin as possible; the
library starts a daemon that is written in D with all the functionality.
The library merely dispatches the requests to that daemon.
The library starts the daemon with pipeProcess(); pipes are used for
dispatching requests and shared memory is used for large data. This idea
"worked like a charm." Phew!
However, requests are dispatched to the daemon by a single library
thread in a blocking manner: When a request is written to the pipe, the
thread blocks reading the response and then returns the result to the
user of the library function.
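
To make the blocking round trip concrete, here is a minimal sketch, not the actual library code. The Dispatcher type and the one-line-per-request protocol are assumptions made up for illustration, and "cat" merely stands in for the real daemon because it echoes every line it receives (POSIX only).

```d
import std.process : pipeProcess, ProcessPipes, Redirect, wait;
import std.string : chomp;

// Hypothetical dispatcher: writes one request line, blocks for one response line.
struct Dispatcher
{
    ProcessPipes pipes;

    this(string[] daemonCommand)
    {
        // Redirect the child's stdin and stdout so we can talk to it.
        pipes = pipeProcess(daemonCommand, Redirect.stdin | Redirect.stdout);
    }

    string request(string line)
    {
        pipes.stdin.writeln(line);
        pipes.stdin.flush();                 // make sure the daemon sees the request
        return pipes.stdout.readln().chomp;  // blocks until a full line comes back
    }

    void shutdown()
    {
        pipes.stdin.close();                 // EOF makes the stand-in daemon exit
        wait(pipes.pid);
    }
}

void main()
{
    // Assumption: "cat" plays the daemon and echoes each request unchanged.
    auto d = Dispatcher(["cat"]);
    assert(d.request("ping") == "ping");
    d.shutdown();
}
```

Because request() writes and then reads on the same pair of pipes, two threads calling it at the same time could interleave their requests and responses, which is why a single thread does the dispatching here.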
So now we want to use this functionality from multiple threads. Yikes! I
had so much trouble with foreign threads calling D libraries in the past
that I get scared. (In one case it was Java threads.) There are so many
dimensions to play with that hypothesizing a correct solution has been
exhausting. I was never sure whether the issues were with, e.g.,
thread_attachThis() itself or with my misuse of it.
Ok... How about this idea that would allow this library to be used from
multiple threads: Leave the GC disabled with that 'rt_options' variable
above and don't enable it in the library initialization function (this
is not init(); rather, a function that the user calls explicitly).
Instead, add yet another library API function for collecting garbage. I
can document that no other thread is allowed to call any other function
of the library when this collection function is called. They can do this
either at strategic points that they know no other thread is using the
library or they can use a mutex.
Another trivial function that I add can relay GC stats to the user so
that they can decide to call the GC if the allocations have been high
enough.
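
A minimal sketch of what those two extra API functions might look like. The names mylib_gc_used_bytes and mylib_gc_collect are hypothetical, and the "no other thread is inside the library" rule is only a documented contract that the code below does not enforce.

```d
import core.memory : GC;

// Start with the GC disabled and never enable it (same variable as above).
extern(C) __gshared string[] rt_options = [ "gcopt=disable:1" ];

// Hypothetical API: number of bytes currently in use on the GC heap,
// so that the caller can decide when a collection is worthwhile.
extern(C) size_t mylib_gc_used_bytes()
{
    return GC.stats().usedSize;
}

// Hypothetical API: run one explicit collection.
// Contract (not enforced here): no other thread may be executing any
// other function of the library while this call is in progress.
extern(C) void mylib_gc_collect()
{
    GC.collect();
}

void main()
{
    auto junk = new ubyte[](1 << 20);   // allocate 1 MiB on the GC heap
    assert(mylib_gc_used_bytes() > 0);
    junk = null;
    mylib_gc_collect();                 // explicit request from the "user"
}
```

The user would poll mylib_gc_used_bytes() and call mylib_gc_collect() at a quiet point or under their own mutex.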
This would allow the user to start as many foreign threads as they want.
Right? Is this sane? Is collection the only issue here? Do foreign
threads still need to call thread_attachThis()? What happens if they don't?
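
For reference, this is roughly what attaching a foreign thread inside an entry point looks like. It is only a sketch of the mechanics, not an answer to the questions above: mylib_do_work is a hypothetical function, and whether to detach on every call or only when the thread ends is an open design choice.

```d
import core.thread : Thread, thread_attachThis, thread_detachThis;

// Hypothetical entry point that may be called from a thread the D runtime
// did not create.
extern(C) int mylib_do_work(int x)
{
    // Thread.getThis() returns null when the calling thread is not
    // attached to the D runtime.
    const attachedHere = Thread.getThis() is null;
    if (attachedHere)
        thread_attachThis();

    // Detach again on the way out, but only if this call did the attaching.
    scope (exit)
        if (attachedHere)
            thread_detachThis();

    return x * 2;   // placeholder for the real work
}

void main()
{
    // Called from the main D thread here, so no attach or detach happens.
    assert(mylib_do_work(21) == 42);
}
```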
I feel so hopeless that in the past, I even thought about and
experimented with banning the user from starting threads on their own.
Rather, they would call a POSIX-compatible thread API in my library and
create their threads through it. Each such thread would be a D thread, so
no thread would be a "foreign thread" and everything would work just fine.
I haven't deployed this crazy idea (yet).
Ali