Threading and the Garbage handler mess.
Alan Knowles
alan at akbkhome.com
Sat Sep 6 08:10:48 PDT 2008
If anyone noticed, this week I posted a few questions, and then a few bits
of code, relating to threading and the GC.
Between tearing my hair out and generally verging on total madness, this
week I've begun to get to grips with the GC in a threaded environment.
Basically this statement is true for both phobos and tango (although
I've examined the phobos GC far more than tango's)
-------------------------------------------------
"THE GC AS DELIVERED IS COMPLETELY BROKEN FOR REAL THREADED APPLICATIONS,
and you should not try to write threaded applications without expecting
to do a large amount of hacking to the GC"
-------------------------------------------------
Background - HOW THE GC WORKS:
-------------------------------------------------
Let's first explain the basic ideas behind the garbage collector:
Since most things in D land are malloc'd via the GC, the GC knows all
about the memory you have allocated. It can also look at the stack
(currently running memory, e.g. the ints/pointers that you have created in
your methods).
Basically what it does is this:
- loops through all the memory it knows about, and sees if any of it points
to any of the memory it has allocated.
- if a block it has allocated is not referenced by any of the memory it
knows about, then it's dead meat, and can be returned to the free list
so it can be made available to any 'new' malloc() call.
There is quite a bit more in there, like a really nice design of size
buckets with linked lists of free blocks filling them, enabling quick
allocation depending on what size of memory is requested.
Anyway, I hope you get the point.
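The two steps above can be sketched as a toy mark-and-sweep in C. This is only an illustration of the idea, not the phobos or tango code: the fixed-size heap, the explicit root list and all the names (gc_alloc, gc_collect, struct block) are made up for the example, and the real collector finds its roots by scanning the stack and static data rather than being handed them.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_BLOCKS 8
#define MAX_REFS 2

/* One GC-managed block: it may point at up to MAX_REFS other blocks. */
struct block {
    int in_use;                   /* handed out and not yet reclaimed */
    int marked;                   /* reached from a root in this collection */
    struct block *refs[MAX_REFS];
};

static struct block heap[HEAP_BLOCKS];

/* Hand out the first free block, or NULL when the toy heap is full. */
static struct block *gc_alloc(void) {
    for (size_t i = 0; i < HEAP_BLOCKS; i++) {
        if (!heap[i].in_use) {
            heap[i].in_use = 1;
            heap[i].marked = 0;
            for (size_t r = 0; r < MAX_REFS; r++) heap[i].refs[r] = NULL;
            return &heap[i];
        }
    }
    return NULL;
}

/* Mark phase: follow pointers from one block to everything it can reach. */
static void mark(struct block *b) {
    if (b == NULL || b->marked) return;
    b->marked = 1;
    for (size_t r = 0; r < MAX_REFS; r++) mark(b->refs[r]);
}

/* Mark from the roots, then reclaim every block that was not reached.
   Returns the number of blocks reclaimed. */
static int gc_collect(struct block **roots, size_t nroots) {
    int freed = 0;
    assert(roots != NULL || nroots == 0);
    for (size_t i = 0; i < HEAP_BLOCKS; i++) heap[i].marked = 0;
    for (size_t i = 0; i < nroots; i++) mark(roots[i]);
    for (size_t i = 0; i < HEAP_BLOCKS; i++) {
        if (heap[i].in_use && !heap[i].marked) {
            heap[i].in_use = 0;   /* dead meat: free for the next gc_alloc */
            freed++;
        }
    }
    return freed;
}
```

A block that is only referenced from memory the collector never looks at (a C library's private data, say) is indistinguishable from garbage here, which is the root of the problems below.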
The problems..
-------------------------------------------------
A) CROSS THREAD FREEs ARE EXTREMELY DANGEROUS
The current GC implementations free across threads both explicitly with
delete, and implicitly with genCollect(), which is just downright dangerous.
Consider this example - the const char Gotcha
.......................................
extern (C) {
    void* mylib_new();
    void mylib_dosomething(void* x, const char* z);
}
... d code ...
void risky_function(void* mylib, char[] a) {
    // mylib came from mylib_new(); the library keeps the pointer it is given
    mylib_dosomething(mylib, a.toStringz());
}
.......................................
In the above code I've left const char* in there, as that's what's in
the C headers, but it is not actually in the D code.
What is critical about this code is that mylib is still active, and is
holding on to the 'char* z' it was passed from D in risky_function().
The garbage collector can only look at memory it knows about to find
pointers to 'char* z', and mylib's memory is not part of that, so as far
as the GC is concerned z is ready to be collected. This problem can
easily crop up with the libc functions that expect const char*'s and
store the pointer: another call may query it later, and find the data has
been overwritten with something completely different (imagine function
pointers!!)
The only workaround for this is to store the return value of any
toStringz() call that is passed into a const char* C call, and addRoot() it.
Even this may not help, as you can still accidentally delete these objects.
Normally all of this is not a common problem in a non-threaded
application, as you would probably just run genCollect() after you
have finished dealing with mylib, and everything will get cleaned up
nicely.
Unfortunately with a multi-threaded application, if you run genCollect()
at the wrong time (e.g. in another thread), you will destroy the char*
that was supposed to be constant...
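To make the failure concrete, here is a small C model of it. Everything in it is an assumption for illustration: the "GC pool" is a bump allocator whose collection frees everything (standing in for a collector that finds no reference to the string), to_stringz is a toy version of toStringz(), and mylib_query() is a hypothetical second library call that reads back the stored pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a GC pool: a bump allocator. */
static char pool[64];
static size_t pool_used = 0;

static char *pool_alloc(size_t n) {
    assert(pool_used + n <= sizeof pool);
    char *p = pool + pool_used;
    pool_used += n;
    return p;
}

/* "Collection": nothing the collector can see points into the pool,
   so every block is treated as garbage and the space is reused. */
static void pool_collect(void) {
    pool_used = 0;
}

/* Toy toStringz: copy into pool memory and NUL-terminate. */
static char *to_stringz(const char *s, size_t len) {
    char *copy = pool_alloc(len + 1);
    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}

/* Hypothetical C library: it keeps the pointer instead of copying the text. */
static const char *mylib_saved;

static void mylib_dosomething(void *x, const char *z) {
    (void)x;                      /* library handle, unused in this model */
    mylib_saved = z;
}

static const char *mylib_query(void) {
    return mylib_saved;
}
```

If one thread calls mylib_dosomething(NULL, to_stringz("config", 6)), another triggers pool_collect(), and some later allocation does to_stringz("oops!!", 6), then mylib_query() hands back "oops!!" rather than "config". With a real collector the block could just as well be reused for something that is not text at all.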
-------------------------------------------------
B) NOT RUNNING genCollect() WILL JUST KILL MEMORY
Given the example above, you may think that turning off the GC, not
running genCollect(), and then trying to free memory yourself as you go
might be a better idea.
Unfortunately D generally depends on, and expects you to be running, the
GC! A really good example is the toStringz() code.
What toStringz() does is this:
char* toStringz(char[] s) {
    char[] copy = new char[s.length + 1];
    copy[0 .. s.length] = s;
    copy[s.length] = 0; // pad end with \0
    return copy.ptr;
}
What a char[] looks like under gdb is
struct { length = 123; ptr = "xxxxxxxxxx" }
The above code is probably doing a malloc() for the struct and a
malloc() for the ptr data, but leaves the struct hanging, expecting it to
be garbage collected, and returns the ptr, which is then sent on to an
external method and might be expected to be constant. ***Well, I'm not 100%
sure about that behaviour; the struct may not be malloc'd, but just
allocated on the stack.***
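The second possibility can be pictured in C like this. It is a sketch under the assumption that a D char[] is a plain (length, pointer) pair passed by value, which is how the language documents dynamic arrays; struct char_slice and the allocations counter are names invented for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* C picture of a D char[]: a length and a pointer, passed around by value. */
struct char_slice {
    size_t length;
    char *ptr;
};

static int allocations = 0;       /* counts heap allocations made below */

static char *to_stringz(struct char_slice s) {
    /* The slice header is an ordinary local: it lives on the stack.
       Only the character data is allocated on the heap. */
    struct char_slice copy;
    copy.length = s.length + 1;
    copy.ptr = (char *)malloc(copy.length);
    assert(copy.ptr != NULL);
    allocations++;
    memcpy(copy.ptr, s.ptr, s.length);
    copy.ptr[s.length] = '\0';    /* pad end with \0 */
    return copy.ptr;              /* the header goes away with the stack frame */
}
```

Under that assumption there is one allocation per call, and the only thing left for the collector (or the caller) to clean up is the character data itself.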
There are many other examples in D of code where trying to manage and
clear memory explicitly is difficult or near impossible (associative
arrays for example), and the expectation is that the garbage collector
will be used to sort it out.
The reality is that it's extremely difficult to be sure that you are not
leaking memory. There are no built-in tools currently for testing this
(there is code on dsource, but I've not tested it yet), and it kind of
goes against the principle of D, where you can for the most part throw
away all that memory allocation code and concentrate on getting things
done.
-------------------------------------------------
POSSIBLE SOLUTION?
-------------------------------------------------
What I've hacked into the GC so far:
- getIdx() for std.thread.Thread, exposing the private thread.idx. ???
How reliable is this??? It should be OK as long as you never delete a
thread....
- A GC-log-like array recording which thread owns each bit of malloc()'d
memory.
- Checks in the genCollect() code to ensure that it does not try to free
another thread's memory.
- Warnings in free() when the program does exactly that.
- Information in LOGGING so that a malloc can be traced to a line of code
(class/method etc. - backtrace_symbols does not appear to return line
numbers...) - see recent post
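The ownership log and the free() check from the list above could look roughly like this in C. It is a sketch of the idea only, not the patch itself: the table size, the integer thread index argument and the names tracked_malloc / tracked_free are assumptions, and a real version inside the GC would have to take the GC lock around the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_TRACKED 16

/* One log entry per live allocation: which thread index asked for it. */
struct owner_entry {
    void *ptr;
    int thread_idx;
};

static struct owner_entry owner_log[MAX_TRACKED];
static int foreign_free_warnings = 0;

/* Allocate and record the owning thread index. */
static void *tracked_malloc(size_t n, int thread_idx) {
    void *p = malloc(n);
    if (p == NULL) return NULL;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (owner_log[i].ptr == NULL) {
            owner_log[i].ptr = p;
            owner_log[i].thread_idx = thread_idx;
            return p;
        }
    }
    free(p);                      /* log is full: refuse rather than lose track */
    return NULL;
}

/* Free only when the caller owns the block.
   Returns 1 if freed, 0 if refused or unknown. */
static int tracked_free(void *p, int thread_idx) {
    assert(p != NULL);
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (owner_log[i].ptr == p) {
            if (owner_log[i].thread_idx != thread_idx) {
                foreign_free_warnings++;   /* cross-thread free attempted */
                return 0;
            }
            owner_log[i].ptr = NULL;
            free(p);
            return 1;
        }
    }
    return 0;                     /* not a block this log knows about */
}
```

The linear search makes the cost of the approach obvious: every free has to walk the log, which is the same inefficiency the collector itself suffers from when it scans every thread's memory.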
At this point running a threaded server under heavy load is a lot more
stable than before. However, as you can understand by looking at the
descriptions above, it still has some serious issues:
- Searching all the threads' memory each time it wants to tidy memory a
little is horribly inefficient.
- There are still issues with the const char* that should be checked for.
ONGOING IDEAS
-------------------------------------------------
- One gcx pool for each thread.
This seems like the most sensible solution. It would mean that a
genCollect() run on a specific thread would only look at memory related
to that thread.
** Implications
+ memory allocated in one thread and used in another could not be
freed by the other thread. -- need to think more about this.
++ if you pass a complex structure when starting a thread, you will
have to flag it as a root in the parent thread to ensure that it's not
freed by the starting thread.
++ the child thread will then need a way to flag that it is done, so the
starting thread knows it can free it.
++ making changes to an object in the child thread may make memory
allocation problematic -> the child thread does not know about the
parent?!?!
- rooting the results from toStringz() - and warning users!!!!!
- using strncpy and removing the need for the struct?? (not sure if it
makes much difference - it's based on the struct assumption made previously)
Suggested code for toStringz() = something like:
char* toStringz(char[] s) {
    char* c = cast(char*) malloc(s.length + 1);
    strncpy(c, s.ptr, s.length);
    c[s.length] = 0;
    gc.addRoot(c);
    return c;
}
free() should check the roots to see if the pointer is registered there,
to prevent it being free'd by accident (genCollect() already does). An
additional method, rootFree(), would basically do removeRoot() + free()...
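A rough C sketch of that pairing, assuming a simple fixed-size root table. The names add_root, remove_root, checked_free and root_free are placeholders for addRoot(), removeRoot(), the guarded free() and the proposed rootFree(); none of this is existing GC code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_ROOTS 16

static void *roots[MAX_ROOTS];    /* NULL marks an empty slot */

static int is_root(const void *p) {
    if (p == NULL) return 0;
    for (size_t i = 0; i < MAX_ROOTS; i++)
        if (roots[i] == p) return 1;
    return 0;
}

/* Register a pointer as a root. Returns 1 on success, 0 if the table is full. */
static int add_root(void *p) {
    assert(p != NULL);
    for (size_t i = 0; i < MAX_ROOTS; i++) {
        if (roots[i] == NULL) {
            roots[i] = p;
            return 1;
        }
    }
    return 0;
}

/* Unregister a root. Returns 1 if it was registered, 0 otherwise. */
static int remove_root(void *p) {
    if (p == NULL) return 0;
    for (size_t i = 0; i < MAX_ROOTS; i++) {
        if (roots[i] == p) {
            roots[i] = NULL;
            return 1;
        }
    }
    return 0;
}

/* free() that refuses rooted blocks. Returns 1 if freed, 0 if refused. */
static int checked_free(void *p) {
    if (is_root(p)) return 0;     /* still rooted: someone depends on it */
    free(p);
    return 1;
}

/* rootFree(): unregister and release in one step. */
static int root_free(void *p) {
    if (!remove_root(p)) return 0;
    free(p);
    return 1;
}
```

With this split, a stray free on a rooted string is refused, and the only way to release it is the explicit root_free() call made by whoever knows the C library is finished with the pointer.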
Anyway, I'd be interested in feedback.
Regards
Alan