Threads and GC

Sean Kelly sean at f4.ca
Fri Mar 17 12:51:33 PST 2006


Juan Jose Comellas wrote:
> I'm having a problem with the garbage collector when working with threads
> and DMD 0.149 on Linux. I'm currently writing an application to test some
> socket-related functionality and it's crashing whenever the garbage
> collector kicks in. 
> 
> I have two threads (one acting as server and the other one acting as
> client). Both threads are running tight loops processing messages from each
> other. In each of the iterations, a small amount of memory is used. At some
> point, the garbage collector is activated and the SIGUSR1 signal is sent to
> suspend all the other threads, and just after that I see a crash in the
> other thread.
> 
> From what I've seen of Phobos, when activating the garbage collector, the
> threads are suspended using the SIGUSR1 signal and are resumed with the
> SIGUSR2 signal. In my test I never see the SIGUSR2 signal being sent.
> 
> Has anybody else seen something like this before? It seems that Sean and
> Kris have found some problem with the GC too in Ares, but I haven't read
> their postings yet (dsource.org is down right now).

To sum up, Kris ran into deadlock problems with both Phobos and Ares.
I've since fixed Ares and have been trying to suss out the Phobos
issues.  So far I've focused on the Win32 code, where I found a
potential resource leak in Phobos threads but no sign of a deadlock
yet.  I should probably give the Posix code a look as well.

> In case anybody else finds the backtraces useful, I'm including what I could
> get using an unpatched gdb:
> 
> 
> Program received signal SIGUSR1, User defined signal 1.
> [Switching to Thread 1442708400 (LWP 8344)]
> 0x5557a84e in send () from /lib/tls/libpthread.so.0
> (gdb) bt
> #0  0x5557a84e in send () from /lib/tls/libpthread.so.0
> #1  0x08053641 in
> _D5mango2io6Socket6Socket4sendFAvE5mango2io6Socket6Socket5FlagsZi ()
> at /home/jcomellas/devel/d/mango_test/mango/io/Socket.d:1413
> #2  0x08052e70 in _D5mango2io6Socket6Socket6writerFAvZk ()
> at /home/jcomellas/devel/d/mango_test/mango/io/Socket.d:869
> #3  0x0804ef8e in _D5mango2io7Conduit7Conduit5writeFAvZk ()
> at /home/jcomellas/devel/d/mango_test/mango/io/Conduit.d:198
> #4  0x0805c881 in _D8selector16clientThreadFuncFZv () at selector.d:338
> #5  0x0805c776 in _D8selector21dummyClientThreadFuncFPvZi () at
> selector.d:308
> #6  0x08063213 in _D3std6thread6Thread3runFZi ()
> #7  0x08063557 in _D3std6thread6Thread11threadstartUPvZPv ()
> #8  0x55575cfd in start_thread () from /lib/tls/libpthread.so.0
> #9  0x5567913e in clone () from /lib/tls/libc.so.6
> (gdb) cont
> Continuing.
> 
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 1433270496 (LWP 8341)]
> 0x080673b1 in _D3gcx3Gcx4markFPvPvZv ()
> (gdb) bt
> #0  0x080673b1 in _D3gcx3Gcx4markFPvPvZv ()
> #1  0x080675a8 in _D3gcx3Gcx11fullcollectFPvZk ()
> #2  0x0806746a in _D3gcx3Gcx16fullcollectshellFZk ()
> #3  0x080665bc in _D3gcx2GC12mallocNoSyncFkZPv ()
> #4  0x0806650c in _D3gcx2GC6mallocFkZPv ()
> #5  0x08062686 in _d_newclass ()
> #6  0x08056004 in
> _D5mango10containers7HashMap89__T7HashMapTT5mango2io5model8IConduit8IConduit6HandleTC5mango2io5model8IConduit8IConduitZ7HashMap8iteratorFZC5mango10containers8Iterator101__T18MutableMapIteratorTT5mango2io5model8IConduit8IConduit6HandleTC5mango2io5model8IConduit8IConduitZ18MutableMapIterator
> ()
>    
> at /home/jcomellas/devel/d/mango_test/mango/io/selector/SelectSelector.d:303
> #7  0x08055439 in
> _D5mango2io8selector14SelectSelector18SelectSelectionSet7opApplyFDFKC5mango2io8selector5model9ISelector12SelectionKeyZiZi
> ()
>    
> at /home/jcomellas/devel/d/mango_test/mango/io/selector/SelectSelector.d:609
> #8  0x0805c5ca in
> _D8selector12testSelectorFC5mango2io8selector5model9ISelector9ISelectorZv
> () at selector.d:130
> #9  0x0805c349 in _Dmain () at selector.d:47
> #10 0x0805e52b in main ()

Hrm, so the GC thread blows up while trying to scan into pthread library 
code?  I don't see any reason for this to happen, so long as the stack 
range being passed to the GC is valid.  I know there are some library 
functions that are not considered cancelable, but I would think that 
they simply turn off signal handling for the span where that's true.
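
For anyone following along, here is a rough C sketch of the
suspend/resume handshake being discussed: the collecting thread sends
SIGUSR1, the target parks itself inside the handler and acknowledges,
and SIGUSR2 lets it continue.  This is not the Phobos or Ares code; it
is only an illustration of the general technique.  All of the names
(run_demo, suspend_handler, saved_sp and so on) are made up for the
example, and the semaphore acknowledgement is an assumption about how
such a handshake is commonly done.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

static sem_t suspend_ack;          /* posted once the target is parked        */
static atomic_int resumed;         /* number of times a parked thread resumed */
static atomic_int stop_worker;     /* tells the worker loop to exit           */
static volatile uintptr_t saved_sp;/* approximate stack top of parked thread  */

/* SIGUSR1: record roughly where the stack ends, acknowledge, then sleep
 * until SIGUSR2 arrives.  SIGUSR2 is blocked while this handler runs
 * (see sa_mask below), so it cannot slip in between sem_post() and
 * sigsuspend() and get lost. */
static void suspend_handler(int sig)
{
    int marker;                    /* its address approximates the stack top */
    int saved_errno = errno;
    sigset_t wait_mask;
    (void)sig;

    saved_sp = (uintptr_t)&marker;
    sigfillset(&wait_mask);
    sigdelset(&wait_mask, SIGUSR2);
    sem_post(&suspend_ack);        /* async-signal-safe */
    sigsuspend(&wait_mask);        /* returns after the SIGUSR2 handler ran */
    atomic_fetch_add(&resumed, 1);
    errno = saved_errno;
}

/* SIGUSR2: nothing to do; its delivery is what wakes sigsuspend(). */
static void resume_handler(int sig) { (void)sig; }

static void *worker(void *arg)
{
    struct timespec nap = { 0, 1000000 };   /* 1 ms */
    (void)arg;
    while (!atomic_load(&stop_worker))
        nanosleep(&nap, NULL);
    return NULL;
}

/* Returns 1 if the worker was parked during the "collection" and
 * resumed exactly once afterwards, 0 otherwise. */
int run_demo(void)
{
    struct sigaction sa;
    pthread_t tid;
    int still_parked;

    atomic_store(&resumed, 0);
    atomic_store(&stop_worker, 0);
    saved_sp = 0;
    if (sem_init(&suspend_ack, 0, 0) != 0)
        return 0;

    memset(&sa, 0, sizeof sa);
    sigfillset(&sa.sa_mask);       /* block everything inside the handlers */
    sa.sa_flags = SA_RESTART;
    sa.sa_handler = suspend_handler;
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return 0;
    sa.sa_handler = resume_handler;
    if (sigaction(SIGUSR2, &sa, NULL) != 0)
        return 0;

    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return 0;

    pthread_kill(tid, SIGUSR1);    /* ask the worker to suspend */
    while (sem_wait(&suspend_ack) == -1 && errno == EINTR)
        ;                          /* wait for the acknowledgement */
    assert(saved_sp != 0);         /* handler recorded its stack position */

    /* The world is stopped here.  A collector would now scan the
     * worker's stack between saved_sp and that thread's stack base. */
    still_parked = (atomic_load(&resumed) == 0);

    pthread_kill(tid, SIGUSR2);    /* let the worker continue */
    atomic_store(&stop_worker, 1);
    pthread_join(tid, NULL);
    sem_destroy(&suspend_ack);
    return still_parked && atomic_load(&resumed) == 1;
}
```

Calling run_demo() from main() on a Linux box (built with -pthread)
should return 1.  The point of the sketch is that the stack range
handed to the collector is only trustworthy after the acknowledgement
has been received, and that a resume signal which is never sent leaves
the target thread parked in sigsuspend() indefinitely.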


Sean


