Multithreading woes on Linux
Dave
Dave_member at pathlink.com
Sun Apr 23 14:39:08 PDT 2006
I just ran into this - the fix in std/thread.d:
extern (C) static void pauseHandler(int sig)
{   int result;

    // Save all registers on the stack so they'll be scanned by the GC
    asm
    {
        pusha ;
    }

    assert(sig == SIGUSR1);
    // Move sem_post to after t.stackTop = getESP();
    //sem_post(&flagSuspend);

    sigset_t sigmask;
    result = sigfillset(&sigmask);
    assert(result == 0);
    result = sigdelset(&sigmask, SIGUSR2);
    assert(result == 0);

    Thread t = getThis();
    t.stackTop = getESP();
    t.flags &= ~1;
    sem_post(&flagSuspend);     // HERE

    while (1)
    {
        sigsuspend(&sigmask);   // suspend until SIGUSR2
        if (t.flags & 1)        // ensure it was resumeHandler()
            break;
    }

    // Restore all registers
    asm
    {
        popa ;
    }
}
The problem is that t.stackTop is not yet valid when it is passed into
gcx.mark(). With sem_post() called first, pauseAll() can return (and let
the GC commence) before stackTop has been set for all of the paused
threads, so the value the GC sees is still being munged.
Please give it a try. If it also solves your problem, then it will be
a confirmed fix.
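For anyone who wants to look at the ordering in isolation, here is a rough C sketch of the same handshake using the raw POSIX calls. It is not the Phobos code. The names (flag_suspend, stack_top, resumed, run_pause_resume_demo and so on) are made up for illustration, a single set of globals stands in for the per-thread Thread object, and the address of a local variable stands in for getESP(). The only point is that the paused thread publishes its stack top before it posts the semaphore the collector is waiting on.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <string.h>

/* Hypothetical stand-ins for the state std/thread.d keeps per thread. */
static sem_t flag_suspend;             /* posted once the thread is parked */
static sem_t worker_ready;             /* demo only: worker has started    */
static sem_t worker_done;              /* demo only: worker may exit       */
static void *volatile stack_top;       /* plays the role of t.stackTop     */
static volatile sig_atomic_t resumed;  /* plays the role of t.flags & 1    */

/* sem_wait() may fail with EINTR when a handler runs; just retry. */
static void wait_sem(sem_t *s)
{
    while (sem_wait(s) == -1 && errno == EINTR)
        continue;
}

static void resume_handler(int sig)
{
    (void)sig;
    resumed = 1;
}

static void pause_handler(int sig)
{
    sigset_t mask;
    int marker;                        /* its address approximates ESP */

    (void)sig;
    sigfillset(&mask);
    sigdelset(&mask, SIGUSR2);

    stack_top = (void *)&marker;       /* 1. publish the stack top        */
    resumed = 0;
    sem_post(&flag_suspend);           /* 2. only now wake the collector  */

    while (!resumed)
        sigsuspend(&mask);             /* 3. park until SIGUSR2 arrives   */
}

static void *worker(void *arg)
{
    (void)arg;
    sem_post(&worker_ready);
    wait_sem(&worker_done);            /* idle until the demo is finished */
    return NULL;
}

/* Returns 1 if the stack top was already set when the "collector" woke up. */
int run_pause_resume_demo(void)
{
    struct sigaction sa;
    pthread_t tid;
    void *seen;
    int rc;

    sem_init(&flag_suspend, 0, 0);
    sem_init(&worker_ready, 0, 0);
    sem_init(&worker_done, 0, 0);
    stack_top = NULL;

    /* Block every signal inside both handlers, so SIGUSR2 can only be
       delivered while the paused thread sits in sigsuspend(). */
    memset(&sa, 0, sizeof sa);
    sigfillset(&sa.sa_mask);
    sa.sa_handler = pause_handler;
    sigaction(SIGUSR1, &sa, NULL);
    sa.sa_handler = resume_handler;
    sigaction(SIGUSR2, &sa, NULL);

    rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);
    wait_sem(&worker_ready);

    pthread_kill(tid, SIGUSR1);        /* what pauseAll() does per thread  */
    wait_sem(&flag_suspend);           /* returns only after step 2 above  */
    seen = stack_top;                  /* what the GC would scan from      */

    pthread_kill(tid, SIGUSR2);        /* what resumeAll() does per thread */
    sem_post(&worker_done);
    pthread_join(tid, NULL);

    sem_destroy(&flag_suspend);
    sem_destroy(&worker_ready);
    sem_destroy(&worker_done);
    return seen != NULL;
}
```

Call run_pause_resume_demo() from a main() and build with -pthread. If you swap steps 1 and 2 in pause_handler(), the collector side is free to read stack_top before it has been written, which is the window the fix above closes.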
- Dave
Juan Jose Comellas wrote:
> It seems that there is a problem in the code generated by DMD or the code in
> Phobos when using multithreading on Linux. I've been trying several ways of
> rewriting my programs to avoid this problem, but I've had no success so
> far. The crashes always happen inside the garbage collector. The line
> reported by gdb is:
>
> #0 0x0806a978 in _D3gcx3Gcx4markFPvPvZv () at gcx.d:1318
> 1318 byte *p = cast(byte *)(*p1);
>
> It looks like the pointer that's being dereferenced by the GC is invalid.
> I've added checks before this line to see if it was a NULL pointer and it's
> not. Surprisingly (or not), my program crashes almost immediately if Phobos
> and the GC are compiled with optimizations. If I only leave "-g" as the
> DFLAGS in the makefiles I get these crashes much less frequently.
>
> In the test program I'm using I have two threads. The crash is happening on
> thread 1. The full backtrace I get for the crash is attached to this post.
>
> I'm trying to write a simplified sample program and I'll post it once I have
> it ready. Walter, if you have a minute, I'd appreciate you looking into
> this.
>
>
> ------------------------------------------------------------------------
>
> (gdb) thread apply all bt
>
> Thread 2 (process 8953):
> #0 0x5557db9d in sem_post@GLIBC_2.0 () from /lib/tls/libpthread.so.0
> #1 0x08062f27 in _D3std6thread6Thread12pauseHandlerUiZv () at std/thread.d:940
> #2 <signal handler called>
> #3 0x5557e83e in send () from /lib/tls/libpthread.so.0
> #4 0x08050a61 in _D5mango2io6Socket6Socket4sendFAvE5mango2io6Socket6Socket5FlagsZi () at /home/jcomellas/devel/d/mango_test/mango/io/Socket.d:1423
> #5 0x08050290 in _D5mango2io6Socket6Socket6writerFAvZk () at /home/jcomellas/devel/d/mango_test/mango/io/Socket.d:879
> #6 0x0804cbde in _D5mango2io7Conduit7Conduit5writeFAvZk () at /home/jcomellas/devel/d/mango_test/mango/io/Conduit.d:198
> #7 0x0805821f in _D8selector16clientThreadFuncFZv () at selector.d:363
> #8 0x0805816e in _D8selector21dummyClientThreadFuncFPvZi () at selector.d:327
> #9 0x080628c5 in _D3std6thread6Thread3runFZi () at std/thread.d:609
> #10 0x08062d50 in _D3std6thread6Thread11threadstartUPvZPv () at std/thread.d:845
> #11 0x55579ced in start_thread () from /lib/tls/libpthread.so.0
> #12 0x5567ddde in clone () from /lib/tls/libc.so.6
>
> Thread 1 (process 8949):
> #0 0x0806a978 in _D3gcx3Gcx4markFPvPvZv () at gcx.d:1318
> #1 0x0806ad05 in _D3gcx3Gcx11fullcollectFPvZk () at gcx.d:1462
> #2 0x0806aab5 in _D3gcx3Gcx16fullcollectshellFZk () at gcx.d:1382
> #3 0x080692de in _D3gcx2GC12mallocNoSyncFkZPv () at gcx.d:275
> #4 0x080691c1 in _D3gcx2GC6mallocFkZPv () at gcx.d:228
> #5 0x080684db in _d_newclass () at gc.d:127
> #6 0x08053df7 in _D5mango2io8selector12PollSelector12PollSelector11selectedSetFZC5mango2io8selector5model9ISelector13ISelectionSet ()
> at /home/jcomellas/devel/d/mango_test/mango/io/selector/PollSelector.d:353
> #7 0x08057d69 in _D8selector12testSelectorFC5mango2io8selector5model9ISelector9ISelectorZv () at selector.d:142
> #8 0x08057c24 in _Dmain () at selector.d:66
> #9 0x0805a38a in main () at internal/dmain2.d:94
>