[Issue 15939] GC.collect causes deadlock in multi-threaded environment

via Digitalmars-d-bugs digitalmars-d-bugs at puremagic.com
Wed May 11 15:13:14 PDT 2016


--- Comment #13 from Aleksei Preobrazhenskii <apreobrazhensky at gmail.com> ---
I saw a new deadlock with different symptoms today.

Stack trace of collecting thread:

Thread XX (Thread 0x7fda6ffff700 (LWP 32383)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
#1  0x00000000007b4046 in thread_suspendAll ()
#2  0x00000000007998dd in gc.gc.Gcx.fullcollect() ()
#3  0x0000000000797e24 in gc.gc.Gcx.bigAlloc() ()
#4  0x000000000079bb5f in
#5  0x000000000079548e in gc.gc.GC.malloc() ()
#6  0x0000000000760ac7 in gc_qalloc ()
#7  0x000000000076437b in _d_arraysetlengthT ()
...application stack

Stack traces of other threads:

Thread XX (Thread 0x7fda5cff9700 (LWP 32402)):
#0  0x00007fda78927454 in do_sigsuspend (set=0x7fda5cff76c0) at
#1  __GI___sigsuspend (set=<optimized out>) at
#2  0x000000000075d979 in core.thread.thread_suspendHandler() ()
#3  0x000000000075e220 in core.thread.callWithStackShell() ()
#4  0x000000000075d907 in thread_suspendHandler ()
#5  <signal handler called>
#6  pthread_cond_wait@@GLIBC_2.3.2 () at
#7  0x0000000000760069 in core.sync.condition.Condition.wait() ()
...application stack

All suspending signals were delivered, but it seems the number of calls to
sem_wait differed from the number of calls to sem_post (or something similar).
I have no reasonable explanation for that.

It doesn't invalidate the hypothesis that RT signals helped with the original
deadlock, though.
