Openwrt Linux Uclibc ARM GC issue

Radu void at null.pt
Sun Dec 17 17:12:41 UTC 2017


On Friday, 15 December 2017 at 14:24:08 UTC, David Nadlinger 
wrote:
> On 15 Dec 2017, at 14:06, Radu via digitalmars-d-ldc wrote:
>> When run, I get this error spuriously:
>>
>> ====================================
>> core.exception.AssertError at rt/sections_elf_shared.d(116): 
>> Assertion failure
>> Fatal error in EH code: _Unwind_RaiseException failed with 
>> reason code: 9
>> Aborted (core dumped)
>> ====================================
>
> The assert is inside an invariant which checks that the TLS 
> information has been extracted successfully. Perhaps uclibc 
> uses a TLS implementation that is not ABI-compatible with 
> glibc? (druntime needs to determine the TLS ranges to register 
> them with the GC, for the main thread as well as newly spawned 
> ones.)
>
> Where in the program lifecycle does the error occur? From the 
> backtrace, it looks like during C runtime startup, in which 
> case I am not quite seeing the connection to the GC.
>
> Why unwinding fails is another question, but not one I would be 
> terribly worried about – it is possible that the error e.g. 
> just occurs too early for the EH machinery to be properly set 
> up yet. Other low-level parts of druntime have been converted 
> to directly abort (e.g. using assert(0)) instead. In fact, I am 
> about to overhaul sections_elf_shared in that respect anyway to 
> improve error reporting when mixing shared and non-shared 
> builds.
>
>  — David

My various attempts at getting it to run behaved very erratically,
so I changed the cross-compilation parameters: I removed all
architecture-specific options, leaving only
`-mtriple=arm-linux-gnueabihf`, plus `-mfloat-abi=hard` on the C side.

My test hardware is an ARM Cortex-A7: http://linux-sunxi.org/A33

With the compiler switches changed, I could run my test program 
and try the druntime test runner (albeit with some changes to 
math and stdio to get it to link):

./druntime-test-runner
0.000s PASS release32 core.atomic
0.000s PASS release32 core.bitop
0.000s PASS release32 core.checkedint
0.005s PASS release32 core.demangle
0.000s PASS release32 core.exception
0.002s PASS release32 core.internal.arrayop
0.000s PASS release32 core.internal.convert
0.000s PASS release32 core.internal.hash
0.000s PASS release32 core.internal.string
0.000s PASS release32 core.math
0.000s PASS release32 core.memory
0.002s PASS release32 core.sync.barrier
0.015s PASS release32 core.sync.condition
0.000s PASS release32 core.sync.config
0.016s PASS release32 core.sync.mutex
0.016s PASS release32 core.sync.rwmutex
0.002s PASS release32 core.sync.semaphore
Segmentation fault (core dumped)

The segfault comes from the unittest at core.thread:1351:

unittest
{
     auto t1 = new Thread({
         foreach (_; 0 .. 20)
             Thread.getAll;
     }).start;
     auto t2 = new Thread({
         foreach (_; 0 .. 20)
             GC.collect; // this seg faults
     }).start;
     t1.join();
     t2.join();
}

Calling GC.collect from the main thread doesn't seg fault.
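
For reference, a standalone reproduction along these lines could look 
like the sketch below. It is an assumption about how the unittest 
would be wrapped into a program, not the exact file used here: the 
imports, the `main` wrapper and the initial main-thread collection 
are added for illustration only.

```d
import core.memory : GC;
import core.thread : Thread;

void main()
{
    // Collecting from the main thread, which does not crash here.
    GC.collect();

    // Same body as the core.thread unittest: one thread enumerates
    // all threads while another repeatedly triggers a collection.
    auto t1 = new Thread({
        foreach (_; 0 .. 20)
            Thread.getAll();
    }).start();
    auto t2 = new Thread({
        foreach (_; 0 .. 20)
            GC.collect(); // the call that segfaults on the target
    }).start();

    t1.join();
    t2.join();
}
```

Built with the same `-mtriple=arm-linux-gnueabihf` switch, this keeps 
the failing case separate from the rest of the test runner.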

The core dump is not very helpful (the stack is garbage), but 
running a minimal program containing the unit test under gdbserver, 
I can see this:

Thread 1 "test" received signal SIGUSR1, User defined signal 1.
pthread_getattr_np (thread_id=0, attr=0xb6b302bc) at libpthread/nptl/pthread_getattr_np.c:47
47        iattr->schedpolicy = thread->schedpolicy;
(gdb) step

Thread 1 "test" received signal SIGUSR2, User defined signal 2.
0xb6e50d80 in epoll_wait (epfd=-1090521272, events=0x8, maxevents=2, timeout=-1224756080) at libc/sysdeps/linux/common/epoll.c:58
58      CANCELLABLE_SYSCALL(int, epoll_wait, (int epfd, struct epoll_event *events, int maxevents, int timeout),
(gdb) step

Thread 1 "test" received signal SIGSEGV, Segmentation fault.
0xfffffffc in ?? ()
(gdb)



