[Bug 97] New: Experiencing intermittent crash in rt_init() when loading DLL

gdc-bugzilla at gdcproject.org gdc-bugzilla at gdcproject.org
Sat Feb 1 02:23:24 PST 2014


http://bugzilla.gdcproject.org/show_bug.cgi?id=97

             Bug #: 97
           Summary: Experiencing intermittent crash in rt_init() when
                    loading DLL
    Classification: Unclassified
           Product: GDC
           Version: development
          Platform: x86
        OS/Version: Other
            Status: NEW
          Severity: critical
          Priority: Normal
         Component: gdc
        AssignedTo: ibuclaw at gdcproject.org
        ReportedBy: slavo5150 at yahoo.com


Created attachment 57
  --> http://bugzilla.gdcproject.org/attachment.cgi?id=57
rt_init-crash.png

This was migrated from
https://bitbucket.org/goshawk/gdc/issue/351/experiencing-intermittent-crash-in-rt_init

Manu Evans created an issue 2012-06-18
****************************************
I have a crash that only happens occasionally when loading a GDC-64 DLL. The
same DLL may work or not work depending on the direction of the wind, though it
seems to crash far less often than it loads successfully.

The callstack and various other details are visible in the attached image...

It appears to crash fetching __blkcache_storage, a TLS variable. The code that
loads it looks odd to me. A few points of interest:

* How can the final mov refer to rbx when only ebx was loaded? Who's to say the
top bits will be zero?
* The magic address doesn't appear to be a valid offset to me...
* rsi is a good pointer, but it points to a bunch of string data, including
source code snippets. Not what I expected... moduleinfo of some sort?
debuginfo?
* The same pattern of loading ebx and using rbx is repeated above with
eax->rax, except the wild absolute magic number is dereferenced this time...
(how does that even work?) 

I don't follow the code GDC is generating here :/ .. Does it look okay to
anyone else?

This is affecting our whole team daily... any input or ideas what might be
going on would be much appreciated!

See attachment <rt_init-crash.png>

Manu Evans - 2012-06-18
****************************************
edited description


Iain Buclaw - 2012-06-19
****************************************
* changed status to open
* assigned issue to Daniel Green 

I'm not sure rt.lifetime is well suited for shared libraries on Windows yet.

Daniel, could you look into this?


Manu Evans - 2012-06-19
****************************************
I can probably supply a binary... but I think the precise context when loading
the dll is critical, because it usually loads fine without problems, so a
binary may not be of any use.


Daniel Green - 2012-06-20
****************************************
Looking at that, it's definitely crashing in loading a TLS variable.

* The use of EBX/RBX is acceptable as 32-bit operations are implicitly zero
extended to 64-bit.
* The issue looks to be with the value being loaded: 0xACEE47F8 (roughly 3 billion).
* RSI should be ok, as it doesn't require TLS relative location to function. 
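
To illustrate the first point: on x86-64, writing a 32-bit register such as EBX clears the upper 32 bits of the matching 64-bit register, so using RBX right after loading EBX is safe. The sketch below only models that architectural rule in Python; the helper name `write_reg32` is hypothetical and is not part of any toolchain.

```python
MASK32 = 0xFFFFFFFF

def write_reg32(old_reg64: int, value: int) -> int:
    """Model `mov ebx, imm32` on x86-64.

    The previous 64-bit contents are discarded on purpose: the result is
    the 32-bit value zero-extended to 64 bits.
    """
    return value & MASK32

# Even if RBX held garbage in its upper half beforehand...
rbx = 0xDEADBEEF_00000000
# ...after the 32-bit write only the low 32 bits are set.
rbx = write_reg32(rbx, 0xACEE47F8)
print(hex(rbx))
```

So the zero-extension itself is not the problem; the suspicious part is the size of the value being loaded.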

This value loaded into EBX should be relative to the TLS section which means it
should be significantly smaller.

Did you custom build this? The value loaded into EBX is determined at link
time; without a TLS-aware assembler/linker you wouldn't get the relative offset.

Can you output a map file for the DLL with -Wl,-Map=output.map, and post the
output of the objdump command below for lifetime.o? To extract lifetime.o, run

ar x libgphobos.a lifetime.o

Then compare it with the dump below. Line 2ea is where the magic happens that
generates the relative offset.

I'll work on getting the assembly output from a Dll dump as well to ensure it's
linking properly.  

$ /c/MinGW64/bin/objdump.exe -d -r -M intel lifetime.o
00000000000002d0 <_D2rt8lifetime10__blkcacheFNdZPS2rt8lifetime7BlkInfo>:
     2d0:    56                       push   rsi
     2d1:    53                       push   rbx
     2d2:    48 83 ec 28              sub    rsp,0x28
     2d6:    8b 04 25 00 00 00 00     mov    eax,DWORD PTR ds:0x0
            2d9: R_X86_64_32S    _tls_index
     2dd:    65 48 8b 34 25 58 00     mov    rsi,QWORD PTR gs:0x58
     2e4:    00 00 
     2e6:    48 8b 34 c6              mov    rsi,QWORD PTR [rsi+rax*8]
     2ea:    bb 08 00 00 00           mov    ebx,0x8
            2eb: secrel32    .tls$GCC
     2ef:    48 8b 04 1e              mov    rax,QWORD PTR [rsi+rbx*1]
     2f3:    48 85 c0                 test   rax,rax
     2f6:    74 08                    je     300 <_D2rt8lifetime10__blkcacheFNdZPS2rt8lifetime7BlkInfo+0x30>
     2f8:    48 83 c4 28              add    rsp,0x28
     2fc:    5b                       pop    rbx
     2fd:    5e                       pop    rsi
     2fe:    c3                       ret    
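
The access sequence in the dump above can be read as three steps: fetch the thread's TLS array (on 64-bit Windows, `gs:0x58` is, as far as I know, the TEB's ThreadLocalStoragePointer), index it with `_tls_index` to get this module's TLS block, then add the section-relative offset of the variable. Here is a minimal Python model of those steps; the `teb` dictionary and the block addresses are hypothetical stand-ins, not values from the report.

```python
def tls_address(teb: dict, tls_index: int, secrel_offset: int) -> int:
    """Follow the same steps as the disassembly.

    teb           -- hypothetical stand-in for the thread's TEB
    tls_index     -- value the loader stores in _tls_index for this module
    secrel_offset -- section-relative offset of the variable (the imm32)
    """
    tls_array = teb["ThreadLocalStoragePointer"]  # mov rsi, gs:0x58
    block_base = tls_array[tls_index]             # mov rsi, [rsi+rax*8]
    return block_base + secrel_offset             # address read via [rsi+rbx*1]

# Hypothetical layout: this module's TLS block for the thread lives at 0x1000.
teb = {"ThreadLocalStoragePointer": [0x5000, 0x1000]}

good = tls_address(teb, tls_index=1, secrel_offset=0x8)
bad = tls_address(teb, tls_index=1, secrel_offset=0xACEE47F8)
print(hex(good), hex(bad))
```

With a small offset the address stays inside the module's TLS block; with the value from the crashing dump it lands billions of bytes past it.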


Daniel Green - 2012-06-24
****************************************
Here's the Map information from a DLL that was built.

.tls            0x000000006fa62000      0x200
                0x000000006fa62010                _D2rt8lifetime18__blkcache_storagePS2rt8lifetime7BlkInfo

Subtracting the .tls base address from the address of
_D2rt8lifetime18__blkcache_storagePS2rt8lifetime7BlkInfo gives a secrel32
offset of 0x10.

This is the same value as shown in the assembly dump from the DLL below, in
contrast with the value 0xACEE47F8 shown in your assembly dump.
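
The same check can be scripted against a map file. The following is only a sketch: the helper `secrel_offset` is hypothetical, and the regular expressions assume the two-line layout seen in the excerpt above (section line with base and size, then an indented symbol line), which may not hold for every ld map file.

```python
import re

# Map excerpt copied from the comment above (the wrapped line is rejoined).
MAP_EXCERPT = """\
.tls            0x000000006fa62000      0x200
                0x000000006fa62010               _D2rt8lifetime18__blkcache_storagePS2rt8lifetime7BlkInfo
"""

HEX = r"(0x[0-9a-fA-F]+)"

def secrel_offset(map_text: str, section: str, symbol: str):
    """Return (offset of symbol within section, section size)."""
    sec = re.search(rf"^{re.escape(section)}\s+{HEX}\s+{HEX}", map_text, re.M)
    sym = re.search(rf"^\s+{HEX}\s+{re.escape(symbol)}\s*$", map_text, re.M)
    if sec is None or sym is None:
        raise ValueError("section or symbol not found in map text")
    base, size = int(sec.group(1), 16), int(sec.group(2), 16)
    return int(sym.group(1), 16) - base, size

offset, size = secrel_offset(
    MAP_EXCERPT, ".tls",
    "_D2rt8lifetime18__blkcache_storagePS2rt8lifetime7BlkInfo")

print(hex(offset), offset < size)  # offset derived from the map file
print(0xACEE47F8 < size)           # value seen in the crashing dump
```

A valid secrel32 offset has to be smaller than the section size; the value from the crashing dump is not.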

000000006fa0972b <_D2rt8lifetime10__blkcacheFNdZPS2rt8lifetime7BlkInfo>:
        int __nextRndNum = 0;
    }
    int __nextBlkIdx;
}

@property BlkInfo *__blkcache()
    6fa0972b:    55                       push   rbp
    6fa0972c:    48 89 e5                 mov    rbp,rsp
    6fa0972f:    48 83 ec 30              sub    rsp,0x30
{
    if(!__blkcache_storage)
    6fa09733:    8b 04 25 4c c4 a5 6f     mov    eax,DWORD PTR ds:0x6fa5c44c
    6fa0973a:    65 48 8b 14 25 58 00     mov    rdx,QWORD PTR gs:0x58
    6fa09741:    00 00 
    6fa09743:    48 8b 14 c2              mov    rdx,QWORD PTR [rdx+rax*8]
    6fa09747:    b8 10 00 00 00           mov    eax,0x10
    6fa0974c:    48 8b 04 02              mov    rax,QWORD PTR [rdx+rax*1]
    6fa09750:    48 85 c0                 test   rax,rax
    6fa09753:    0f 95 c0                 setne  al
    6fa09756:    83 f0 01                 xor    eax,0x1
    6fa09759:    84 c0                    test   al,al
    6fa0975b:    74 5f                    je     6fa097bc <_D2rt8lifetime10__blkcacheFNdZPS2rt8lifetime7BlkInfo+0x91>

Manu Evans - 2012-06-24
****************************************
The objdump gave us the same thing you pasted.

So you think it's just a bad toolchain? Any chance of a new 2.059 toolchain
with that patch applied? Will that fix the problem?

It's very strange that this only occurs occasionally. You'd think this would
cause the DLL to fail to load every time... but we only see it fail
occasionally. Other times it loads just fine.

Does LoadLibrary actually patch the offsets in the loaded binary with the
absolute addresses as it loads or something?


Daniel Green - 2012-06-24
****************************************
Can you generate a Map file for the library? I'd like to compare the offsets
with what's in your assembly code.

If the objdump produced the secrel32 output, then it's probably something else.
With the data I had, that was the most likely scenario. The TLS patch fixes a
bug in the linker as well as giving access to secrel32 relocation in assembly.
If for some reason the compile/link phase was using a binutils (gas or ld) not
included with GDC, this type of issue would occur. It's still possible a
different linker is being used. ld bug

In order to figure out what else it could be, it's necessary to see the map
file and raw assembly for your dll.

Compile or link with -Wl,-Map=output.map.

objdump.exe -S -M intel mydll.dll > mydll.asm

The command above can be used to generate Intel-formatted assembly.

Random failures on accessing invalid memory are not as strange as you might
think. That's actually the first clue: you're accessing an invalid memory
location.
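
One way to picture this: whether a wild address faults depends on whether it happens to land in mapped memory, and the base of the per-thread TLS block can differ from run to run. The sketch below uses entirely hypothetical region and base addresses (nothing here is taken from the report) and only shows that the same bad offset can either hit mapped memory or fault.

```python
BAD_OFFSET = 0xACEE47F8  # value reported in the crashing dump

def is_mapped(addr: int, regions) -> bool:
    """True if addr falls inside any (start, size) region."""
    return any(start <= addr < start + size for start, size in regions)

# Hypothetical mapped regions in the process; not taken from the report.
regions = [(0x1_0000_0000, 0x1000_0000), (0x7FF0_0000_0000, 0x1000_0000)]

# Hypothetical TLS block bases on two different runs.
for tls_base in (0x6000_0000, 0x5_3200_0000):
    addr = tls_base + BAD_OFFSET
    outcome = "reads mapped memory" if is_mapped(addr, regions) else "faults"
    print(hex(tls_base), "->", hex(addr), outcome)
```

In the first case the read silently returns unrelated data; in the second it raises an access violation, which would look like an intermittent crash.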

I'll look into the runtime behavior of LoadLibrary after I've checked the map
and assembly output of the Dll.
