DLL crash inside removethreadtableentry - where's the source code for that?
Ben Davis
entheh at cantab.net
Sat Feb 16 19:07:20 PST 2013
Hi,
The user-mode driver I'm working on (a 32-bit DLL) is crashing Windows
Media Player on exit. (Two other host apps exit fine.) I can catch it in
the Visual Studio debugger, but only see assembly language. Initially
I'm just after tips on where to find source for the bits of D that are
involved, but maybe someone will recognise the problem already...
I've gone through the assembly in some detail, and established that the
crash is inside some removethreadtableentry() code which is called
shortly before DllMain(DLL_THREAD_DETACH), and must look something like:
//tid is the Windows numeric thread ID for the current thread
removethreadtableentry(tid) {
    foreach (i, obj in someObjArray1024EntriesLong) {
        if (obj != null && obj.someField == tid) goto foundIt;
    }
    return;
    //When we get here, i is 1 (pretend i and obj are in scope)
foundIt:
    someObjArray1024EntriesLong[i] = null;
    free(obj.something);      //Does nothing, already 0
    if (obj.somethingElse) {  //Not taken, already 0
        CloseHandle(obj.somethingElse);
    }
    free(obj);                //Crash inside this free()
}
Furthermore, I've established that:
- removethreadtableentry() doesn't get to foundIt for most threads.
- (almost certain) removethreadtableentry() isn't called at all for one
of the two host apps that work fine; and is called but doesn't get to
foundIt for the other app.
- (almost certain) removethreadtableentry() crashes the first time it
gets to foundIt.
(These are almost certain in the sense that I only set the breakpoint
after catching the first on-shutdown DLL_THREAD_DETACH, which means I
may have missed one; but it's unlikely.)
So basically this seems to point to some buggy code that hardly ever
runs, but does in my case. (Or it's designed for a slightly different
use of DLLs or something like that.)
For reference, the assembly language I analysed is below, but I think
the next step is if someone either wants to fix
removethreadtableentry(), or direct me to the source so I can
investigate further. (It is a D function, isn't it? It looks like D
naming as opposed to Microsoft naming.)
I'm off to bed, but will pick this up again tomorrow.
Full detail follows (but probably isn't worth reading).
The call stack looks like this:
myproject.dll!RTLMultiPool::SelectFree() + 0x17 bytes C++
myproject.dll!__removethreadtableentry() + 0x69 bytes C++
myproject.dll!__DllMainCRTStartup@12() + 0x10c bytes C++
ntdll.dll!_LdrpCallInitRoutine@16() + 0x14 bytes
ntdll.dll!_LdrShutdownThread@0() + 0xe2 bytes
ntdll.dll!_RtlExitUserThread@4() + 0x2a bytes
kernel32.dll!@BaseThreadInitThunk@12() + 0x19 bytes
ntdll.dll!___RtlUserThreadStart@8() + 0x27 bytes
ntdll.dll!__RtlUserThreadStart@8() + 0x1b bytes
When I view the assembly for __DllMainCRTStartup, I can see that this is
the function directly responsible for calling my DllMain function. There
seems to be only one place where it calls removethreadtableentry, and it
seems to be before a call to DllMain.
When I look at the assembly for removethreadtableentry, the crash
happens in the last call to free() before returning (marked with an
arrow), as follows:
__removethreadtableentry:
05A88F64 push eax
05A88F65 mov ecx,dword ptr [esp+8]
05A88F69 xor edx,edx
05A88F6B push ebx
05A88F6C push esi
05A88F6D jmp __removethreadtableentry+0Fh (5A88F73h)
05A88F6F pop esi
05A88F70 pop ebx
05A88F71 pop eax
05A88F72 ret
05A88F73 mov eax,dword ptr [___thdtbl (5AADFBCh)]
05A88F78 mov ebx,dword ptr [eax+edx*4]
05A88F7B test ebx,ebx
05A88F7D je __removethreadtableentry+20h (5A88F84h)
05A88F7F cmp dword ptr [ebx+18h],ecx
05A88F82 je __removethreadtableentry+2Bh (5A88F8Fh)
05A88F84 inc edx
05A88F85 cmp edx,400h
05A88F8B je __removethreadtableentry+0Bh (5A88F6Fh)
05A88F8D jmp __removethreadtableentry+0Fh (5A88F73h)
05A88F8F mov dword ptr [esp+8],edx *
05A88F93 mov ecx,dword ptr [esp+8]
05A88F97 mov edx,dword ptr [___thdtbl (5AADFBCh)]
05A88F9D mov esi,dword ptr [___thdtbl (5AADFBCh)]
05A88FA3 mov ebx,dword ptr [edx+ecx*4]
05A88FA6 mov dword ptr [esi+ecx*4],0
05A88FAD push dword ptr [ebx+20h]
05A88FB0 call _free (5A87118h)
05A88FB5 add esp,4
05A88FB8 cmp dword ptr [ebx+1Ch],0
05A88FBC je __removethreadtableentry+63h (5A88FC7h)
05A88FBE push dword ptr [ebx+1Ch]
05A88FC1 call dword ptr [__imp__CloseHandle@4 (5A42B28h)]
05A88FC7 push ebx
05A88FC8 call _free (5A87118h) <--------------------
05A88FCD add esp,4
05A88FD0 pop esi
05A88FD1 pop ebx
05A88FD2 pop eax
05A88FD3 ret
The crash is then somewhere deep inside free().
Further debugging shows that removethreadtableentry is searching through
a 1024-entry array of pointers, looking for a non-null pointer to an
object for which the field at offset 0x18 is the current thread ID
(which is in ecx). If it finds it, then it jumps to the point where I
put the *. The crash seems to happen the very first time this line is
hit (at least since I put the breakpoint there, which was after the
first call into my DllMain).
So in summary: a number of threads (7 to 10) are detached successfully
first, but they aren't in the table that removethreadtableentry
searches. The first thread that is found in that table triggers the
crash.
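To make the lookup described above concrete, here is a minimal C sketch of it. Only the table size (0x400) and the three field offsets (0x18, 0x1C, 0x20) come from the disassembly; the struct name, the field names and both function names are made up for illustration, and the real runtime source may well look different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define THREAD_TABLE_SIZE 0x400 /* 1024 slots, from "cmp edx,400h" */

/* Hypothetical layout: only the three offsets are taken from the asm.
   32-bit fields are used so the offsets hold on any host. */
typedef struct {
    uint8_t  unknown[0x18]; /* fields not identified */
    uint32_t thread_id;     /* 0x18: compared against the thread ID */
    uint32_t handle;        /* 0x1C: passed to CloseHandle if non-zero */
    uint32_t buffer;        /* 0x20: passed to the first free() */
} ThreadEntry;

static_assert(offsetof(ThreadEntry, thread_id) == 0x18, "thread_id offset");
static_assert(offsetof(ThreadEntry, handle) == 0x1C, "handle offset");
static_assert(offsetof(ThreadEntry, buffer) == 0x20, "buffer offset");

/* Returns the index of the first non-null slot whose entry has the
   given thread ID, or -1 if no slot matches (the early-return path). */
static int find_thread_slot(ThreadEntry **table, uint32_t tid)
{
    for (int i = 0; i < THREAD_TABLE_SIZE; i++) {
        if (table[i] != NULL && table[i]->thread_id == tid)
            return i;
    }
    return -1;
}

/* Mirrors the start of the "found" path: the slot is cleared before
   anything is freed. Returns the detached entry, or NULL if the thread
   is not in the table. Freeing is left to the caller in this sketch. */
static ThreadEntry *detach_thread_entry(ThreadEntry **table, uint32_t tid)
{
    int i = find_thread_slot(table, tid);
    if (i < 0)
        return NULL;
    ThreadEntry *entry = table[i];
    table[i] = NULL;
    return entry;
}
```

In the crashing run, the equivalent of find_thread_slot() returns 1, and the pointer taken from that slot is what the final free() chokes on.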
Finally, here's everything from the * to the call to free() (on a
different run, so different addresses), with some values annotated:
//edx is 1, so it's the second entry in the table.
05C08F8F mov dword ptr [esp+8],edx
05C08F93 mov ecx,dword ptr [esp+8]
//These set edx and esi to 0x05c2cd40.
05C08F97 mov edx,dword ptr [___thdtbl (5C2DFBCh)]
05C08F9D mov esi,dword ptr [___thdtbl (5C2DFBCh)]
//ecx is 1, and ebx becomes 0x05c29b9b.
05C08FA3 mov ebx,dword ptr [edx+ecx*4]
05C08FA6 mov dword ptr [esi+ecx*4],0
//This pushes 0, and the call to free() does nothing.
05C08FAD push dword ptr [ebx+20h]
05C08FB0 call _free (5C07118h)
05C08FB5 add esp,4
//This is 0 and the CloseHandle call is skipped.
05C08FB8 cmp dword ptr [ebx+1Ch],0
05C08FBC je __removethreadtableentry+63h (5C08FC7h)
05C08FBE push dword ptr [ebx+1Ch]
05C08FC1 call dword ptr [__imp__CloseHandle@4 (5BC2B28h)]
//ebx is unchanged from above, and this call crashes.
05C08FC7 push ebx
05C08FC8 call _free (5C07118h)
I also stepped inside free(), and the next interesting stuff happens
here (note I skipped free() itself and went straight to RTLMultiPool):
RTLMultiPool::Free:
05C0AC68 push ecx
05C0AC69 cmp dword ptr [esp+8],0
05C0AC6E je RTLMultiPool::Free+15h (5C0AC7Dh)
05C0AC70 mov eax,dword ptr [esp+8]
//eax is now 0x05c29b9b, the pointer we're trying to free
05C0AC74 lea edx,[eax-4]
//edx is now eax-4 = 0x05c29b97
05C0AC77 push edx
05C0AC78 call RTLMultiPool::SelectFree (5C0AC34h)
...
RTLMultiPool::SelectFree:
05C0AC34 push ecx
//This reads 0x05c29b97 into eax
05C0AC35 mov eax,dword ptr [esp+8]
//This reads the dword that eax points to; edx becomes 0
05C0AC39 mov edx,dword ptr [eax]
05C0AC3B push ebx
05C0AC3C push esi
//Looking at ecx+4 revealed the value 0x00000080 (128)
05C0AC3D cmp edx,dword ptr [ecx+4]
05C0AC40 ja RTLMultiPool::SelectFree+21h (5C0AC55h)
//So we get here
05C0AC42 lea ebx,[edx-1] //ebx = 0xffffffff
05C0AC45 shr ebx,3 //ebx = 0x1fffffff
05C0AC48 push eax
05C0AC49 mov esi,dword ptr [ecx] //esi = 0x0516000c
05C0AC4B mov ecx,dword ptr [esi+ebx*4] //crash!
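I expected esi + 0x1fffffff*4 to be basically esi-4, but shr shifts in
zeros, so ebx*4 is 0x7ffffffc rather than 0xfffffffc, and the read lands
at 0x0516000c + 0x7ffffffc = 0x85160008: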
Unhandled exception at 0x05c0ac4b (myproject.dll) in wmplayer.exe:
0xC0000005: Access violation reading location 0x85160008.
//Here's the rest of SelectFree FWIW.
05C0AC4E call RTLPool::Free (5C0D460h)
05C0AC53 jmp RTLMultiPool::SelectFree+2Dh (5C0AC61h)
05C0AC55 mov ecx,dword ptr [RTLHeap::pMainHeap (5C2B4FCh)]
05C0AC5B push eax
05C0AC5C call RTLHeap::Free (5C0D6B4h)
05C0AC61 pop esi
05C0AC62 pop ebx
05C0AC63 pop eax
05C0AC64 ret 4
05C0AC67 int 3
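The arithmetic in SelectFree can be checked with a small C sketch. The constants are the values seen in the debugger; the function and macro names are hypothetical, and reading the dword in front of the block as a size header is only a guess based on the compare against 0x80.

```c
#include <assert.h>
#include <stdint.h>

/* Values observed in the debugger session. */
#define SMALL_BLOCK_LIMIT 0x80u       /* dword at [ecx+4] */
#define POOL_TABLE_BASE   0x0516000Cu /* esi, loaded from [ecx] */

/* "cmp edx,[ecx+4]; ja ..." is an unsigned compare: the jump to the
   RTLHeap::Free path is taken only when the header exceeds the limit. */
static int takes_pool_path(uint32_t header)
{
    return header <= SMALL_BLOCK_LIMIT;
}

/* "lea ebx,[edx-1]; shr ebx,3": the subtraction wraps for header == 0
   and the logical shift fills the top bits with zeros. */
static uint32_t pool_index(uint32_t header)
{
    return (uint32_t)(header - 1u) >> 3;
}

/* Address read by "mov ecx,[esi+ebx*4]", in 32-bit arithmetic. */
static uint32_t pool_slot_address(uint32_t base, uint32_t header)
{
    return (uint32_t)(base + pool_index(header) * 4u);
}
```

With a header of 0, pool_index() gives 0x1fffffff and pool_slot_address(POOL_TABLE_BASE, 0) gives 0x85160008, which matches the address in the access violation. Any header from 1 to 128 would give an index between 0 and 15, so the zero in front of the block looks like the real problem.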