DLL crash inside removethreadtableentry - where's the source code for that?

Ben Davis entheh at cantab.net
Sat Feb 16 19:07:20 PST 2013


Hi,

The user-mode driver I'm working on (a 32-bit DLL) is crashing Windows 
Media Player on exit. (Two other host apps exit fine.) I can catch it in 
the Visual Studio debugger, but only see assembly language. Initially 
I'm just after tips on where to find source for the bits of D that are 
involved, but maybe someone will recognise the problem already...

I've gone through the assembly in some detail, and established that the 
crash is inside some removethreadtableentry() code which is called 
shortly before DllMain(DLL_THREAD_DETACH), and must look something like:

//tid is the Windows numeric thread ID for the current thread
removethreadtableentry(tid) {
   foreach (i, obj; someObjArray1024EntriesLong) {
     if (obj && obj.someField == tid) goto foundIt;
   }
   return;

   //When we get here, i is 1 (pretend i and obj are still in scope)
   foundIt:
   someObjArray1024EntriesLong[i] = null;  //The slot is cleared first
   free(obj.something);	//Does nothing, already 0
   if (obj.somethingElse) {  //Skipped, already 0
     CloseHandle(obj.somethingElse);
   }
   free(obj);	//Crash inside this free()
}

Furthermore, I've established that:

- removethreadtableentry() doesn't get to foundIt for most threads.

- (almost certain) removethreadtableentry() isn't called at all for one 
of the two host apps that work fine; and is called but doesn't get to 
foundIt for the other app.

- (almost certain) removethreadtableentry() crashes the first time it 
gets to foundIt.

(These are almost certain in the sense that I only set the breakpoint 
after catching the first on-shutdown DLL_THREAD_DETACH, which means I 
may have missed one; but it's unlikely.)

So basically this seems to point to some buggy code that hardly ever 
runs, but does in my case. (Or it's designed for a slightly different 
use of DLLs or something like that.)

For reference, the assembly language I analysed is below, but I think 
the next step is if someone either wants to fix 
removethreadtableentry(), or direct me to the source so I can 
investigate further. (It is a D function, isn't it? The name looks like D 
naming rather than Microsoft naming.)

I'm off to bed, but will pick this up again tomorrow.

Full detail follows (but probably isn't worth reading).

The call stack looks like this:

  	myproject.dll!RTLMultiPool::SelectFree()  + 0x17 bytes	C++
  	myproject.dll!__removethreadtableentry()  + 0x69 bytes	C++
  	myproject.dll!__DllMainCRTStartup@12()  + 0x10c bytes	C++
  	ntdll.dll!_LdrpCallInitRoutine@16()  + 0x14 bytes	
  	ntdll.dll!_LdrShutdownThread@0()  + 0xe2 bytes	
  	ntdll.dll!_RtlExitUserThread@4()  + 0x2a bytes	
  	kernel32.dll!@BaseThreadInitThunk@12()  + 0x19 bytes	
  	ntdll.dll!___RtlUserThreadStart@8()  + 0x27 bytes	
  	ntdll.dll!__RtlUserThreadStart@8()  + 0x1b bytes	

When I view the assembly for __DllMainCRTStartup, I can see that this is 
the function directly responsible for calling my DllMain function. There 
seems to be only one place where it calls removethreadtableentry, and it 
seems to be before a call to DllMain.

When I look at the assembly for removethreadtableentry, it's trying to 
make the last call to 'free' before returning, as follows:

__removethreadtableentry:
05A88F64  push        eax
05A88F65  mov         ecx,dword ptr [esp+8]
05A88F69  xor         edx,edx
05A88F6B  push        ebx
05A88F6C  push        esi
05A88F6D  jmp         __removethreadtableentry+0Fh (5A88F73h)
05A88F6F  pop         esi
05A88F70  pop         ebx
05A88F71  pop         eax
05A88F72  ret
05A88F73  mov         eax,dword ptr [___thdtbl (5AADFBCh)]
05A88F78  mov         ebx,dword ptr [eax+edx*4]
05A88F7B  test        ebx,ebx
05A88F7D  je          __removethreadtableentry+20h (5A88F84h)
05A88F7F  cmp         dword ptr [ebx+18h],ecx
05A88F82  je          __removethreadtableentry+2Bh (5A88F8Fh)
05A88F84  inc         edx
05A88F85  cmp         edx,400h
05A88F8B  je          __removethreadtableentry+0Bh (5A88F6Fh)
05A88F8D  jmp         __removethreadtableentry+0Fh (5A88F73h)
05A88F8F  mov         dword ptr [esp+8],edx        *
05A88F93  mov         ecx,dword ptr [esp+8]
05A88F97  mov         edx,dword ptr [___thdtbl (5AADFBCh)]
05A88F9D  mov         esi,dword ptr [___thdtbl (5AADFBCh)]
05A88FA3  mov         ebx,dword ptr [edx+ecx*4]
05A88FA6  mov         dword ptr [esi+ecx*4],0
05A88FAD  push        dword ptr [ebx+20h]
05A88FB0  call        _free (5A87118h)
05A88FB5  add         esp,4
05A88FB8  cmp         dword ptr [ebx+1Ch],0
05A88FBC  je          __removethreadtableentry+63h (5A88FC7h)
05A88FBE  push        dword ptr [ebx+1Ch]
05A88FC1  call        dword ptr [__imp__CloseHandle@4 (5A42B28h)]
05A88FC7  push        ebx
05A88FC8  call        _free (5A87118h)         <--------------------
05A88FCD  add         esp,4
05A88FD0  pop         esi
05A88FD1  pop         ebx
05A88FD2  pop         eax
05A88FD3  ret

The crash is then somewhere deep inside free().

Further debugging shows that removethreadtableentry is searching through 
a 1024-entry array of pointers, looking for a non-null pointer to an 
object for which the field at offset 0x18 is the current thread ID 
(which is in ecx). If it finds it, then it jumps to the point where I 
put the *. The crash seems to happen the very first time this line is 
hit (at least since I put the breakpoint there, which was after the 
first call into my DllMain).
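
The search described above can be sketched in C roughly as follows. This is only a reconstruction from the disassembly, not the runtime's actual source: the struct name, the field names, and the function names are hypothetical, and only the table size (400h) and the offsets 0x18, 0x1C and 0x20 come from the listing. The CloseHandle step is left as a comment so the sketch stays portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define THREAD_TABLE_SIZE 0x400 /* 1024 entries, from "cmp edx,400h" */

/* Hypothetical layout: only the three commented offsets are taken from
   the disassembly; everything before them is unknown. */
struct thread_entry {
    unsigned char unknown[0x18];
    uint32_t thread_id; /* +0x18, compared against the tid in ecx */
    uint32_t handle;    /* +0x1C, closed if non-zero */
    void *buffer;       /* +0x20, passed to the first free() */
};

/* Return the index of the entry owned by tid, or -1 if there is none.
   Null slots are skipped, matching the "test ebx,ebx" check. */
int find_thread_entry(struct thread_entry **table, uint32_t tid)
{
    assert(table != NULL);
    for (int i = 0; i < THREAD_TABLE_SIZE; i++) {
        if (table[i] != NULL && table[i]->thread_id == tid)
            return i;
    }
    return -1;
}

/* Remove the entry owned by tid. Returns 1 if one was found, else 0. */
int remove_thread_entry(struct thread_entry **table, uint32_t tid)
{
    int i = find_thread_entry(table, tid);
    if (i < 0)
        return 0;

    struct thread_entry *entry = table[i];
    table[i] = NULL;     /* the slot is cleared before anything is freed */
    free(entry->buffer); /* free(NULL) is a no-op, as seen in the trace */
    /* The real code calls CloseHandle(entry->handle) here if non-zero. */
    free(entry);         /* this is the call that crashes in the trace */
    return 1;
}
```

In this model the final free() is only safe if the entry pointer came from the same allocator, which is the assumption the crash seems to violate.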

So in summary: a number of threads (7 to 10) get successfully detached 
first, but none of them is in the table that removethreadtableentry 
searches. The first thread that is found in that table triggers the crash.

Finally, here's everything from the * to the call to free() (on a 
different run, so different addresses), with some values annotated:

//edx is 1, so it's the second entry in the table.
05C08F8F  mov         dword ptr [esp+8],edx
05C08F93  mov         ecx,dword ptr [esp+8]

//These set edx and esi to 0x05c2cd40.
05C08F97  mov         edx,dword ptr [___thdtbl (5C2DFBCh)]
05C08F9D  mov         esi,dword ptr [___thdtbl (5C2DFBCh)]

//ecx is 1, and ebx becomes 0x05c29b9b.
05C08FA3  mov         ebx,dword ptr [edx+ecx*4]
05C08FA6  mov         dword ptr [esi+ecx*4],0

//This pushes 0, and the call to free() does nothing.
05C08FAD  push        dword ptr [ebx+20h]
05C08FB0  call        _free (5C07118h)
05C08FB5  add         esp,4

//This is 0 and the CloseHandle call is skipped.
05C08FB8  cmp         dword ptr [ebx+1Ch],0
05C08FBC  je          __removethreadtableentry+63h (5C08FC7h)
05C08FBE  push        dword ptr [ebx+1Ch]
05C08FC1  call        dword ptr [__imp__CloseHandle@4 (5BC2B28h)]

//ebx is unchanged from above, and this call crashes.
05C08FC7  push        ebx
05C08FC8  call        _free (5C07118h)

I also stepped inside free(), and the next interesting stuff happens 
here (note I skipped free() itself and went straight to RTLMultiPool):

RTLMultiPool::Free:
05C0AC68  push        ecx
05C0AC69  cmp         dword ptr [esp+8],0
05C0AC6E  je          RTLMultiPool::Free+15h (5C0AC7Dh)
05C0AC70  mov         eax,dword ptr [esp+8]
//eax is now 0x05c29b9b, the pointer we're trying to free
05C0AC74  lea         edx,[eax-4]
//edx is now eax-4 = 0x05c29b97
05C0AC77  push        edx
05C0AC78  call        RTLMultiPool::SelectFree (5C0AC34h)
...
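
The "lea edx,[eax-4]" suggests that the allocator keeps a 4-byte header immediately before each pointer it hands out. A minimal sketch of that scheme in C is below, assuming the header holds the block size. That is a guess based on how SelectFree uses the value; the hdr_* names are hypothetical and are not part of the runtime.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocate size bytes and record size in a 4-byte header placed
   immediately before the pointer returned to the caller. */
void *hdr_alloc(uint32_t size)
{
    unsigned char *raw = malloc(sizeof(uint32_t) + size);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &size, sizeof size);
    return raw + sizeof(uint32_t);
}

/* Read the header back, the way a free routine would via ptr - 4. */
uint32_t hdr_size(const void *p)
{
    uint32_t size;
    assert(p != NULL);
    memcpy(&size, (const unsigned char *)p - sizeof(uint32_t), sizeof size);
    return size;
}

/* Release a block obtained from hdr_alloc. */
void hdr_free(void *p)
{
    if (p != NULL)
        free((unsigned char *)p - sizeof(uint32_t));
}
```

With a scheme like this, freeing a pointer that did not come from the matching allocation routine reads whatever happens to sit 4 bytes before it and treats that as the header.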

RTLMultiPool::SelectFree:
05C0AC34  push        ecx
//This reads 0x05c29b97 into eax
05C0AC35  mov         eax,dword ptr [esp+8]
//This reads a dword from where eax points; edx ends up 0
05C0AC39  mov         edx,dword ptr [eax]
05C0AC3B  push        ebx
05C0AC3C  push        esi
//Looking at ecx+4 revealed the value 0x00000080 (128)
05C0AC3D  cmp         edx,dword ptr [ecx+4]
05C0AC40  ja          RTLMultiPool::SelectFree+21h (5C0AC55h)
//So we get here
05C0AC42  lea         ebx,[edx-1]  	//ebx = 0xffffffff
05C0AC45  shr         ebx,3  		//ebx = 0x1fffffff
05C0AC48  push        eax
05C0AC49  mov         esi,dword ptr [ecx]  //esi = 0x0516000c
05C0AC4B  mov         ecx,dword ptr [esi+ebx*4]  //crash!

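I had supposed esi + 0x1fffffff*4 would be basically esi-4, but 
0x1fffffff*4 is only 0x7ffffffc, so the address read is 
0x0516000c + 0x7ffffffc = 0x85160008. That matches what we get: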

Unhandled exception at 0x05c0ac4b (myproject.dll) in wmplayer.exe: 
0xC0000005: Access violation reading location 0x85160008.
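
The faulting address can be reproduced with 32-bit arithmetic. The C sketch below models only the instructions shown in SelectFree, assuming the dword read through eax is a block size and the value at ecx+4 (0x80) is the pool threshold. The function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_THRESHOLD 0x80u /* value observed at [ecx+4] */

/* Mirrors "cmp edx,[ecx+4] / ja": sizes above the threshold go to the
   main heap, everything else (including 0) takes the pool path. */
int uses_pool(uint32_t size)
{
    return size <= POOL_THRESHOLD;
}

/* Mirrors "lea ebx,[edx-1] / shr ebx,3 / mov ecx,[esi+ebx*4]":
   returns the address the pool path reads from. */
uint32_t pool_slot_address(uint32_t table_base, uint32_t size)
{
    assert(uses_pool(size));
    uint32_t index = (size - 1u) >> 3; /* size 0 wraps to 0x1fffffff */
    return table_base + index * 4u;
}
```

With table_base = 0x0516000c and size = 0 this gives 0x85160008, the address in the access violation. A size of 0 is not above the threshold, so it takes the pool path and then underflows to a huge index.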

//Here's the rest of SelectFree FWIW.
05C0AC4E  call        RTLPool::Free (5C0D460h)
05C0AC53  jmp         RTLMultiPool::SelectFree+2Dh (5C0AC61h)
05C0AC55  mov         ecx,dword ptr [RTLHeap::pMainHeap (5C2B4FCh)]
05C0AC5B  push        eax
05C0AC5C  call        RTLHeap::Free (5C0D6B4h)
05C0AC61  pop         esi
05C0AC62  pop         ebx
05C0AC63  pop         eax
05C0AC64  ret         4
05C0AC67  int         3

