DLL symbol identity

Benjamin Thaut via Digitalmars-d digitalmars-d at puremagic.com
Thu May 7 22:25:59 PDT 2015


To implement shared libraries at the operating system level, 
generally two steps have to be taken:

1) Locate which shared library provides a required symbol
2) Load that library and retrieve the final address of the symbol

Linux does both of those steps at program start up time. As a 
result all symbols have identity. If a symbol appears in 
multiple shared libraries, only one will be used (first come, 
first served) and the rest will remain unused.

Windows does step 1) at link time (through so-called import 
libraries) and step 2) at program start up time. This means that 
symbols don't have identity. If different shared libraries 
provide the same symbol, it may exist multiple times and multiple 
instances might be in use.
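
To make the difference concrete, here is a toy model of the two 
lookup strategies. It is only a sketch: Library, resolveGlobal and 
resolveBound are names made up for this illustration, and real 
loaders work on ELF / PE data structures rather than on 
associative arrays.

```d
// Toy model: a "library" maps exported symbol names to addresses.
// All names here are hypothetical and exist only for illustration.
struct Library
{
    string name;
    size_t[string] exports;
}

// Linux-style lookup: search every loaded library in load order.
// The first definition wins, so each symbol has one address.
size_t resolveGlobal(const Library[] loadOrder, string symbol)
{
    foreach (ref lib; loadOrder)
    {
        if (auto p = symbol in lib.exports)
            return *p;
    }
    return 0; // unresolved
}

// Windows-style lookup: the import library already decided at link
// time which DLL provides the symbol, so only that DLL is searched.
size_t resolveBound(const Library[] loaded, string dllName, string symbol)
{
    foreach (ref lib; loaded)
    {
        if (lib.name != dllName)
            continue;
        if (auto p = symbol in lib.exports)
            return *p;
    }
    return 0; // unresolved
}
```

With two libraries that both export the same symbol, resolveGlobal 
always yields the address from the first one, while resolveBound 
yields a different address depending on which DLL the importer was 
linked against.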

Why is this important for D?
D uses symbol identity in a few places, usually through the 'is' 
operator. The most notable case is type info objects.

bool checkIfSomeClass(Object o)
{
   return typeid(o) is typeid(SomeClass);
}

The everyday D user relies on this behavior, usually when doing 
dynamic casts:
Object o = ...;
SomeClass c = cast(SomeClass)o;
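
To show why such a cast depends on identity, here is a simplified 
sketch of an identity-based check. It is not the actual druntime 
implementation: isInstanceOf, Base and Derived are names made up 
for this illustration, and interfaces are ignored entirely.

```d
// Hypothetical classes used only for this illustration.
class Base {}
class Derived : Base {}

// Walk the class hierarchy of 'o' and compare each ClassInfo with
// 'target' by identity ('is'). If 'target' exists a second time in
// another DLL, this comparison fails even for a matching type.
bool isInstanceOf(Object o, const TypeInfo_Class target)
{
    if (o is null)
        return false;
    for (TypeInfo_Class ci = typeid(o); ci !is null; ci = ci.base)
    {
        if (ci is target)
            return true;
    }
    return false;
}
```

For example, isInstanceOf(new Derived, typeid(Base)) holds within a 
single binary, because there is exactly one ClassInfo for Base.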

So if symbols don't have identity, all places within druntime and 
phobos which rely on symbol identity have to be identified and 
changed to make them work with Windows DLLs. I'm currently at a 
point in my Windows DLL implementation where I have to decide how 
to solve this issue. There are two options.

Option 1)
Leave as is, symbols won't have identity.

Con:
- It has a performance impact, because to make casts and other 
features which rely on type info objects work, we will have to 
fall back to string comparisons on Windows.
- All places within druntime and phobos which use symbol identity 
have to be found and fixed. This is a lot of work and might 
produce many bugs.
- Library writers have to consider this problem every time they 
extend / modify druntime / phobos.
- There are going to be tons of threads on D.learn about "Why 
does this not work in a DLL?"

Pro:
- It's the plain Windows shared library mechanism in all its 
ugliness.
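
The string comparison fallback mentioned under Option 1 could look 
roughly like the following. This is a minimal sketch, assuming that 
the fully qualified class name is enough to tell types apart; 
sameClass is a name made up for this illustration.

```d
// Compare two ClassInfo objects without relying on identity alone.
// 'sameClass' is a hypothetical helper, not an existing druntime
// function.
bool sameClass(const TypeInfo_Class a, const TypeInfo_Class b)
{
    // Fast path: same object, as in the single binary case.
    if (a is b)
        return true;
    if (a is null || b is null)
        return false;
    // Slow path: duplicated type info from another DLL has a
    // different address but the same fully qualified name.
    return a.name == b.name;
}
```

The slow path is where the run time cost comes from: every failed 
identity check turns into a string comparison.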

Option 2)
Windows already generates an indirection table we could patch. 
Rebind the symbols at program start up time, overwriting the 
results of the Windows program loader. This essentially 
reproduces the behavior of Linux with code in druntime.

Pro:
- Symbols would have identity.
- Everything would behave the same way as on Linux.
- No run time performance impact.

Con:
- Performance impact at program start up time.
- Might increase the binary size (I'm not entirely sure yet if I 
can read all required information out of the binary itself or if 
I have to add more myself).
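
The rebinding step of Option 2 can be pictured with another toy 
model. It is only a sketch under heavy assumptions: Image, 
ImportSlot and rebindImports are names made up for this 
illustration, and the real work would happen on the import tables 
of the loaded PE images rather than on D arrays.

```d
// Toy model of an import table entry: which symbol it refers to
// and which address the Windows loader wrote into it.
struct ImportSlot
{
    string symbol;
    size_t address;
}

// Toy model of a loaded executable or DLL (hypothetical type).
struct Image
{
    string name;
    size_t[string] exports;
    ImportSlot[] imports;
}

// Pick one canonical definition per symbol (first image in load
// order wins) and overwrite every import slot with that address.
void rebindImports(Image[] images)
{
    size_t[string] canonical;
    foreach (ref img; images)
    {
        foreach (sym, addr; img.exports)
        {
            if (sym !in canonical)
                canonical[sym] = addr;
        }
    }
    foreach (ref img; images)
    {
        foreach (ref slot; img.imports)
        {
            if (auto p = slot.symbol in canonical)
                slot.address = *p;
        }
    }
}
```

After rebindImports has run, every image refers to the same 
address for a given symbol, which is what restores identity. The 
two loops are also where the start up cost listed above comes from.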



I personally would prefer option 2 because it would be easier to 
use and wouldn't cause lots of additional maintenance effort.

Any opinions on this? As both options would be quite some work, I 
don't want to start blindly with one and risk it being 
rejected later in the PR.

Kind Regards
Benjamin Thaut

