DLL symbol identity
Benjamin Thaut via Digitalmars-d
digitalmars-d at puremagic.com
Thu May 7 22:25:59 PDT 2015
To implement shared libraries at the operating system level,
generally two steps have to be taken:
1) Locate which shared library provides a required symbol
2) Load that library and retrieve the final address of the symbol
Linux does both of those steps at program start up time. As a
result all symbols have identity. If a symbol appears in
multiple shared libraries only one will be used (first come,
first served) and the rest will remain unused.
Windows does step 1) at link time (through so-called import
libraries) and step 2) at program start up time. This means that
symbols don't have identity. If different shared libraries
provide the same symbol it may exist multiple times and multiple
instances might be in use.
Why is this important for D?
D uses symbol identity in a few places, usually through the 'is'
operator. The most notable case is type info objects.
bool checkIfSomeClass(Object o)
{
    return typeid(o) is typeid(SomeClass);
}
The everyday D user relies on this behavior, usually when doing
dynamic casts.
Object o = ...;
SomeClass c = cast(SomeClass)o;
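As a minimal, self-contained sketch of the two snippets above: the
classes Base and SomeClass are hypothetical and exist only for this
illustration. Within a single binary there is exactly one type info
object per class, so the identity check behaves as expected. It can
be tried with something like `dmd -unittest -main -run file.d`.

```d
// Hypothetical classes, used only for this illustration.
class Base {}
class SomeClass : Base {}

// Identity check from the text: true only if the dynamic type of o
// is exactly SomeClass (an instance of a derived class would not match).
bool checkIfSomeClass(Object o)
{
    return typeid(o) is typeid(SomeClass);
}

unittest
{
    Object a = new SomeClass;
    Object b = new Base;

    assert(checkIfSomeClass(a));
    assert(!checkIfSomeClass(b));

    // A dynamic cast yields null when the object is not a SomeClass.
    assert(cast(SomeClass) a !is null);
    assert(cast(SomeClass) b is null);
}
```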
So if symbols don't have identity, all places within druntime and
Phobos which rely on symbol identity have to be identified and
changed to make it work with Windows DLLs. I'm currently at a
point in my Windows DLL implementation where I have to decide how
to solve this issue. There are two options now.
Option 1)
Leave as is, symbols won't have identity.
Con:
- It has a performance impact, because to make casts and other
features which rely on type info objects work, we will have to
fall back to string comparisons on Windows.
- All places within druntime and Phobos which use symbol identity
have to be found and fixed. This is a lot of work and might
produce many bugs.
- Library writers have to consider this problem every time they
extend / modify druntime / Phobos.
- There are going to be tons of threads on D.learn about "Why
does this not work in a DLL"
Pro:
- It's the plain Windows shared library mechanism in all its
ugliness.
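To make the string comparison fallback of option 1 concrete, here is
a rough sketch. The helper names sameClass and isInstanceOf and the
classes Animal and Dog are hypothetical; this is not the actual
druntime code. It only assumes the `name` and `base` fields of
TypeInfo_Class, and it ignores interfaces for brevity.

```d
// Hypothetical classes, used only for this illustration.
class Animal {}
class Dog : Animal {}

// Two type info objects describe the same class if they are the same
// object (fast path) or carry the same fully qualified name (slow path
// that would be needed when a DLL has its own copy of the type info).
bool sameClass(const TypeInfo_Class a, const TypeInfo_Class b)
{
    return a is b || a.name == b.name;
}

// Walks the base class chain of o and compares each step by name.
// Interfaces are not handled in this sketch.
bool isInstanceOf(Object o, const TypeInfo_Class target)
{
    if (o is null)
        return false;
    for (TypeInfo_Class c = typeid(o); c !is null; c = c.base)
    {
        if (sameClass(c, target))
            return true;
    }
    return false;
}

unittest
{
    assert(isInstanceOf(new Dog, typeid(Animal)));
    assert(isInstanceOf(new Dog, typeid(Dog)));
    assert(!isInstanceOf(new Animal, typeid(Dog)));
}
```

The slow path is where the performance cost comes from: every step
of the chain may need a string comparison instead of a single
pointer comparison.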
Option 2)
Windows already generates an indirection table we could patch.
Rebind the symbols at program start up time, overwriting the
results of the Windows program loader, essentially reproducing
the behavior of Linux with code in druntime.
Pro:
- Symbols would have identity.
- Everything would behave the same way as on Linux.
- No run time performance impact.
Con:
- Performance impact at program start up time.
- Might increase the binary size (I'm not entirely sure yet if I
can read all required information out of the binary itself or if
I have to add more myself)
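The rebinding idea of option 2 can be modelled roughly as follows.
This is a simulation only: ImportSlot and rebindImports are
hypothetical names, and the arrays stand in for the per-module
indirection tables. It does not touch any real Windows data
structures and says nothing about how druntime would locate them.

```d
// Hypothetical model of one entry in a module's indirection table.
struct ImportSlot
{
    string symbol;  // symbol name
    void* address;  // address the loader bound it to
}

// First come, first served: the first address seen for a symbol
// becomes the canonical one, and every later slot for the same
// symbol is overwritten with it.
void rebindImports(ImportSlot[][] modules)
{
    void*[string] canonical;
    foreach (slots; modules)
    {
        foreach (ref slot; slots)
        {
            if (auto p = slot.symbol in canonical)
                slot.address = *p;                     // patch duplicate
            else
                canonical[slot.symbol] = slot.address; // first definition wins
        }
    }
}

unittest
{
    int x, y, z; // stand-ins for symbol instances in different DLLs
    auto modA = [ImportSlot("SomeClass.typeinfo", &x)];
    auto modB = [ImportSlot("SomeClass.typeinfo", &y),
                 ImportSlot("other", &z)];

    rebindImports([modA, modB]);

    // Both modules now refer to the same instance.
    assert(modB[0].address == modA[0].address);
    assert(modB[1].address == cast(void*) &z);
}
```

The single pass over all slots is the start up cost mentioned above;
after it, identity comparisons work without any run time overhead.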
I personally would prefer option 2 because it would be easier to
use and wouldn't cause lots of additional maintenance effort.
Any opinions on this? As both options would be quite some work, I
don't want to start blindly with one and risk it being
rejected later in the PR.
Kind Regards
Benjamin Thaut