Big picture on shared libraries when they go wrong, how?
Richard Andrew Cattermole (Rikki)
richard at cattermole.co.nz
Mon May 6 03:28:47 UTC 2024
This post is meant to be a highly enlightening and entertaining
explanation (or should I say it shouldn't cure anyone's insomnia)
of just how many things can go wrong with shared libraries if
they are not worked with right regardless of platform.
Now I know this is an utter wall of text, but if you want to work
with shared libraries you probably should read all of this. It'll
get you up to speed on the theory of using them by preventing a
repeat of my experiences, no war stories for you!
If you have inside knowledge of how shared libraries work, please
expand upon this in the comments, perhaps we can get an article
out of it for the site.
Some of the advice in this article may go against your previous
experiences working with shared libraries. The recommendations
here exist because the alternatives have been seen to be problematic
in a large portion of support requests over a two year period.
If you understand what you are doing, you can of course disregard
a particular piece, and you may want to expand upon or refine the
information given here so that we can create a
great overview of the subject for future programmers to learn
from!
Latest copy can be found
[here](https://gist.github.com/rikkimax/c2b501e64e3cca6d59343a286e5466df).
## Glossary
Before we begin to get into actual content we should probably
cover some basic terms.
- Binary (within a process, can be known as an image or module):
An executable or shared library.
- Static library: An archive containing one or more object files.
- Shared library: A reusable and multi-loadable binary that
typically does not contain an entry point function.
- Out of binary: A symbol that does not exist in the current
(compiling/linking) binary.
- Visibility override switch: A compiler switch that changes the
default symbol mode of symbols, unless stated otherwise.
- DllImport override switch: A compiler switch that changes the
default symbol mode for symbols that are external, unless stated
otherwise.
- Silo'd: a library that is unaware of other instances of itself
(my own definition for the usage of this article).
- Isolated: a library that is sandboxed so that no resources can
cross into other code (my own definition for the usage of this
article).
## Table of Contents
- Common Mistakes
> Not asking for help in understanding the theory behind shared
libraries, linking and loading in general is going to lead to
failure for your project. No matter how good you are with this
stuff, help will be needed at some point.
- Things That are Not Covered
> Not everything has been described here that can impact shared
libraries usage in D. It is not a tutorial, but a reference for
before you start using them.
- Is a Dynamic Link Library a Shared Library?
> Yes, but they make it easy to think otherwise!
- Import Libraries are Special Yes?
> There is nothing special about import libraries, don't export
global variables, oh and you should probably just link against a
DLL dynamically!
- Symbol Modes Make Ya Go Mad!
> When dealing with shared libraries there are three modes a
symbol can be in: ``Internal``, ``DllImport`` and ``DllExport``.
Getting these right is the core problem; getting them wrong
results in both linkage failures and runtime errors.
- Not Everything Should Be Exported
> Just because something can be exported, doesn't mean it
should be, i.e. TLS.
- Symbol, What Symbol?
> Current language is not very helpful with any generated
symbols and this can lead to program corruption.
- Knowing When to DllImport
> Current solutions are too broad, inconsistent and will
outright result in linker errors without any compiler assistance.
They outright prevent intermediary usage of static libraries and
object files without issues arising.
- Why Not Intermediary Static Libraries?
> A static library does not fully get included, eliding FTW!
Use object files for intermediaries rather than static libraries
for anything that gets exported.
- It is Loaded, Works Yes?
> Just because it linked, doesn't mean it'll load even with the
right dependencies, and the behavior of loaders is not consistent
between platforms.
- Unloading
> To keep your sanity, don't unload a shared library unless your
process is dying.
- Initializing Your Shared Library
> A shared library that allows you to borrow resources it owns,
and borrows from another is full of failure modes that may not be
avoidable.
- TLS Hooking
> Only Windows offers hooking of threads, which supports zero or
more ``DllMain``'s and for druntime should be automatically
injected.
- Scenario: Your Own Memory Allocator
> The order of deinitialization can matter between sibling
shared libraries, if you can avoid letting a sibling shared
library borrow resources from you, you should avoid it.
- Scenario: Your Own Threads
> If you're going to do your own threads, don't forget to
register them with druntime and handle cyclic registration to and
from.
- Where Is Thy Runtime?
> Did you follow my advice in ``Unloading``, no? Well good luck
with that. If you have a runtime loaded don't have duplicates of
it, stick to a single shared library build of it.
- Who Needs a Scope Anyway?
> Go ahead be smart! Don't use shared libraries or static
libraries, go import only! See how quickly you kill off that
scope that depends on having state.
## Common Mistakes
TLDR: Not asking for help in understanding the theory behind
shared libraries, linking and loading in general is going to lead
to failure for your project. No matter how good you are with this
stuff, help will be needed at some point.
I would write a lot more here, but currently the language and the
tooling simply do not assist you in getting what you need sent
to the linker.
- You cannot tell the compiler that a module is not in your
binary. See: ``Knowing When to DllImport``. My DIP fixes this.
- You cannot tell the compiler that something is private,
actually needs to be exported and have it work correctly
(``export`` is currently a visibility modifier). See Atila's
DConf 2023 talk [``You're Writing D Wrong--Átila
Neves``](https://www.youtube.com/watch?v=Rm_8Hpex68s) as to why
this is very worrying that we cannot do it currently. This is
something my DIP resolves.
- If you are able to tell the compiler that a type needs to be
exported, it will not export the things it generates, so it will
not work anyway. See: ``Symbol, What Symbol?``. Another thing my
DIP fixes.
- If it does work, it's going to cause silent program corruption.
See: ``Symbol, What Symbol?``.
In general if you're going to work with shared libraries, you
will likely run into situations where you need help. Buying,
reading and learning from [Linkers &
Loaders](https://www.amazon.com.au/Linkers-Loaders-John-Levine/dp/1558604960) is not going to be enough to get you to a successful outcome.
## Things That are Not Covered
TLDR: Not everything has been described here that can impact
shared libraries usage in D. It is not a tutorial, but a
reference for before you start using them.
- No D code with build file examples
- Exceptions
- Template instantiations that cross the shared library boundary
## Is a Dynamic Link Library a Shared Library?
TLDR: Yes, but they make it easy to think otherwise!
So let's start with something simple: "a Dynamic Link Library
(DLL) is not a shared library". This is not an accurate statement,
as a DLL fills the role on Windows that a shared library does on
non-Windows systems. The claim comes up in a few places
such as [Windows System Programming 3rd edition pg.
150](https://www.amazon.com/Windows-System-Programming-Johnson-Hart/dp/0321256190), [documentation for GetFullPathNameA](https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getfullpathnamea), and [an answer on stack overflow](https://stackoverflow.com/a/62517860).
The shared library model is notable because of the reusable
nature of a binary that the OS loader can merge into your
process. Either during initial load started by the kernel or
during execution of your program at your request.
Of note is that the binaries that make up a process (executable
and shared libraries) are _not_ isolated. They are **_merged_**.
Once merged, the only thing preventing exposure of one to another
is the symbol table that the kernel keeps for each binary, which
is used for patching.
In another section ``Where Is Thy Runtime`` I describe a library
that is silo'd, this just means it does not know about other
things in the process. Isolation on the other hand would refer to
sandboxing which as far as I am aware no OS does.
Okay so how is that entertaining? Great question, due to the
indirection introduced by DLL's it can appear that they are in
fact isolated which can lead to quite some interesting moments!
## Import Libraries are Special Yes?
TLDR: There is nothing special about import libraries, don't
export global variables, oh and you should probably just link
against a DLL dynamically!
Whenever you link a binary you may have noted a corresponding
file has been created along with it. This is an import library,
it was generated by the linker when it saw that you exported
something. These are quite informational, they tell you what
symbols were exported, but more importantly they tell a future
linker invocation about them too!
Not all platforms use these files; others such as Linux rely
solely on what is in the binary to provide this information.
Windows utilizes both import libraries and information in the
shared library to map symbols, which works great for their
commercially concerned OS!
So what are import libraries? Some custom format or other
horrendous thing to never learn about?
No! In fact they are just regular static libraries! If you can
emit a static library you can probably create your own without
much work.
The two main things of interest that they contain are the extern
symbols that have ``_imp`` prefixed to their name, and wrappers
around these symbols that do a simple jump (or similar) to
what is pointed at: ``jmp [_imp_symbol];``. The wrappers are
generated with the original symbol name (without the ``_imp``).
Those generated wrappers are why the druntime bindings to WinAPI
currently work, without ``DllImport`` support being cleanly
defined and in active use by the language!
This has another interesting tidbit, you should **_only_** have
the ability to export functions, not global variables. You can
see this in [Microsoft's
libc](https://learn.microsoft.com/en-us/cpp/c-runtime-library/errno-doserrno-sys-errlist-and-sys-nerr?view=msvc-170), where what looks like a global variable is a macro wrapping a function call.
What is great about this is in practice there is no difference
between linking against a shared library statically (using
linker) or loading dynamically (using loader yourself). Either
way you're dealing with an indirection of using a global pointer!
So if you're ever asking yourself if you should statically or
dynamically link against a shared library on Windows, you should
probably link dynamically unless you're distributing the end
binary as it makes no difference when using a symbol.
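To make the "loading dynamically" half concrete, here is a minimal sketch of using the loader yourself. It is only an illustration under some assumptions: the names ``StrlenFn`` and ``loadStrlen`` are made up for this example, it assumes ``msvcrt.dll`` is available on Windows, and on POSIX it assumes libc is already loaded into the process so that a null path to ``dlopen`` can find ``strlen``. Either way, what you end up holding is a pointer that you call through.

```d
// Function pointer type matching C's strlen.
alias StrlenFn = extern (C) size_t function(const(char)*) nothrow @nogc;

// Hypothetical helper: asks the OS loader for strlen at runtime.
StrlenFn loadStrlen()
{
    version (Windows)
    {
        import core.sys.windows.windows : GetProcAddress, LoadLibraryA;

        // Assumption: msvcrt.dll can be found by the loader.
        auto lib = LoadLibraryA("msvcrt.dll");
        if (lib is null)
            return null;
        return cast(StrlenFn) GetProcAddress(lib, "strlen");
    }
    else version (Posix)
    {
        import core.sys.posix.dlfcn : dlopen, dlsym, RTLD_NOW;

        // A null path asks for a handle to the already loaded program
        // and its dependencies, which is assumed to include libc.
        auto lib = dlopen(null, RTLD_NOW);
        if (lib is null)
            return null;
        return cast(StrlenFn) dlsym(lib, "strlen");
    }
    else
        static assert(0, "unsupported platform in this sketch");
}

void main()
{
    auto strlenPtr = loadStrlen();
    assert(strlenPtr !is null);

    // Every call goes through the pointer, the same kind of
    // indirection an import library thunk gives you.
    assert(strlenPtr("shared") == 6);
}
```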
## Symbol Modes Make Ya Go Mad!
TLDR: When dealing with shared libraries there are three modes a
symbol can be in: ``Internal``, ``DllImport`` and ``DllExport``.
Getting these right is the core problem; getting them wrong
results in both linkage failures and runtime errors.
In the traditionally applied (POSIX) shared library model, the
only symbol modes relevant to discussion are internal versus
external. An external symbol is one not defined in a given
binary, and internal is found within. However just because a
symbol is internal does not mean it has its symbol name known or
accessible to other binaries to link against.
Along came Windows DLL's and we no longer use internal versus
external terminology with shared libraries although it is still
relevant to object files and it is how linkers and loaders still
operate at the lowest level even if we are no longer operating
solely within it. Now we use ``Internal``, ``DllImport`` and
``DllExport`` regardless of the platform.
- An Internal symbol is a symbol that is found in a binary that
is not directly accessible by name externally to that binary.
- A ``DllImport`` symbol is a symbol that is not found in the
current binary and is external to it. For Windows specifically
this refers to the symbol having indirection via a global pointer
to the internal symbol. See ``_imp`` prefixed symbols in import
libraries heading above.
- A ``DllExport`` symbol is an internal symbol that has an
exportation linker flag applied to it. Traditionally this will
expose the symbol name for the symbol. For Windows it will hide
the internal symbol and instead expose a new global variable
which is a pointer, using the name with the prefix ``_imp``, that
points to the internal symbol.
Each platform has its own tunings to the shared library model.
OSX and Linux may both be POSIX, but they each have their
own behaviors that are not necessarily POSIX compliant.
LLVM has some explanations for these modes, there are many others
they support although they are not relevant to this document. For
[internal](https://llvm.org/docs/LangRef.html#linkage-types), and
for
[DllImport/DllExport](https://llvm.org/docs/LangRef.html#dll-storage-classes).
Symbol modes are the heart and soul of the majority of issues
relating to shared library support in the language. Most
specifically what should be exported automatically, and when do
we apply ``DllImport`` instead of Internal.
### Not Everything Should Be Exported
TLDR: Just because something can be exported, doesn't mean it
should be, i.e. TLS.
The vast majority of errors with user written (not compiler
generated) symbols are due to the symbol modes ``DllImport`` and
``Internal`` being mixed up. But sometimes ``DllExport`` can
cause issues for both generated and user written symbols.
According to [Ulrich
Drepper](https://www.akkadia.org/drepper/dsohowto.pdf) and at
least one other [Stack overflow
user](https://stackoverflow.com/a/32701238) C
constructors/destructors on Linux do not need to be exported.
Since it is not required to be exported, exporting can only
invite problems when it is done unnecessarily. See the [bug
ticket](https://issues.dlang.org/show_bug.cgi?id=24536) to track
disallowing exportation of functions marked as such.
Alternatively another set of issues can be seen with generated
symbols such as ``ModuleInfo`` or ``TypeInfo``. By not exporting
``ModuleInfo`` and assuming it is available the compiler
introduces a hidden dependency on a generated symbol that may not
exist.
This is a bit of a problem with shared libraries. Especially when a
D file could actually be a binding to a C library (like Deimos).
See these two tracking issues for ``ModuleInfo`` exportation
problems [Export
ModuleInfo](https://issues.dlang.org/show_bug.cgi?id=231770) and
[Remove
dependency](https://issues.dlang.org/show_bug.cgi?id=23974).
Unfortunately the removal of the dependency can only work
correctly if you know that the module is out of binary or you end
up with fun situations where a dependency module does not
initialize before you try to access it.
See ``Why Not Intermediary Static Libraries?`` for an explanation
on why a static library should not contain exports.
Thread local variables (TLS), Fiber local variables (FLS) are
examples of specialty global variables that should never be
exported. The scheme used for each depends on the platform and
can change over time (Android has recently changed its TLS scheme
for instance).
The global itself could be a key into some sort of map that the
operating system provides, or the whole thing could be emulated
by the toolchain. The creation of the key into the map may be done
by user code, as done with
[pthread](https://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html) and [Win32](https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-tlsalloc) which has explicit mention that the handle may not cross the DLL boundary.
Instead of exporting a TLS variable you can wrap the access to
the storage pointer by a function that returns it. This should be
done automatically by the compiler or disallowed.
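A minimal sketch of that wrapper approach, written by hand since the compiler does not do it for you. The names ``requestCount`` and ``requestCountPtr`` are hypothetical, and for the sake of being runnable this lives in a single executable rather than a real shared library: the thread local variable stays internal, and only the accessor function is exported.

```d
import core.thread : Thread;

// Module level variables are thread local by default in D.
// This one is deliberately NOT exported.
private int requestCount;

// Exported accessor: each calling thread gets the address of its
// own copy, so no TLS symbol has to cross the binary boundary.
export extern (C) int* requestCountPtr() nothrow @nogc
{
    return &requestCount;
}

void main()
{
    *requestCountPtr() = 5;

    // A second thread sees its own, freshly initialized copy.
    int seenByOther = -1;
    auto other = new Thread({ seenByOther = *requestCountPtr(); });
    other.start();
    other.join();

    assert(*requestCountPtr() == 5);
    assert(seenByOther == 0);
}
```

The pointer that comes back is only meaningful on the thread that asked for it, so callers should not stash it somewhere another thread can reach.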
## Symbol, What Symbol?
TLDR: Current language is not very helpful with any generated
symbols and this can lead to program corruption.
So you've got yourself a fancy pants type and you've done
everything right. Exported all the symbols that don't get
exported automatically (that the compiler is supposed to
export for you since you can't in language), annotated the type
and its methods with export but... you get a segfault
when you use it. What would you do?
I have had to deal with this very situation before multiple times
when the D code looks like this:
```d
MyType var;
var = MyType(...);
```
It looks like it should be working fine! The segfault isn't even
in this function!!! How is this code buggy? Well you won't
believe this... but that variable initialization, didn't
initialize.
See dmd is rather "_helpful_" even though it didn't know that the
``.init`` symbol is in ``DllImport`` mode rather than
``Internal``, and because of the way the codegen works it still
linked and didn't cause any memory corruption!
So when the copy from the ``.init`` symbol to the stack occurs it
sees a zero length, and it thinks I'm done! Wahoo, I did the
thing. Except it didn't do the thing. In fact it did zero of the
things it was meant to do.
What you end up with is a variable with junk left over stack data
which can be pretty much anything. This shows up very easily when
you are dealing with library based reference counting, due to the
atomic alignment check. Not a fun time to be had.
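To see why the ``.init`` symbol matters at all, here is a small sketch with a hypothetical ``MyType`` that has a non-zero default. It stays within one binary, so it works; it only illustrates that default initialization depends on an initializer image whose length must match the type. How a given compiler actually emits the copy is an implementation detail I am not asserting here.

```d
// Hypothetical type for illustration only.
struct MyType
{
    int refCount = 1; // non-zero default, so an initializer image is needed
    void* payload;
}

void main()
{
    // Conceptually: copy MyType.sizeof bytes from the init image.
    MyType var;
    assert(var.refCount == 1);

    // druntime exposes the same image through TypeInfo.
    const(void)[] image = typeid(MyType).initializer();
    assert(image.length == MyType.sizeof);
    assert(image.ptr !is null); // null would mean "all zeros"
}
```

If the length seen at the copy site were zero, as in the failure described above, nothing would be copied and ``refCount`` would be whatever was left on the stack.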
This shows us how important it is to export symbols generated
from a type automatically when other symbols have been explicitly
exported. D has a lot of house keeping symbols that get
generated, including ``opCmp``! All of these must be handled for
you, or it hasn't got a chance to work and there will be a lot of
distractions requiring a significant amount of debugging to
resolve.
## Knowing When to DllImport
TLDR: Current solutions are too broad, inconsistent and will
outright result in linker errors without any compiler assistance.
They outright prevent intermediary usage of static libraries and
object files without issues arising.
So we've so far covered how the compiler needs to assist with
exportation automatically and that you must have a way to put a
symbol into ``DllExport`` mode, but we still have to cover
``DllImport``, and what the compiler can do to assist you.
Nothing. It cannot help you. It will get it wrong, things will
not link.
So it is fully on you to put symbols into ``DllImport`` mode, and
that right there is the giant problem, how do you do this?
Well you can start with the ``dllimport`` override switch that
ldc has introduced. But you are limited to either system
libraries like druntime and phobos, or every shared library.
There is no finer grained solution as part of CLI switches
currently.
If you do it in code, now suddenly you have to maintain both an
interface file and the source file. Oh did I mention that the
compiler can't help here either? Yeah... the D interface
generator has no knowledge of if you want the resulting file to
be used for a static library or shared library. Even if it was
going to work, it isn't going to work for you today.
So you have got to annotate per symbol that it is in
``DllImport`` mode. In my DIP for exportation I changed this to
have the consistent syntax of ``export`` with ``extern`` and this
applies to all symbols.
Still this isn't a good enough situation, doesn't help build
managers and certainly is a major pain, obviously nobody is going
to do this manually if they have a choice.
While it is great to have a fine grained solution (including
conditionally) for setting ``DllImport`` mode, this shouldn't be
your primary way of setting up the symbol modes.
There is an alternative that works great as a story for both
build managers and for people who don't know anything about _why_
it exists!
The external import path switch ``-extI`` is a switch I have
proposed, similar to ``-I``. If you understand the import path
switch you can understand that the external import switch is just
for modules found in a shared library. Easy swap!
From a compiler perspective it knows that any module found via
an external import switch lives in another binary, and that any
module found via the regular import switch lives in the currently
compiling binary!
This enables it to switch any found ``DllExport`` symbols to
``DllImport`` without any action on each symbol by the
programmer. How wonderful!
But what if we didn't annotate with ``export`` and instead used
the visibility override switch to set exportation? Well, use the
``dllimport`` override switch to apply to all symbols found from
an external module. Great, more compiler assistance with minimal
changes!
But why not just use the override switches, isn't this good
enough? No, no it is not. It's too broad.
Without the ability to pick which modules are out of binary,
versus being linked into the current binary you get [linker
warnings](https://learn.microsoft.com/en-us/cpp/error-messages/tool-errors/linker-tools-warning-lnk4286?view=msvc-170) and they exist because you are outright doing the wrong thing by adding extra indirection (which may not have been enabled, due to a missing or different setting of the visibility override switch).
This has the unfortunate casualty of no static library or object
file intermediaries without causing problems.
## Why Not Intermediary Static Libraries?
TLDR: A static library does not fully get included, eliding FTW!
Use object files for intermediaries rather than static libraries
for anything that gets exported.
So you've been a good programmer, split up your code base so that
there are intermediary compilation steps to enable faster rebuild
times and proper scoping of project work. Nothing could go wrong
with that when it comes to shared libraries right? Right???
Oh how naive you are! There is so much wrong with this that
you're going to rethink everything you have ever done.
So linkers don't just include a static library whole. By default
a linker only includes an object file from it if something
references that object file. Great for when you are building
executables, not so great when you are constructing a shared
library from static libraries containing exports that do not get
pulled in by anything.
Unfortunately while there is a way to [force
it](https://learn.microsoft.com/en-us/cpp/build/reference/wholearchive-include-all-library-object-files?view=msvc-170), you need to know the static library's name and it can be a bit buggy depending on the linker in question. The only reasonable solution to this is to use object files, which do not get elided.
According to Adam Wilson, the recommendation from Microsoft
internally is to not export from static libraries and this makes
sense given the above issues. So while you can use a static
library to contribute towards your shared library, it should not
be providing any exported symbols.
This is problematic with dub, as it does not support object files
currently. See this
[ticket](https://github.com/dlang/dub/issues/2633) for a
potential redesign of how dub works with target types.
You should also be aware that with both of the override switches
(``visibility`` and ``dllimport``) you will not have fine
grained control over exports in a static library versus object
files in dub today based upon the (sub)package. There are
multiple things that will need to be done to enable people to
prevent running afoul of these recommendations whilst still
enabling full control.
To further complicate matters, if you want to fully isolate a
static library neither dub nor the compiler can assist you (by
using the .di generator). This will require further research to
enable this advice of not exporting from static libraries to be
automatically applied with minimal intervention by the programmer.
## It is Loaded, Works Yes?
TLDR: Just because it linked, doesn't mean it'll load even with
the right dependencies, and the behavior of loaders is not
consistent between platforms.
So you have successfully compiled and linked. Symbols that were
supposed to be exported were, and those that weren't weren't. So
it will work now yes? YES?
NOPE. We are not done yet.
Now we gotta talk about loading of shared libraries and ensuring
their state is valid.
But where does a loader look for a shared library to load? First
place is system directories which of course depends upon your
system configuration.
On POSIX systems the loader uses some environment variables to
determine auxiliary locations. It also looks in a special string
within a binary (executable and shared libraries) called ``RPATH``,
however keep in mind this will be carried with the binary no matter
where it's called from or by whom.
On
[Windows](https://learn.microsoft.com/en-us/windows/win32/dlls/dynamic-link-library-search-order) and [OSX](https://developer.apple.com/library/archive/documentation/DeveloperTools/Conceptual/DynamicLibraries/100-Articles/UsingDynamicLibraries.html) it'll look in the current working directory by default too, not just system directories or the ``PATH`` variable.
Windows does support some
[customization](https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-setdlldirectorya?source=recommendations) intended for launchers, which allows setting up some additional paths at runtime.
So much variety in behavior of the system loader, how do we
ensure we have a consistent behavior that "just works" with our
build managers? Outside of the build manager we really can't do a
whole lot.
But what we can do is unify upon placing them into the same
directory as the executable and then letting the build manager
use the appropriate environment variables to set up the lookup
paths to point to it. If all you want to do is run your
program, that is great.
I have a [PR](https://github.com/dlang/dub/pull/2718) to add this
capability to dub, which has been a tad contentious for those who
are not me or Martin.
Of course all of this assumes you have all the dependencies setup
with no conflicts in place (such as versioning). If you don't
you're going to need a tool like
[Dependencies](https://github.com/lucasg/Dependencies) to figure
this one out the hard way.
## Unloading
TLDR: To keep your sanity, don't unload a shared library unless
your process is dying.
Remember when I said shared libraries are not _isolated_
(sandboxed)? Yeah that. That is a bit of a problem...
If you unload a shared library you are putting your process into
an indeterminate state where you cannot tell if it has been
corrupted. For this reason I would not recommend unloading a
shared library except in one rather particular case.
If you can guarantee that a given shared library has not during
its existence been sharing its resources and you have not been
taking any pointers into it, you may unload it.
To work around this limitation of no sharing of resources, you
can hand out handles, as long as they are not the integral
representation of a pointer. To convert a handle to a pointer
internally, use a data structure that maps one to the other. A
much slower approach, but safer if you need to do unloading.
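A minimal sketch of such a handle table, under a few loudly stated assumptions: ``Widget``, ``WidgetHandle`` and the three functions are hypothetical names, an associative array stands in for whatever data structure you prefer, and locking is left out entirely, so as written this is not thread safe.

```d
// Hypothetical resource owned by the shared library.
struct Widget
{
    int value;
}

// The handle is just a counter value, never a pointer in disguise.
alias WidgetHandle = uint;

// Process wide table (no locking in this sketch).
private __gshared Widget*[WidgetHandle] liveWidgets;
private __gshared WidgetHandle nextHandle = 1;

WidgetHandle createWidget(int value)
{
    immutable handle = nextHandle++;
    liveWidgets[handle] = new Widget(value);
    return handle;
}

// Returns false for unknown or already destroyed handles instead
// of dereferencing something that may no longer exist.
bool widgetValue(WidgetHandle handle, out int value)
{
    if (auto slot = handle in liveWidgets)
    {
        value = (*slot).value;
        return true;
    }
    return false;
}

bool destroyWidget(WidgetHandle handle)
{
    return liveWidgets.remove(handle);
}

void main()
{
    auto handle = createWidget(7);

    int value;
    assert(widgetValue(handle, value) && value == 7);

    assert(destroyWidget(handle));
    assert(!widgetValue(handle, value)); // stale handle is rejected
}
```

Because callers only ever hold an integer, a stale handle fails a lookup rather than pointing into memory that went away with the library.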
The simplest solution to all of this, which is what I would
recommend, is to simply keep a shared library loaded but detach
it internally. So if you mess up you are not risking a program
crash. Just don't subvert your API that controls attachment and
it should work safely.
This approach takes care of both read only memory (functions,
globals, constant literals) as well as heap allocated memory.
## Initializing Your Shared Library
TLDR: A shared library that allows you to borrow resources it
owns, and borrows from another is full of failure modes that may
not be avoidable.
All platforms worth mentioning here support some method to run
initializers and deinitializers in your shared library after load
and before unload with priorities. In D this can be hooked using
the ``pragma(crt_constructor)`` and ``pragma(crt_destructor)``.
However we do not support priorities.
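A minimal sketch of those two pragmas. The function names and ``libraryState`` are hypothetical, and for simplicity this is a single executable rather than a shared library. Note that these hooks run outside of druntime's own lifetime, so the sketch sticks to plain C calls and a ``__gshared`` global and avoids the GC.

```d
import core.stdc.stdio : printf;

// Hypothetical library state, set up before D's main runs.
__gshared int libraryState;

// Runs after the C runtime is initialized and before druntime is.
pragma(crt_constructor)
extern (C) void exampleLibInit()
{
    libraryState = 42;
}

// Runs during C runtime shutdown, after druntime has terminated.
pragma(crt_destructor)
extern (C) void exampleLibFini()
{
    printf("example library shutting down\n");
    libraryState = 0;
}

void main()
{
    // The constructor has already run by the time we get here.
    assert(libraryState == 42);
}
```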
Windows has some additional support of initialization callbacks
via the ``DllMain`` function, however this will be covered in the
sub heading ``TLS Hooking``.
When a shared library is designed to work in isolation and not
take ownership of any resource it did not create for its own
internal use, there should be minimal concerns surrounding its
initialization and deinitialization, as long as they were never
exposed to other code, nor other code exposed to it.
See my prior point in ``Unloading`` regarding handles.
On the other hand when you have a shared library similar to
druntime that:
- Does not define its own initialization/deinitialization
functions that are automatically run (you must explicitly run
them).
- Owns threads that you can request, borrow and sets up its own
internal state.
- Can be informed of threads you own, but does not allow you to
add its internal state onto it (not necessarily required but
there is no function that you are supposed to call to make it
happen).
- Owns memory (GC) that you can borrow at your request.
- Borrows memory that it scans for GC memory.
- Runs other people's code (module (de)constructors, unittests,
destructors) at potentially indeterminate times.
Every single one of these things could be the cause of your
program's corruption. Best case scenario is a segfault, but silent
program corruption is just as possible.
### TLS Hooking
TLDR: Only Windows offers hooking of threads, which supports zero
or more ``DllMain``'s and for druntime should be automatically
injected.
Knowing when a thread is created or destroyed is quite useful
if your goal is to register threads with a shared library, or to
construct and destroy your per-thread state.
Windows has this capacity in the form of a function called
[``DllMain``](https://learn.microsoft.com/en-us/windows/win32/dlls/dllmain). This maps into a section inside of the PE-COFF binary for [TLS callback functions](https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#tls-callback-functions), which enables a compiler to provide as many hook functions as desired for load/unload of binaries as well as for creation and destruction of threads.
This leads to a concern about the existence of a mixin template
in druntime called ``SimpleDllMain``. When druntime is built as a
shared library on Windows, it'll automatically be included.
However if you build a shared library that has druntime as a
static library this will not be handled for you, even though it
could be without using up the ``DllMain`` function.
If we offered a
[pragma](https://issues.dlang.org/show_bug.cgi?id=24532) to set a
function as a TLS callback function we could let druntime have
its own, and remove the need for ``SimpleDllMain`` entirely.
Although in the above I say only Windows supports it, in recent
years C++ has introduced thread local variables and with that
destructor support. This might be
[hookable](https://issues.dlang.org/show_bug.cgi?id=23756),
although this would not solve the on thread creation hook and for
that reason it should be considered Windows only for the time
being.
### Scenario: Your Own Memory Allocator
TLDR: The order of deinitialization can matter between sibling
shared libraries, if you can avoid letting a sibling shared
library borrow resources from you, you should avoid it.
Scenario: you have a shared library sitting side by side as a
sibling to druntime, that has been told that druntime exists via
registration (see dub's ``injectSourceFiles`` as a way to do this
automatically) and you have your own memory allocator.
You want to tell the GC about any memory you allocate, because of
course somebody might want to put GC memory into it and you don't
want to let it get free'd.
So you tell the GC all about it by adding it as a range, no
problem right? You're being a good person! And you would be
rather mistaken when it comes time to do unloading...
See it is totally possible that your shared library gets
deinitialized after druntime does. And of course when you
deinitialize, you gotta tell druntime to remove those ranges!
This is one way to get a crash deep inside of the druntime's GC
without a way of knowing why.
Please do not ask me how I know about this, it wasn't a fun time
to debug this one.
A workaround to this is to add an additional initialization and
deinitialization call to druntime. This increases the counter
internally, and only when you make _your_ deinitialization call
will druntime be allowed to die properly. By then all of your
state that refers to it is gone, and all of its state about you
is gone too.
Note: this works with the C constructor/destructor, so this is
running outside of the user start function.
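The workaround can be sketched roughly as follows. The names
``myAllocatorInit``, ``myAllocatorTerm``, ``pool`` and
``poolSize`` are hypothetical, and a single ``malloc`` block
stands in for a real allocator. ``Runtime.initialize``,
``Runtime.terminate``, ``GC.addRange`` and ``GC.removeRange`` are
the druntime calls being relied on.

```d
import core.memory : GC;
import core.runtime : Runtime;
import core.stdc.stdlib : free, malloc;

enum size_t poolSize = 64 * 1024; // hypothetical pool size
__gshared void* pool;             // hypothetical allocator backing store

/// Call from whatever hook runs when your shared library is loaded.
extern (C) bool myAllocatorInit()
{
    // Take our own reference on druntime so that it cannot be torn down
    // before we have removed our ranges from its GC.
    if (!Runtime.initialize())
        return false;

    pool = malloc(poolSize);
    if (pool is null)
    {
        Runtime.terminate();
        return false;
    }

    // GC memory stored inside the pool must keep its referents alive.
    GC.addRange(pool, poolSize);
    return true;
}

/// Call from the matching unload hook.
extern (C) void myAllocatorTerm()
{
    if (pool is null)
        return;

    GC.removeRange(pool); // druntime is still alive: we hold a reference
    free(pool);
    pool = null;

    // Drop our reference; druntime shuts down once the count reaches zero.
    Runtime.terminate();
}
```

The point of the design is the pairing: the range is only removed
while your own reference on druntime is still held, so the order
in which the loader tears down the siblings stops mattering.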
### Scenario: Your Own Threads
TLDR: If you're going to do your own threads, don't forget to
register them with druntime and handle cyclic registration to and
from.
So you have decided to create your own thread abstraction, you
wrote it and it worked first time, well done! And now you have
gotten a user to try it; the program crashed once run. The horror!
Out of pure curiosity, did you register the thread and then run
the thread initialization code for module constructors and TLS?
Yes? Why of course you didn't, you didn't even know that druntime
was loaded in process. See ``Scenario: Your Own Memory
Allocator`` section for more information on registering druntime.
Okay now that you have done it and it runs, great job!
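A rough sketch of that registration, for a thread that was not
created through ``core.thread.Thread``. The names
``onMyThreadStart`` and ``onMyThreadExit`` are hypothetical.
``rt_moduleTlsCtor`` and ``rt_moduleTlsDtor`` are druntime
functions that are declared by hand here, and the order of the
two calls on exit is an assumption that should be checked against
the druntime documentation for your version.

```d
import core.thread : thread_attachThis, thread_detachThis;

// Implemented by druntime; declared here so they can be called directly.
extern (C) void rt_moduleTlsCtor();
extern (C) void rt_moduleTlsDtor();

/// Call first thing on a thread created by your own abstraction.
void onMyThreadStart()
{
    thread_attachThis(); // make druntime (and its GC) aware of this thread
    rt_moduleTlsCtor();  // run the thread-local module constructors
}

/// Call last thing before that thread exits.
void onMyThreadExit()
{
    rt_moduleTlsDtor();  // run the thread-local module destructors
    thread_detachThis(); // druntime forgets about this thread
}
```

Neither function should be called on a thread that druntime
created itself, since that thread has already gone through both
steps.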
So tell me, has druntime registered its threads with you also?
No? Curious, that you wanted to build a thread abstraction
library but you only cared enough to write the code regarding the
threads that _you_ wanted. Still at least no other threads are
interacting with your code. What? That isn't the case? Oh no...
Okay so the needful has been done, you have a module constructor
and destructor that informs you of thread creation and
destruction by druntime. Super. But why are you getting stack
overflows now?
See you did the most intelligent thing possible, you registered
your thread with druntime, and druntime registered its thread
with your abstraction. Isn't that how it's meant to be? Why yes,
yes it is meant to be like that. Except you created a bit of a
loop there...
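One way to break such a loop is a thread-local guard. The sketch
below is a model only: ``druntimeAttach`` is a stand-in for
druntime registering a thread and firing your hook, and every
name in it is hypothetical.

```d
// Module-level variables are thread-local in D, so each thread gets its
// own guard and its own counters.
private bool registering;

int mineCount;     // times our abstraction recorded the current thread
int druntimeCount; // times the druntime stand-in recorded it

/// Stand-in for druntime: records the thread, then fires our hook.
void druntimeAttach()
{
    druntimeCount++;
    onDruntimeThread();
}

/// Our hook, called whenever druntime learns about a thread.
void onDruntimeThread()
{
    registerThread(true);
}

/// Our side of the registration.
void registerThread(bool knownToDruntime = false)
{
    if (registering)
        return; // already registering this thread: break the cycle
    registering = true;
    scope (exit) registering = false;

    mineCount++;
    if (!knownToDruntime)
        druntimeAttach();
}
```

Whichever side sees the thread first, each side records it once
and the recursion stops at the guard.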
After all that work, now it starts to work without failures,
assuming of course you didn't mess up an implementation detail
someplace like I did. It's always fun to have to debug code
where an object gets deallocated and the same pointer gets
allocated for the same thing and you wonder why the state keeps
changing on you!
## Where Is Thy Runtime?
TLDR: Did you follow my advice in ``Unloading``, no? Well good
luck with that. If you have a runtime loaded don't have
duplicates of it, stick to a single shared library build of it.
I tried... I really did, I spent an entire day trying to write
this section. Fact is what this section was meant to talk about
is when multiple copies of a runtime are loaded into a process
with no knowledge of each other.
If the rule that the owned resources of a shared library never
cross the boundary into other people's code were followed, as I
recommended in ``Unloading``, then this section wouldn't matter.
But of course
nobody does that, see [SDL](https://www.libsdl.org/),
[SQLite](https://www.sqlite.org/) or should I say pretty much
EVERY C LIBRARY IN ACTIVE USE. Oh and for anyone in doubt, how
about that COM eh? Ya know the C++ based remote process
communication, that uses heap allocated classes that underpins a
pretty significant portion of the Windows shell and Microsoft
products extension capabilities.
Okay rant over, hopefully everyone who has made it this far can
see that there is a risk here that I am trying to educate about.
So you have a library, a runtime of sorts. Let's call it druntime.
This runtime owns and loans out memory from it, and has callbacks
registered into it (destructors, module destructors etc.) as well
as memory registered into it (``ModuleInfo``, ``TypeInfo``). Not
only that but it also has system resources such as locks and
threads that it owns and loans out to other code. Sometimes it
even knows about system resources that other code has created
such as threads!
So this "druntime", you build it as a shared library and you have
multiple binaries depending upon it loaded into your process. You
load and unload, register and unregister all correctly. No
segfaults happen on start up and shutdown. Good job, I'm sure
that you have followed all of my advice that I have detailed in
the other sections of the article.
Alternatively you could have built this "druntime" into an
executable or shared library and you end up having a mix leading
you to have multiple copies loaded into your process. Only they
know nothing of each other. This is unfortunately a very real
possibility, after all where will you register your _runtime_
into?
Which one do you think is going to cause problems at
indeterminate points in time?
The second of course! Okay I lie it could be either but the
second one is almost guaranteed to result in problems that are
impossible to debug for the novice.
Problem is each "druntime" is silo'd: it has no knowledge of the
other, nor the ability to communicate with it. But let's say you
did have the ability to communicate, which is a rather big if.
Have you really got all the state ready to be communicated
between them? What happens when it is time to unload? Different
versions, size mismatches, behavior changes, changed fields etc.
This of course doesn't answer questions like whose memory
allocator do you use from that point on, who ends up owning
threads, and how do you detect read-only memory that will no
longer exist (i.e. ``TypeInfo``).
You are just asking for trouble trying to merge them.
In ``Is a Dynamic Link Library a Shared Library?`` I explain the
difference between a library that has been silo'd versus
isolated. Where the latter is sandboxed and the former is merely
ignorant of what else is in the process.
So you should accept that they are silo'd, because anything else
is a developmental nightmare even if you have been successful in
aggregating state so that it can be passed back and forth. Now
the question has become, have you crossed resources (even if it
was done accidentally) that are owned from one "druntime" to
another "druntime" instance? Of course you did, because who
wouldn't? It's not like there is any protection from doing it. Go
ahead propose exploding the number of pointer types... See where
that gets ya.
You put one bit of memory into another bit of memory with each
being owned by a different GC, which of course doesn't know about
the other. Naturally the memory that went into the other has no
other references and its GC has gone ahead and collected it. Not
long after that you accessed it, oh hey segfault! What did you
expect? This is too easy to do by accident.
If you are going to have a runtime that has resources it owns
exposed to other code (RAM, handles such as a thread or lock)
don't duplicate that runtime. You are asking for trouble. Use a
shared library for this, not a mix of static libraries with
shared library builds of it.
## Who Needs a Scope Anyway?
TLDR: Go ahead be smart! Don't use shared libraries or static
libraries, go import only! See how quickly you kill off that
scope that depends on having state.
So you wanna be smart, you think that your project having _any_
binary is just a big ball of problems, so you're going import
only! Well aren't you clever!
Just to clarify some things first:
- Does it have any state? Threads, locks, globals, inter-thread
communication?
- Does it need any giant lookup tables, that should be in read
only memory and shared throughout a process?
- Will there be any symbols that cannot be templated? Or should I
have said will be a right pain to use if it were templated?
- Are you linking against a non-D library?
If you answered no to all of these questions, well
congratulations you can go import only!
What? You didn't answer no to all of these questions? What are
you trying to build, a whole new standard library or something?
Limiting yourself to import only requires you to limit your
scope. Good bye event loops, windowing, anything asynchronous.
While you can do these things, you will be limiting yourself
severely enough that your code will not look familiar to others.
So up to you, listen to my advice, use a shared library and have
a state that can be shared or don't and put a copy into every
binary, which might be fine if all you have is a single
executable.
Either way, good luck with that PhobosV3 event loop whilst still
being import only!