How to make D resolve C++ symbols by mangling symbols with the Itanium ABI on Windows
Carl Sturtivant
sturtivant at gmail.com
Thu Feb 29 19:54:24 UTC 2024
On Monday, 26 February 2024 at 13:36:42 UTC, thumbgun wrote:
> I'm currently trying to call some C++ functions that were
> compiled by g++ (mingw). However g++ uses the Itanium ABI name
> mangling rules. dmd on Windows tries to link functions based on
> the MSVC name mangling rules.
> [...]
> Is there any way I can make dmd link to symbols mangled
> according to the Itanium ABI's rules on Windows?
Here's a simple way to do this with *no change to source code* in
either the dynamic library or the DMD project that is supposed to
dynamically link to it. Of course this does not resolve
potential C++ calling convention issues (that don't exist for C),
but now anyone is in a position to investigate when they exist.
I made a tiny proof of concept and it works. For concreteness,
suppose the dynamic library is `libx.dll`, built with the
(mingw64) `gcc` installed with the latest
[MSYS2](https://www.msys2.org/) as that led to all the utilities
I needed and a bash command line.
Suppose also that the DMD project executable, when built, will be
`main.exe` compiled from `main.d` and `other.d` and a D interface
file `header.di` containing the necessary declarations for using
`libx.dll`. I'll state the obvious below to make the explanation
complete and for snag-free experimentation.
Suppose for a moment there's no mangling problem because
`libx.dll` is compiled from C source, not C++. I'll describe the
exact context and then show how to fix it up for C++ with the
mangling problem solved.
To dynamically link to `libx.dll` from a DMD executable
`main.exe`, DMD needs to link an *implib* ([import
library](https://en.wikipedia.org/wiki/Dynamic-link_library#Import_libraries))
during the build of `main.exe`. Such an implib can be made from a
*def file* (module definition file), which is a text file, using
a library manager that knows about the MSVC world that DMD
inhabits. Let `libx.def` be a def file for `libx.dll`, and let
`libx.lib` be an implib for `libx.dll`.
The def file would usually be created by `gcc` given
`-Wl,--output-def=libx.def` when `libx.dll` is linked. And an
implib can be created from it using `dlltool` which is
distributed with that mingw64 `gcc`.[¹](#one)
```
$ dlltool -D libx.dll -d libx.def -l libx.lib -m i386:x86-64
```
Alternatively, the MS librarian `lib` can be used.[²](#two)
[³](#three)
```
$ lib -nologo -machine:x64 -def:libx.def -out:libx.lib
```
Now when main.exe is built, it just needs to link to that import
library and we're in business.
```
$ dmd main.d other.d header.di libx.lib
$ ./main #works
```
Now suppose we move to C++. If we make an import library as
above, then a build of `main.exe` will not link, because the
`gcc`-mangled names in the implib `libx.lib` do not match the
MSVC-mangled names supplied by DMD.
*We can fix this by modifying the def file and producing an
implib containing the MSVC-mangled names in place of the
corresponding `gcc`-mangled names!*
An implib contains each name to link to paired with the
corresponding location of the function in the dynamic library
that name refers to. Concretely `libx.lib` contains each
`gcc`-mangled name paired with the location in `libx.dll` of the
corresponding function. So the problem is solved if the
`gcc`-mangled names are replaced by the corresponding
MSVC-mangled names in the implib `libx.lib`.
There are many ways to do this! However, there's a
[mechanism](https://learn.microsoft.com/en-us/cpp/build/reference/exports?view=msvc-170#remarks) in a def file to do just that.
Here's the def file `libx.def` for my toy `libx.dll` generated by
`g++ -shared libx.o -o libx.dll -Wl,--output-def=libx.def`.
```
EXPORTS
_Z11complicatedi @1
```
Here `_Z11complicatedi` is the `gcc`-mangle of `int
complicated(int)` and `@1` is its export ordinal. Unfortunately,
`other.d` expects this function to be mangled as
`?complicated@@YAHH@Z`, as this is the MSVC-mangle of
`int __cdecl complicated(int)`[⁴](#four) and comes
from `extern(C++) int complicated(int);` in `header.di`.
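To see where the two spellings come from, here is a minimal Python sketch that reproduces both manglings for the simplest possible case only: a free function with the `__cdecl` convention whose return and parameter types are a handful of built-in types. The function names `itanium_mangle` and `msvc_mangle` and the small type tables are made up for illustration; real manglers handle namespaces, classes, pointers, templates, back-references and much more, none of which is attempted here.

```python
# Type codes for a few built-in types in each scheme (deliberately incomplete).
ITANIUM = {"void": "v", "char": "c", "int": "i", "float": "f", "double": "d"}
MSVC = {"void": "X", "char": "D", "int": "H", "float": "M", "double": "N"}


def itanium_mangle(name, params):
    """Itanium: _Z, the length-prefixed name, then one code per parameter.

    The return type of an ordinary (non-template) function is not encoded,
    and an empty parameter list is written as a single 'v'.
    """
    codes = "".join(ITANIUM[p] for p in params) or "v"
    return f"_Z{len(name)}{name}{codes}"


def msvc_mangle(name, ret, params):
    """MSVC: ?name@@, Y (free function), A (__cdecl), return type, parameters.

    A non-empty parameter list is closed with '@'; an empty one is written
    as 'X'. The trailing 'Z' ends the function type.
    """
    if params:
        args = "".join(MSVC[p] for p in params) + "@"
    else:
        args = "X"
    return f"?{name}@@YA{MSVC[ret]}{args}Z"


if __name__ == "__main__":
    print(itanium_mangle("complicated", ["int"]))      # _Z11complicatedi
    print(msvc_mangle("complicated", "int", ["int"]))  # ?complicated@@YAHH@Z
```

Note that the MSVC form carries the return type and the calling convention while the Itanium form carries neither, which is why the two names cannot be converted into each other by text substitution alone.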
Editing `libx.def` into
```
EXPORTS
?complicated@@YAHH@Z=_Z11complicatedi @1
```
substitutes the MSVC-mangled name on the left for the
`gcc`-mangled name on the right when generating the implib
`libx.lib`. Using the MS librarian as before and building
`main.exe` removes the linking error and the result just works.
*However, while using `dlltool` or `llvm-dlltool` as before
produces implibs that satisfy the linker, the resulting `main.exe`
did nothing when run in my toy example, simply returning to the
prompt with no output, as of 2024-02-29.*
A `libx.def` and hence `libx.lib` for any `main.exe` and
`libx.dll` with many substitution lines placed in the def file
could be mechanically generated once and for all. Or
`libx.def` and `libx.lib` could be rebuilt on the fly as new
symbols are used while the DMD project is being written.
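As a rough illustration of the mechanical part, the following Python sketch rewrites the `EXPORTS` entries of a def file so that each `gcc`-mangled name with a known MSVC counterpart becomes `msvc_name=gcc_name`. The function name `rewrite_def` and the dictionary argument are hypothetical, and the sketch assumes the simple one-entry-per-line layout of the toy def file shown earlier; anything after the first field on a line (such as the ordinal) is carried over untouched.

```python
def rewrite_def(def_text, gcc_to_msvc):
    """Return def_text with 'msvc=gcc' substitutions applied to export lines.

    gcc_to_msvc maps a gcc-mangled name to the MSVC-mangled name that the
    D side expects. Lines whose first field is not in the map are kept as-is.
    """
    out = []
    for line in def_text.splitlines():
        fields = line.split()
        if fields and fields[0] in gcc_to_msvc:
            # Preserve whatever indentation the original line had.
            indent = line[: len(line) - len(line.lstrip())]
            fields[0] = f"{gcc_to_msvc[fields[0]]}={fields[0]}"
            line = indent + " ".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    original = "EXPORTS\n_Z11complicatedi @1\n"
    mapping = {"_Z11complicatedi": "?complicated@@YAHH@Z"}
    print(rewrite_def(original, mapping), end="")
```

The output would then be handed to `lib` as before. Exports without an entry in the map are left alone, so the same script can be rerun as more symbols come into use.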
Using the MS `dumpbin` tool produces text from which MSVC-mangled
symbols can be extracted, along with their demanglings. So if the
DMD project is compiled to a lib using the `-lib` option, so that
it builds even though linkage would be broken, then a table of
(unmangled, MSVC-mangled) name pairs for linkage can be
automatically constructed by running `dumpbin` on the resulting
`main.lib` and tearing up the resulting text. Similarly, the
utility `nm` can be used to produce a table of (unmangled,
`gcc`-mangled) pairs from `libx.dll`, and that can be combined
with the text of `libx.def` to produce the modified `libx.def`
with the necessary additional qualifiers as in the example above.
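The joining step might look something like the Python sketch below. It takes the two tables as lists of (unmangled, mangled) pairs and matches them on a normalized signature. Everything here is an assumption for illustration: the names `normalize` and `pair_manglings` are made up, the MSVC demangling is assumed to look like `int __cdecl complicated(int)`, and the `gcc` demangling like `complicated(int)`. The two toolchains spell many types differently when demangling, so a real script would need a far more careful `normalize`; extracting the pairs from actual `dumpbin` and `nm` output is not shown.

```python
def normalize(signature):
    """Reduce a demangled signature to 'name(params)' with no whitespace.

    Assumes the MSVC form puts the return type and '__cdecl' before the name;
    anything up to and including '__cdecl' is dropped.
    """
    if "__cdecl" in signature:
        signature = signature.split("__cdecl", 1)[1]
    return "".join(signature.split())


def pair_manglings(msvc_pairs, gcc_pairs):
    """Map each gcc-mangled name to the MSVC-mangled name with the same signature.

    Both arguments are iterables of (unmangled, mangled) pairs. gcc symbols
    with no MSVC counterpart (not used by the D project) are skipped.
    """
    by_signature = {normalize(sig): mangled for sig, mangled in msvc_pairs}
    mapping = {}
    for sig, mangled in gcc_pairs:
        key = normalize(sig)
        if key in by_signature:
            mapping[mangled] = by_signature[key]
    return mapping


if __name__ == "__main__":
    msvc = [("int __cdecl complicated(int)", "?complicated@@YAHH@Z")]
    gcc = [("complicated(int)", "_Z11complicatedi"),
           ("unused(double)", "_Z6unusedd")]
    print(pair_manglings(msvc, gcc))
```

The resulting dictionary, keyed by `gcc`-mangled name, is exactly the information needed to add the `msvc_name=gcc_name` qualifiers to `libx.def`.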
A script could do this, and then `lib` could be run to build the
import library on the fly during a build. Or, if the library's
bindings are all in a D header file already, say `header.di`,
then that could be used to produce the pairs of unmangled and
MSVC-mangled names once and for all, and the corresponding
`libx.def` file then used to produce an implib `libx.lib` that
could be used endlessly with `libx.dll`.
Lots of possibilities here!
There is a library distributed with mingw64 `gcc` to [demangle
MSVC-mangled
names](https://mingw-w64.sourceforge.net/libmangle/index.html),
though I did not use it. So in principle the substitutive def
file could be made using just `nm` to dump the MSVC-mangled
binaries, so no MS tools are needed to make it.
Of course what we really need to know is the extent to which
cross calling actually works for various C++ constructs. I'd be
grateful if anyone who finds this out would post it here. I'm
not a C++ fan, so I'm not the person to do this.
___
[[1]](#1) This worked with my toy example, but there are claims
online that dlltool is unreliable, in which case
[llvm-dlltool](https://github.com/ldc-developers/llvm-project/releases/download/ldc-v14.0.0/llvm-14.0.0-windows-x64.7z) might be better. They both have the same command line, and I could distinguish no difference between them in my toy examples.
[[2]](#2) A bash script that puts the directory containing
`lib.exe` at the front of your MSYS2 path before executing it is
handy, so as to avoid polluting that deliberately isolated path
with MS-related executables. This technique can be used for the
other MS tools mentioned above. So e.g. `~/bin/lib`, made
executable, could contain the following, with `VCBIN`
appropriately set in `~/.bash_profile` as an MSYS2 path obtained
from the Windows path to `lib.exe`'s directory using the
`cygpath` utility that comes with MSYS2. Note that it says
`lib.exe` in the script, not `lib`, to avoid accidental recursion.
```
#!/bin/bash
PATH="$VCBIN:$PATH"
lib.exe "$@"
```
[[3]](#3) Avoid using DOS style options like e.g. `/nologo` in
favor of unix style options like `-nologo` because MSYS2 tries to
helpfully modify command lines and regards `/nologo` as an
MSYS2 path which will be converted to a Windows path before
executing the command.
[[4]](#4) It seems this is because `__cdecl` is the default
calling convention for (mingw64) `gcc` and DMD's `extern(C++)`
assumes this, and MSVC-mangling always includes the calling
convention in the signature being mangled, even though
`gcc`-mangling does not if it is the default of `__cdecl`.