Big picture on shared libraries when they go wrong, how?

Richard (Rikki) Andrew Cattermole richard at cattermole.co.nz
Fri May 10 15:21:15 UTC 2024


On 11/05/2024 1:27 AM, Atila Neves wrote:
>>> "you cannot tell the compiler that a module is not in your binary." - 
>>> isn't this exactly what happens with `export` on a declaration (as 
>>> opposed to a definition)? That is, my understanding is that `export` 
>>> with no body means `dllimport` and `export` with a body means 
>>> `dllexport`.
>>
>> Keep in mind that nobody that I know of is using it to mean this today.
> 
> Maybe the recommendation should then be that they should? Doesn't the 
> point still stand that "you cannot tell the compiler that a module is 
> not in your binary" isn't actually true? I saw in one issue where there 
> was a problem with variable declaration though, where 
> dllimport/dllexport was determined by the presence or not of an 
> initialiser, which is... yuck.

"you cannot tell the compiler that a module is not in your binary" is 
true; there is no syntax or CLI flag to do this today. Note: this is for 
the entire module, including its metadata (``ModuleInfo``), not just the 
user-written symbols.

There probably should be syntax on the module to specify it being out of 
binary. However that would not be suitable for the majority of 
programmers and most definitely is not suitable for build managers as it 
would require it to modify/create the files. That could come later once 
we have more experience with reliable support.

But yes, whether or not there is an initializer should not determine the 
symbol mode. I did my best to clean it up in a way that would keep 
things simple and not break the world.

This is why I've simplified things down to:

Use ``export`` + ``extern`` to go into ``DllImport`` mode.

Or rely on the external import path switch to set the ``extern`` for the 
majority of users. Which is ideal for things like the di generator or 
build managers ;)

Or use the dllimport override switch to set all symbols found from a 
module that is known to be out of binary as ``DllImport`` (helps with 
mixing some imports being in binary and some out).
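
As a rough sketch of how the three modes would look in source under the 
scheme above (the module and function names are made up for 
illustration, and the mode named in each comment is what the proposal 
intends, not necessarily what every compiler does today):

```d
module symbol_modes_sketch; // hypothetical module name

// export with a body: defined in this binary and exported (DllExport).
export int definedHere()
{
	return 42;
}

// export + extern, no body: lives in another binary (DllImport).
// It is never called here, so nothing has to resolve it at link time.
export extern int definedElsewhere();

// Neither export nor extern: stays inside this binary (Internal).
int internalOnly()
{
	return 1;
}

void main()
{
	assert(definedHere() == 42);
	assert(internalOnly() == 1);
}
```

The two switches then only change which declarations get the ``extern`` 
or ``DllImport`` treatment by default; the source stays the same.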

>> Also a D module may be in binary, but it may have declarations that 
>> point to something that is not.
>>
>> See the bindings in druntime where it has D symbols.
>>
>> https://github.com/dlang/dmd/blob/5fc02ba152ccaa71711f3ed84b6d44a2a940f206/druntime/src/core/stdc/stdio.d#L1191
> 
> The relevant code is:
> 
>      private extern shared FILE[3] __sF;
>      @property auto stdin()() { return &__sF[0]; }
> 
> `__sF` is declared extern, i.e., not in binary. I don't understand what 
> the issue would be?

Yeah that's not the best example, but keep in mind that symbol is not in 
``DllImport`` mode, it's ``Internal``.

And that right there is a problem.

If all those symbols were declared using positive annotation (like they 
should be), then anything that has D symbols like the structs such as:

https://github.com/dlang/dmd/blob/5fc02ba152ccaa71711f3ed84b6d44a2a940f206/druntime/src/core/stdc/stdio.d#L872

Would also be affected if the following was applied.

>> This is why you don't pretend that one symbol in ``DllImport`` mode 
>> means the entire module is because it may be a binding to something else.
> 
> Why would the entire module be dllimport?

Walter came up with this idea a while back, so I'm a tad defensive 
towards it.

https://github.com/dlang/dmd/pull/15298

Martin and I had to really fight Walter on it, as it would mean 
breaking people's builds in incredibly frustrating ways.

The problem is that once you start making assumptions about whether a 
module is in or out of your binary without direct instruction from the 
user, you can mess up the dependencies between modules at the minimum. 
So if you depend on another module, it may not be initialized before you.

It needs to be explicit, or it's going to ruin someone's day pretty 
quickly (hence the external import path switch).

Here is a binding module that is compiled into your executable.

```d
module binding;

export extern void otherFunction();

shared static this() {
	import std.stdio;
	writeln("binding");
}
```

```d
module app;
import binding;

void main() {
	otherFunction();
}

shared static this() {
	import std.stdio;
	writeln("main");
}
```

``$ dmd app.d binding.d someImport.lib``

With Walter's idea the ``app`` module constructor could be called before 
``binding``'s does, which absolutely should not happen.

>> Note: there are plenty of examples of it which are not templated, like 
>> structs that have to be initialized using the D init array.
> 
> Yes, but I don't know how this is related to the above.

If you were to use Walter's approach to determining if a module is out 
of binary simply because there is a symbol in ``DllImport`` mode, and 
then applied it to all declarations (as if the dllimport override switch 
was set to ``DllImport`` everything) it'll cause link errors.

Basically a module can have some symbols out of binary whilst the rest 
are in. So you can't make this sort of assumption.
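
A minimal sketch of such a mixed module, assuming a binding to the C 
runtime's ``abs`` (the module and struct names are hypothetical). The 
function is out of binary, while the struct's init data and member 
function are emitted into this binary:

```d
module mixed_binding; // hypothetical module name

// Out of binary: abs is implemented by the C runtime, so this is only
// a declaration.
extern(C) int abs(int) nothrow @nogc;

// In binary: the init data (non-zero because of value = 7) and the
// member function are generated for this module.
struct Wrapper
{
	int value = 7;

	int magnitude() const
	{
		return abs(value);
	}
}

void main()
{
	Wrapper w;
	assert(w.value == 7);
	w.value = -3;
	assert(w.magnitude() == 3);
}
```

If the whole module were treated as out of binary because of the one 
external declaration, ``Wrapper``'s generated symbols would be expected 
to come from somewhere else and nothing would provide them.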

This is an unfortunate side effect of following Walter's idea to its 
natural conclusion. It has left me a bit unhappy with any suggestion of 
inferring out of binary status; it's pain in waiting.

>>> Where does the need for "private but export" come from again? Is 
>>> there an equivalent in C++ (`static dllexport`?), or does this only 
>>> happen due to something specific to D like `T.init`?
>>
>> It happens because D confuses exportation which is a linker concept, 
>> with a language visibility concept.
> 
> My question is: when would I want to export a private symbol?

Q1: Should a given symbol be private, yes?
Q2: Does any template use the previous symbol?

A: That is why you would export a private symbol.

Everything in here needs to be exported and should have package 
visibility so that it is not available for anyone else to access: 
https://github.com/Project-Sidero/basic_memory/blob/main/source/sidero/base/console/internal/rawwrite.d

See some usage here: 
https://github.com/Project-Sidero/basic_memory/blob/main/source/sidero/base/console/internal/writer.d#L70

I seem to be severely lacking in my ability to explain that having 
templates be limited to accessing only public things is not a good idea. 
This has been a bit of an annoyance for me; it flies in the face of all 
my interests in program security by introducing uncertainty that has no 
reason to exist.

In practice it means people can go and mess around with your internal 
state without any language protection. You are better off using negative 
annotation and never touching positive. It's safer. A lot safer.

Needless to say they are in a few places in my code base:

https://github.com/Project-Sidero/basic_memory/blob/65ee9d0c4b1d4bc666772b21d4afdec30835120c/source/sidero/base/text/format/rawread.d#L126

https://github.com/Project-Sidero/basic_memory/blob/65ee9d0c4b1d4bc666772b21d4afdec30835120c/source/sidero/base/text/format/prettyprint.d#L67

https://github.com/Project-Sidero/basic_memory/blob/65ee9d0c4b1d4bc666772b21d4afdec30835120c/source/sidero/base/path/file.d#L1012

https://github.com/Project-Sidero/basic_memory/blob/65ee9d0c4b1d4bc666772b21d4afdec30835120c/source/sidero/base/logger.d#L447

Loggers, console read/write, are where I have hit it.

But I'm sure you could come up with other examples of where you don't 
want others directly touching your _internal_ symbols, but still need 
your code which may be compiled into another binary to touch them.

>> In C/C++ it uses a completely separate attribute to denote that it 
>> does not affect Member Access Control such as ``private``.
> 
> On Windows, C/C++ compilers use a non-standard extension to do so, but 
> yes. AFAIK (and I could well be wrong), one can't dllexport something 
> that's static? You'd have to put it in a header, and it'd be compiled 
> into the current translation unit anyway, so I also don't understand why 
> you'd want to.

Yes the extension is 
https://learn.microsoft.com/en-us/cpp/cpp/dllexport-dllimport?view=msvc-170

For D, apart from maybe metadata, only templates can go into another binary.

>> But you can see this with things like templates, you want to access an 
>> internal symbol but don't want somebody else to? Yeah no, can't do 
>> that today.
>>
>> https://learn.microsoft.com/en-us/cpp/cpp/dllexport-dllimport?view=msvc-170
> 
> Assuming the link is supposed to elucidate the template comment, I don't 
> understand the relevance. Otherwise, what does "you want to access an 
> internal symbol but don't want somebody else to" mean?

```d
module database_access;

@safe:

void doAThing(T)() {
	iCanKillYourDatabase(false, T.sizeof);
}


/*private:*/
export:

void iCanKillYourDatabase(bool doWrongThing = true, size_t sizeOfThing = 0) {
	if (doWrongThing)
		database.corruptSilently();
}
```

A bit dramatic (due to unrealistic nature), but that should get the 
point across that ``iCanKillYourDatabase`` should be private but also 
exported.

The kernel might not stop you from doing a bad thing, but D should be 
making it a lot harder by making ``iCanKillYourDatabase`` private.

Note: you can of course still gain access to it via ``dlopen``, but that 
is not ``@safe`` code and you would need to gain access to the symbol 
name before you could do that.

>> You have to be very explicit in c/c++ over this, we do not have that 
>> level of control (apart from saying do not export this symbol via 
>> ``@hidden``).
> 
> This could mean several things. We have the control over individual 
> symbols that are actually in the source code with `export`. Is the 
> comment above about things like T.init?

For generated symbols like ``T.init``, ``opCmp``, ``ModuleInfo`` etc., 
we have zero control over these currently. Either you use a linker 
script or you use negative annotation. Everyone I know of uses negative 
annotation (although I support both).

For other symbols we can choose not to export on a per-symbol basis, and 
if we want to export and have it be public then we can use the 
``export`` keyword.

In C/C++ land, you control whether it's ``DllExport`` or ``DllImport`` 
with an attribute directly. With my DIP you would use ``export`` for the 
former and ``export`` with ``extern`` for the latter.

>>> "By not exporting ModuleInfo and assuming it is available the 
>>> compiler introduces a hidden dependency on a generated symbol that 
>>> may not exist." - do we have an issue for that? I searched for 
>>> ModuleInfo in the issues but none of them looked like a match to me.
>>
>> Yes two. They are referenced in the article.
>>
>> Note: they are not duplicates.
>>
>> Okay I lie there is a bunch more.
> 
> Thanks!
> 
> On a somewhat related note, we use dlls at work and seem to have fixed 
> "everything" by using ldc and `-fvisibility=hidden 
> -dllimport=defaultLibsOnly`, as well as `-link-defaultlib-shared`.

``-link-defaultlib-shared`` sets the druntime to be a shared library 
(which, let's face it, should be the default).

``-dllimport=defaultLibsOnly`` makes all symbols from druntime and phobos 
default to ``DllImport``, but no others.

As for ``-fvisibility=hidden``, that would imply that you are using 
positive annotation in every code base. Which is curious considering 
Martin's prior work has been the exact opposite to this.

I don't know enough details of Symmetry's projects or how they are laid 
out to comment about them beyond the tidbits I get. Switching from 
negative to positive annotation would be a massive undertaking, so the 
notion that you have switched to positive annotation is a statement I am 
having trouble reconciling with what I know.

Perhaps the one you are referring to is a plugin with a known fixed 
public API? That being positive annotation and everything else being 
compiled in being hidden would make sense.

