Attribute promises vs inference rules
Quirin Schroll
qs.il.paperinik at gmail.com
Wed Apr 17 17:06:29 UTC 2024
The spec is rather detailed on what operations are valid and
invalid in functions that are annotated `@safe`, `@nogc`, `pure`,
and/or `nothrow`. However, there is a difference between
operations that make the compiler error when an attribute is
specified and an invalid operation is used – or (equivalently)
make the compiler not infer the attribute in a context where
attributes are inferred – and operations that violate the
promises the attributes make.
Example:
```d
int x;
bool f(int* p) pure @safe
{
    // Error: `pure` function `f` cannot access mutable static data `x`
    return p is &(x);
}
```
This is not a bug: Indeed, `f` accesses `x` and indeed `x` is
mutable data. Only by pure happenstance, `f` only uses the
address of `x` which isn’t mutable, and never its value which is
mutable. If it stored `&x` in a local, it could write to `x`. The
fact that `f` doesn’t do that means that `f` is “morally” pure,
but it’s not recognized as `pure` by the attribute spec. Don’t
get me wrong, the spec could be changed to allow accesses like
this, but currently it doesn’t, which makes this a great example.
So, what about this:
```d
int x;
bool g(int* p) pure @safe
{
    static impl(int* p) @safe { return p is &(x); }
    enum pure_impl = () @trusted {
        return cast(bool function(int* p) pure @safe) &(impl);
    }();
    return pure_impl(p);
}
```
A cast that adds function attributes isn’t allowed by `@safe`,
but we have `@trusted` for that. The question now is: Is it
defined behavior if I cast `&impl` to `pure` using an explicit
cast? I don’t know, and I also don’t know where to look. The
latter is a problem for D.
Let’s look at each attribute individually, in the order of (what
I presume) the easiest to the hardest to answer.
### What is morally `@nogc`?
My sense is: If it doesn’t allocate on the GC heap. Even if a function
can allocate conditionally, if you can ensure it won’t, you’re
good. Probably. The spec doesn’t say it, but anything else would
be a big, big surprise.
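To make the conditional case concrete, here is a minimal sketch. The names `firstN` and `firstNNoGC` are made up for illustration, and whether the cast is actually defined behavior is exactly what the spec leaves open; the code only assumes it is.

```d
// Hypothetical helper: allocates only when the caller's buffer is too small.
int[] firstN(int[] buf, size_t n) pure nothrow @safe
{
    if (n <= buf.length)
        return buf[0 .. n];   // no allocation on this path
    return new int[n];        // GC allocation, so `@nogc` cannot be applied
}

// Assumption: adding `@nogc` by cast is fine because the precondition
// rules out the allocating path. The spec does not currently say so.
int[] firstNNoGC(int[] buf, size_t n) pure nothrow @nogc @trusted
{
    assert(n <= buf.length, "caller must provide a large enough buffer");
    alias NoGC = int[] function(int[], size_t) pure nothrow @nogc @safe;
    return (cast(NoGC) &firstN)(buf, n);
}

unittest
{
    int[4] storage = [1, 2, 3, 4];
    assert(firstNNoGC(storage[], 2) == [1, 2]);
}
```

Note that the `assert` disappears in release builds, so the wrapper then relies entirely on the caller keeping the promise.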
### What is morally `@safe`?
This attribute has the best answer because the question is
essentially: What can be annotated `@trusted`? It has no simple
answer, but at least there are discussions around it. Also,
because `@trusted` exists, such questions are easy to phrase.
### What is morally `nothrow`?
What `nothrow` is about can be readily guessed. It’s not actually
“cannot throw [anything]”, but rather “cannot throw
`Exception`s”. Close enough. In all honesty, I don’t know what is
“morally `nothrow`”, but if you asked me: “Function `foo` is not
annotated `nothrow`, but it simply won’t throw exceptions, can I
cast `&foo` to `nothrow`?” I’d answer: “Probably yes, but better
use
[`assumeWontThrow`](https://dlang.org/library/std/exception/assume_wont_throw.html).”
There could be some messy details, though. A `throw` function can
fail recoverably, so it must be called in a way that supports
stack unwinding; a function that can’t fail recoverably needs no
such support. That might be an issue, I don’t know.
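For reference, a small sketch of the `assumeWontThrow` route. `checkedDiv` and `half` are hypothetical names; only `assumeWontThrow` comes from Phobos.

```d
import std.exception : assumeWontThrow;

// Hypothetical function: throws only for a zero divisor.
int checkedDiv(int a, int b) pure @safe
{
    if (b == 0)
        throw new Exception("division by zero");
    return a / b;
}

// The divisor is a nonzero constant, so the throwing path is unreachable.
int half(int a) nothrow
{
    return assumeWontThrow(checkedDiv(a, 2));
}

unittest
{
    assert(half(10) == 5);
}
```

Unlike a raw cast of `&checkedDiv` to `nothrow`, this turns a broken promise into an `Error` at run time instead of leaving the outcome unspecified.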
### What is morally `pure`?
It’s not clear at all what `pure` promises exactly and what it
doesn’t. Contrast this to `nothrow` and especially `@nogc`, where
it might just be a single spec paragraph that’s missing. It may
seem as easy as: It doesn’t access mutable data. Remember the
initial example? It’s not so easy. Even if it were, the
guarantees that follow from “it doesn’t access mutable data” are
manifold: Unique construction (by a `pure` function that meets
some other criteria) allows implicit casts from mutable to
`immutable`. Some `pure` functions may be cached without one
being able to observe the difference. Some `pure` functions may
be run in parallel without requiring synchronization and other
fancy stuff.
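As an illustration of the first guarantee, a minimal sketch with a hypothetical `makeSquares`: because the function is `pure` and its only parameter has no mutable indirections, the freshly allocated result is unique and converts implicitly to `immutable`.

```d
// Hypothetical pure factory: the result cannot alias anything the caller has.
int[] makeSquares(size_t n) pure nothrow @safe
{
    auto result = new int[n];   // GC allocation is allowed in `pure`
    foreach (i, ref e; result)
        e = cast(int) (i * i);
    return result;
}

unittest
{
    // No cast needed: mutable `int[]` converts to `immutable` here.
    immutable int[] squares = makeSquares(4);
    assert(squares == [0, 1, 4, 9]);
}
```

If a function were cast to `pure` without deserving it, this conversion is one of the places where the lie could surface.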
Also consider GC allocation. A `pure` function is explicitly
allowed to allocate on the GC heap (unless it’s also `@nogc` of
course, but that’s orthogonal). How is that possible? The GC heap
is definitely global state!
Now, one could argue that there is only one GC, therefore every
(`pure`) function morally has a hidden parameter that provides
access to the GC, and a `pure` function may access a global
variable through a parameter. (In a sense, what `@nogc` morally
does (to a `pure` function) is remove this hidden parameter.) If
we’re comfortable arguing like that in the general case, the
rules of `pure` aren’t as trivial anymore. What about custom
global-state APIs that could be modeled similarly to the GC?
What conditions does a global-state API have to meet such that
access to it is well-defined in a `pure` function? In my
estimation, nobody knows.
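The “global through a parameter” part of that argument is at least well-defined today. A minimal sketch, with the hypothetical names `counter` and `bump`:

```d
int counter; // module-level, thread-local mutable state

// Weakly pure: may mutate what its parameter points to, and nothing else.
void bump(int* c) pure nothrow @nogc @safe
{
    ++*c;
}

unittest
{
    counter = 0;
    bump(&counter);   // the global is reached only via the argument
    assert(counter == 1);
}
```

The open question is what happens when the parameter is hidden, as with the GC: the compiler accepts `new` in `pure` code by fiat, and no comparable rule exists for user-defined state.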
### Conclusion
For two of the four attributes, a spec paragraph is warranted.
For `@safe`, it’s already an ongoing quest to bring as much
UB-free code as possible into the domain of `@safe`. For `pure`,
there’s a whole discussion pending of what should count as
“morally pure”, i.e. which casts to `pure` are UB-free. This can
be considered part of the `@safe` discussion.
As for positions, there’s one extreme point: _Morally `pure` is
only what could have been annotated `pure` without change._ This
is probably a good starting point from a theoretical standpoint,
i.e. the spec could be explicit about it and say: “A pointer to a
function that isn’t annotated `pure` can be cast to a function
pointer type that’s additionally annotated `pure` if the pointee
function could have been annotated `pure`, i.e. the programmer
merely ‘forgot’ to annotate where it was possible.” But what
about `f` from the initial example? It cannot be annotated
`pure`. Do we want to exclude it? That doesn’t seem very
practical. It would mean that `g` introduces UB, which raises the
question: When exactly does `g` enter UB? Is the cast already UB
or does the ill-cast function have to be called?