Should (p - q) be disallowed in @safe code?
Timon Gehr
timon.gehr at gmx.ch
Fri Jan 2 02:00:29 UTC 2026
On 1/1/26 22:53, Walter Bright wrote:
> On 1/1/2026 12:47 PM, Timon Gehr wrote:
>> Well, that was the point. Vladimir had said that pointer subtraction
>> is free of side effects, but UB would *be* the side effect. And AFAIU
>> according to the C standard it can be UB. This does not merely mean
>> that the result could be nonsensical (which, for the record, would
>> *not* be a bug in C), it means the program can do whatever it wants.
>
> That "can do whatever it wants" is a correct interpretation, but it
> would be insane to deliberately set up a system that launched nuclear
> missiles upon encountering UB.
> ...
Usually, if the UB does anything of note, it is because an attacker is
exploiting the hole in the language semantics to deliberately make the
program do something actively malicious that was never intended by its
author.
It's a common misunderstanding that these kinds of scenarios are
hypothetical. UB just sucks in this way, whether you deliberately want
it to or not.
> I also object to common optimizations that interpret UB as license to
> delete the offending code path.
> ...
Whether it is deliberately interpreted in any way or not, the code has
to do _SOMETHING_ if the UB condition actually occurs, and it's unclear
how you would specify that optimizations are supposed to maintain that
specific behavior when, at a given point in the optimization pipeline,
it may not even be clear what that behavior will end up being. Often
enough, it will end up being an exploitable weakness.
If you want optimizers to preserve the behavior of code that has UB in
it, you have to turn any potential of UB into an optimization blocker.
The optimizer's intermediate representation just does not carry the
final machine semantics for expressions with UB.
It's a fundamental problem of any language design with UB, not some sort
of conspiracy by evil compiler developers.
UB exists because language designers and programmers want/need power
(fast low-level code that also compiles fast) without responsibility
(using formal methods such as advanced type systems).
>
>> As long as it is defined behavior in D, keeping it `@safe` is
>> perfectly fine. But differences to C and C++ may nevertheless trip up
>> some implementations and violate memory safety, as backends were
>> developed with C and C++ in mind.
>>
>> `@safe` has to be consistent with the backend semantics. This means
>> either making certain constructs `@system` or ensuring all backends
>> compile them safely, or having a broken `@safe`.
>
> The backends do not, to my knowledge, have any awareness of @safe or
> @system.
I think this is true yet completely irrelevant to what I was saying.
> I don't see a scenario where not allowing p-q in @safe code
> would have any effect on the backend.
> ...
Not the point at all. I was reacting to Vladimir's statement that:
> I don't think so. An expression which calculates a `size_t` (or
> `ptrdiff_t`) value without side effects is memory-safe.
>
> What you do with the index (valid or not) would be scrutinized by the
> usual rules.
My point was that:
a) There does not actually seem to be any explicit documentation in the
D spec about pointer subtraction. If there is, I have not found it.
b) In some popular languages, `p-q` is UB if `p` and `q` point to
different memory objects.
c) It is hence possible that some D backends treat this expression as
UB when, according to your intention, they should not.
d) This scenario is not implausible; I think it already happens for
null pointer dereferences: code that the frontend says is `@safe` is
treated as UB by some of the backends.
>
>> UB is mostly a glue/backend thing, it's about what the code *means*,
>> not about how it is type checked by the frontend. And backends are
>> often biased towards C and C++ semantics.
>>
>> There are other cases, e.g., with DMD a null dereference may be a
>> guaranteed segfault, but I think it's likely UB with GDC and LDC.
>>
>> It seems LDC even has the flag `-fno-delete-null-pointer-checks` to
>> turn off UB on null pointer dereference, which would indeed indicate
>> it is UB by default.
>
> DMD's optimizer can detect null pointer dereferences
The flag's name is somewhat of a misnomer; you might have to actually
look into its documentation.
> as a result of copy propagation, etc., and always gives a compile time error when it does.
Sure, when you can prove that a piece of code is always wrong to
execute, you can do that (and I think it's a good idea). Often,
however, you can't.
> Otherwise, it just dereferences the null pointer and whatever the CPU
> does with it happens.
> ...
And hence you are now stuck treating pointer dereferences as a
side-effecting operation. Some backends don't like doing that.
Another issue is that some targets will not trap at all and just treat 0
as a valid memory address. (Less relevant for DMD's supported targets.)
> What the proposal in this thread is about is extending the @safe
> semantics to not just be about memory safety, but about checking for
> common bugs where rewriting the code slightly to avoid it is practical.
>
I understand, but there are more than two positions here.
Your position: `p-q` is memory safe yet might be error-prone, and we
might want to start banning error-prone constructs in `@safe` code even
though it was originally meant to be strictly about memory safety.
Vladimir's position: `p-q` is memory safe, hence there is no need to
reject it in `@safe` code.
My position: Wait, is `p-q` even _currently implemented_ in a memory
safe way? Where is it documented? What are the backends doing? There are
already cases where `@safe` code is treated as UB by some backends and
`p-q` might be among these cases.