Should (p - q) be disallowed in @safe code?

Timon Gehr timon.gehr at gmx.ch
Fri Jan 2 02:00:29 UTC 2026


On 1/1/26 22:53, Walter Bright wrote:
> On 1/1/2026 12:47 PM, Timon Gehr wrote:
>> Well, that was the point. Vladimir had said that pointer subtraction 
>> is free of side effects, but UB would *be* the side effect. And AFAIU 
>> according to the C standard it can be UB. This does not merely mean 
>> that the result could be nonsensical (which, for the record, would 
>> *not* be a bug in C), it means the program can do whatever it wants.
> 
> That "can do whatever it wants" is a correct interpretation, but it 
> would be insane to deliberately set up a system that launched nuclear 
> missiles upon encountering UB.
> ...

Usually, if the UB does anything of note, it is because an attacker is 
exploiting the hole in the language semantics to deliberately make the 
program do something actively malicious that its author never intended.


It's a common misunderstanding that these kinds of scenarios are 
hypothetical. UB just sucks in this way, whether you deliberately want 
it to or not.

> I also object to common optimizations that interpret UB as license to 
> delete the offending code path.
> ...

Whether it is deliberately interpreted in any way or not, the code has 
to do _SOMETHING_ if the UB condition actually occurs, and it's unclear 
how you would specify that optimizations must preserve that specific 
behavior when, at a given point in the optimization pipeline, it may 
not even be clear what that behavior will end up being. Often enough, 
it ends up being an exploitable weakness.

If you want optimizers to preserve the behavior of code that has UB in 
it, you have to turn any potential of UB into an optimization blocker. 
The optimizer's intermediate representation just does not carry the 
final machine semantics for expressions with UB.

It's a fundamental problem of any language design with UB, not some sort 
of conspiracy by evil compiler developers.

UB exists because language designers and programmers want/need power 
(fast low-level code that is also fast to write) without responsibility 
(using formal methods such as advanced type systems).

> 
>> As long as it is defined behavior in D, keeping it `@safe` is 
>> perfectly fine. But differences to C and C++ may nevertheless trip up 
>> some implementations and violate memory safety, as backends were 
>> developed with C and C++ in mind.
>>
>> `@safe` has to be consistent with the backend semantics. This means 
>> either making certain constructs `@system` or ensuring all backends 
>> compile them safely, or having a broken `@safe`.
> 
> The backends do not, to my knowledge, have any awareness of @safe or 
> @system.

I think this is true yet completely irrelevant to what I was saying.

> I don't see a scenario where not allowing p-q in @safe code 
> would have any effect on the backend.
> ...

Not the point at all. I was reacting to Vladimir's statement that:

 > I don't think so. An expression which calculates a `size_t` (or 
`ptrdiff_t`) value without side effects is memory-safe.
 >
 > What you do with the index (valid or not) would be scrutinized by the 
usual rules.

My point was that:

a) There does not actually seem to be any explicit documentation in the 
D spec about pointer subtraction. If there is, I have not found it.

b) In some popular languages, `p-q` is UB if `p` and `q` point to 
different memory objects.

c) It's hence possible that some D backends treat this expression as 
UB when, according to your intention, they should not.

d) This scenario is not implausible; I think it already happens for 
null pointer dereferences: code that the frontend accepts as `@safe` 
is treated as UB by some of the backends.

> 
>> UB is mostly a glue/backend thing, it's about what the code *means*, 
>> not about how it is type checked by the frontend. And backends are 
>> often biased towards C and C++ semantics.
>>
>> There are other cases, e.g., with DMD a null dereference may be a 
>> guaranteed segfault, but I think it's likely UB with GDC and LDC.
>>
>> It seems LDC even has the flag `-fno-delete-null-pointer-checks` to 
>> turn off UB on null pointer dereference, which would indeed indicate 
>> it is UB by default.
> 
> DMD's optimizer can detect null pointer dereferences

The flag's name is somewhat of a misnomer; you might have to actually 
look into its documentation.

> as a result of copy propagation, etc., and always gives a compile time error when it does.

Sure, when you can prove that a piece of code is always wrong to 
execute, you can do that (and I think it's a good idea). Often, 
however, you can't.

> Otherwise, it just dereferences the null pointer and whatever the CPU 
> does with it happens.
> ...

And hence you are now stuck treating pointer dereferences as 
side-effecting operations. Some backends don't like doing that.

Another issue is that some targets will not trap at all and just treat 0 
as a valid memory address. (Less relevant for DMD's supported targets.)

> What the proposal in this thread is about is extending the @safe 
> semantics to not just be about memory safety, but about checking for 
> common bugs where rewriting the code slightly to avoid it is practical.
> 

I understand, but there are more than two positions here.

Your position: `p-q` is memory safe, yet it might be error-prone, and 
we might want to start banning error-prone constructs in `@safe` code 
even though `@safe` was originally meant to be strictly about memory 
safety.

Vladimir's position: `p-q` is memory safe, hence there is no need to 
reject it in `@safe` code.

My position: Wait, is `p-q` even _currently implemented_ in a memory 
safe way? Where is it documented? What are the backends doing? There are 
already cases where `@safe` code is treated as UB by some backends and 
`p-q` might be among these cases.
