Should (p - q) be disallowed in @safe code?

Timon Gehr timon.gehr at gmx.ch
Sat Jan 3 00:46:07 UTC 2026


On 1/3/26 01:00, Walter Bright wrote:
> On 1/2/2026 2:54 PM, Timon Gehr wrote:
>> On 1/2/26 22:03, Walter Bright wrote:
>>> It seems we are in full agreement that p-q should be disallowed in 
>>> @safe code, which is my proposal here.
>>> ...
>>
>> I am happy with each of these two outcomes:
>> 1. `p-q` is `@safe`, implementation-defined.
>> 2. `p-q` can be UB, must be `@system`.
>>
>> So, works for me.
>>
>>> BTW, p-q is not a memory safety issue.
>>
>> Any type of UB is a memory safety issue.
> 
> You are obviously correct.

Underlying this admission is your utterly wrong claim, namely that it is 
a theoretical issue without practical significance.

> But using known computers, it is not a memory safety measure.

What "known computers" are doing at the machine level is only part of 
the puzzle. You can't just ignore "known compilers".

This is not about hardware.

> I don't see any reason anyone would implement p-q such 
> that it trashes memory or sets the CPU on fire.

Compiler passes just do what they do, working from assumptions such as: 
if you see `p-q`, then `p` and `q` point into the same memory object.

Garbage in, garbage out. Wrong assumptions entering optimizers can and 
do cause befuddling miscompilation.

The optimizer does not set out to trash your memory on `p-q`; that is 
just a side effect of completely disregarding the case where `p` and `q` 
are unrelated.

> Maybe what actually happens should be documented, to make it "implementation defined", but 
> I'm not in a position to authoritatively document what CPUs do.
> ...

UB does not care about what CPUs do. Even saying "it will do whatever 
the CPU does in such-and-such a situation" is much, much safer than 
saying "this is UB". However, most backends made for C will not be able 
to implement these semantics while still performing optimizations.
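
To make the "implementation-defined" option concrete, here is a minimal 
C sketch of an address difference that does not go through `p-q` at all. 
`address_distance` is a hypothetical helper of my own naming, not 
something from any library or from DMD. It assumes the platform provides 
`uintptr_t`/`intptr_t`, and the numeric result is only meaningful on 
targets with a flat address space:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: distance in bytes between two addresses.
 * The pointer-to-integer conversions are implementation-defined in C,
 * not undefined, so this never relies on the "same memory object"
 * precondition that `p - q` carries. */
static intptr_t address_distance(const void *p, const void *q)
{
    assert(p != NULL && q != NULL);

    uintptr_t a = (uintptr_t)p;
    uintptr_t b = (uintptr_t)q;

    /* Unsigned subtraction wraps and is fully defined. Converting an
     * out-of-range result to intptr_t is implementation-defined. */
    return (intptr_t)(a - b);
}
```

For two elements of the same array this gives the byte distance you 
would expect; for unrelated objects it gives some 
implementation-specific number, but the operation itself has no 
precondition the optimizer could exploit.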

> Dereferencing random pointers, on the other hand, can realistically 
> corrupt memory. This is why pointer arithmetic is not allowed in @safe 
> code.
> ...

`p-q` in a C program can _realistically_ corrupt memory even if the CPU 
will never corrupt memory when subtracting addresses.

This is not just a theoretical problem. UB is UB, and it has caused 
problems in practice.

>> Any assumption that any type of UB is benign must rely on additional 
>> information about specific backends. So what you claim may be true 
>> with DMD, but that is about the extent of it.
> 
> I can't see a professionally designed CPU catching fire or corrupting 
> memory by subtracting two unrelated pointers. One would have to add more 
> transistors to make that happen.

You absolutely can make a more efficient CPU by adding UB to it that can 
cause it to destroy itself or corrupt other components of the system if 
you run the wrong program. Professionals simply don't do that, because 
for some reason hardware reliability is taken seriously while software 
reliability is not.

CPUs come with manufacturer warranties, software comes with EULAs that 
read "ABSOLUTELY NO WARRANTY OF FITNESS FOR ANY PARTICULAR PURPOSE".

CPU manufacturers are using formal methods to verify their designs.

> Nobody would buy such a machine.
> ...

This is not about the CPU, it's about compilers.

> Current CPUs are what they are. We live with that, and we trade off 
> performance for some level of unpredictable failure.
> 
>> I would expect a CPU to just do `i<<(j&31)`.
> 
> The X86_64 and Aarch64 give different results, I ran into that bug.
> 
>> The C abstract machine is however not the CPU.
> 
> CPU design has very much followed C semantics since the 80s. 

The CPU does not have a concept of "memory object" or "different memory 
objects". It usually does not even distinguish addresses from other 
machine-word integers.

> Unfortunately, the C spec didn't nail down certain behaviors, and so we 
> have different behaviors.
> 

This is analogous to implementation-defined behavior, not undefined 
behavior. The C spec has undefined behavior. It is not saying "do what 
the CPU does"; it is saying "do whatever is expedient, e.g. to make the 
program run fast".
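
For the shift case quoted above, a small C sketch of what pinning the 
behavior down looks like. `shl_mod32` is a hypothetical name of mine. It 
fixes the result for every shift count by masking explicitly, which is 
the `i<<(j&31)` model, instead of leaving counts of 32 and above 
undefined as plain `i << j` does in C:

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit left shift with the count reduced
 * modulo 32. Every (i, j) pair has exactly one result, on every
 * target, so there is nothing left for the optimizer to assume. */
static uint32_t shl_mod32(uint32_t i, uint32_t j)
{
    return (uint32_t)(i << (j & 31u));
}
```

With this, `shl_mod32(1, 33)` is 2 and `shl_mod32(1, 32)` is 1 
everywhere. Whether that is the result one wants is a separate question; 
the point is only that it is specified.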

