Should (p - q) be disallowed in @safe code?
Walter Bright
newshound2 at digitalmars.com
Sat Jan 3 02:29:44 UTC 2026
The bottom line here is: why are we arguing about this? Haven't we agreed that
p-q should be disallowed in @safe code? The rest of this message you can ignore
if you like.
---------------------
On 1/2/2026 4:46 PM, Timon Gehr wrote:
> This is not about hardware.
Good, we can move on from that issue!
> The optimizer does not care to explicitly trash your memory on `p-q`, it's just
> a side effect of completely disregarding the case where `p` and `q` are unrelated.
```
int i, j;
int *p = &i;            // p and q point to two unrelated objects
int *q = &j;
ptrdiff_t x = p - q;    // undefined behavior: the operands are not
                        // pointers into the same array object
```
The compiler can detect that p-q would be undefined behavior here. A sane
compiler would issue an error message upon such detection. Note that the C11
spec says that failing to meet a "shall" requirement means undefined behavior.
Taken literally, that means any syntax or semantic error in your code can
legitimately cause the compiler to generate undefined behavior. But not a sane
compiler.
And yes, I oppose optimizers that detect UB and just delete it. That's a
disservice to the users, who find out the hard way about this behavior, rather
than getting a useful error message.
If the compiler does not detect that error (which will be most cases), then it
will do the reasonable thing and just subtract the two numbers, which will not
cause memory corruption on any mainstream CPU.
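As a rough sketch (my illustration, not a quote from the thread) of what "just
subtract the two numbers" amounts to, the same subtraction can be written on
the raw addresses via integer casts, where the conversion is merely
implementation-defined and the arithmetic itself is plain integer math:
```
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int i, j;
    int *p = &i;
    int *q = &j;

    /* Casting to intptr_t and subtracting is ordinary integer
       arithmetic: no memory is read or written, and the result is
       just the numeric distance between the two addresses. */
    intptr_t d = (intptr_t)p - (intptr_t)q;
    printf("difference in bytes: %lld\n", (long long)d);
    return 0;
}
```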
> `p-q` in a C program can _realistically_ corrupt memory even if the CPU will
> never corrupt memory when subtracting addresses.
> This is not just a theoretical problem, UB is UB and it has caused problems in
> practice.
I know, but I haven't seen an example of it for `p-q`. It would be interesting
if you could devise one! The UB problems I've seen were for other constructions.
> You absolutely can make a more efficient CPU by adding UB to it that can cause
> it to destroy itself or corrupt other components of the system if you run the
> wrong program.
I don't know if that is possible for `p-q`. It's just a subtraction. It may very
well be possible for other UBs.
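For the well-defined case, where both pointers refer to elements of the same
array, here is a minimal sketch (again, just an illustration) of what the
subtraction is: the byte difference scaled by the element size, which on
typical targets compiles to a subtract and a shift, with no memory access at
all:
```
#include <stdio.h>

int main(void)
{
    int a[10];
    int *p = &a[7];
    int *q = &a[2];

    /* Well-defined pointer subtraction: both pointers refer to
       elements of the same array. The result is the element distance,
       i.e. the byte difference divided by sizeof(int). */
    printf("p - q = %td\n", p - q);   /* prints 5 */
    return 0;
}
```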
> Professionals just indeed don't do that, because for some reason
> hardware reliability is taken seriously while software reliability is not.
The reason is pretty simple. Remember the disaster with the Intel Pentium
floating point bug? Wow was that expensive! I bore some of that cost because I
had to add workarounds to the code generator. Software updates are a lot cheaper
than having to pry out everyone's CPU chip and replace it, and even so,
compilers had to assume they were running on a bad CPU.
> CPUs come with manufacturer warranties, software comes with EULAs that read
> "ABSOLUTELY NO WARRANTY OF FITNESS FOR ANY PARTICULAR PURPOSE".
The software industry would cease to exist without that clause.
> CPU manufacturers are using formal methods to verify their designs.
Formal methods have bugs, too, though I agree that they are highly useful. I
know how to set up data flow analysis (DFA) and the like and get it right, but
I can't say I have expertise in formal methods. For example, I don't know how
to prove that DFA converges to a solution, though I know it does, because the
paper I learned it from says they proved it :-) and I have never found it to be
otherwise. Full disclosure: I have no formal education in computer science,
which you have surely inferred by now!
> The CPU does not have a concept of "memory object" or "different memory
> objects". It usually does not even distinguish addresses from other machine-word
> integers.
It does with the segmented memory system of the IBM PC, and the banked memory
card add-ons. I wrote a software virtual memory system using banked memory and
segment registers. You didn't really want to use an offset larger than the
memory allocated to that segment!
But those designs are all obsolete now and irrelevant.
>> Unfortunately, the C spec didn't nail down certain behaviors, and so we have
>> different behaviors.
> This is analogous to implementation-defined behavior, not undefined behavior.
> The C spec has undefined behavior, it is not saying "do what the CPU does", it
> is saying "do whatever is expedient, e.g. so to make the program run fast".
The commercial reality is that, starting in the 80s, CPU designs changed to be
very friendly to actual C behavior. The C spec doesn't say anything about
expedience or speed (that I recall).