Should (p - q) be disallowed in @safe code?
Timon Gehr
timon.gehr at gmx.ch
Fri Jan 2 19:24:26 UTC 2026
On 1/2/26 18:53, Walter Bright wrote:
> On 1/1/2026 6:11 PM, Timon Gehr wrote:
>> On 1/1/26 23:14, Walter Bright wrote:
>> I understand, but your latest point was "just put `@trusted` on it".
>>
>> Let's say the frontend now treats `p-q` as `@system`, and there is not
>> even any documentation of what its semantics is supposed to be.
>
> Its semantics are subtract q from p and divide by the size of the
> pointed-to type.
> ...
There is some conversion going on here that you did not mention, and in
C the subtraction is sometimes invalid. I understand how to subtract
pointers in e.g. x86 assembly, but the abstract semantics in a
high-level language is a different thing. E.g., there is no such thing
as a "memory object" at the assembly level.
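To make the gap concrete, here is a minimal C sketch (the helper names `elem_diff` and `raw_diff` are made up for illustration, and `int` is an arbitrary element type). The first function is the language-level operation, whose result is converted to the signed type `ptrdiff_t` and which is only defined when both pointers refer into the same array (or one past its end). The second is the "subtract and divide" view on raw addresses, which goes through an implementation-defined pointer-to-integer conversion and knows nothing about memory objects.

```c
#include <stddef.h>
#include <stdint.h>

/* Language-level pointer subtraction. The result type is ptrdiff_t,
   a signed integer type. In C this is only defined if p and q point
   into the same array object (or one past its end); otherwise the
   behavior is undefined. */
static ptrdiff_t elem_diff(const int *p, const int *q)
{
    return p - q;
}

/* "Assembly-level" view: subtract the raw addresses, then divide by
   the element size. The pointer-to-integer conversion is
   implementation-defined. On common flat-address platforms this
   agrees with elem_diff for pointers into one array (assumption). */
static ptrdiff_t raw_diff(const int *p, const int *q)
{
    intptr_t a = (intptr_t)(const void *)p;
    intptr_t b = (intptr_t)(const void *)q;
    return (ptrdiff_t)((a - b) / (intptr_t)sizeof(int));
}
```

For `int arr[8]`, `elem_diff(&arr[5], &arr[2])` is 3, and on typical platforms `raw_diff` gives the same value. The two only part ways, as far as the C standard is concerned, once the pointers come from different objects.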
>
>> Do you believe with this background, alternative backends will in the
>> future be more likely to:
>>
>> - treat `p-q` as UB when different memory objects are involved
>>
>> - treat `p-q` as defined behavior when different memory objects are
>> involved
>
> Let's step back a bit. I expect it to behave as a C backend would.
(What any given C backend does _in practice_ is yet another question.)
But it seems you'd like it to be UB sometimes. Then it must be `@system`.
> More precisely, I have read the C/C++ memory model specification. It is
> very carefully written and well done. I requested a license to copy it to
> use in the D specification, but my request was ignored.
>
> I could rewrite it to an equivalent definition, but that's a lot of work.
> ...
That's not really the point of contention. If you are saying "D pointer
arithmetic semantics is like C", that's a sufficient specification as
far as I am concerned. And then it immediately follows that `p-q` cannot
be allowed in `@safe` code.
> But still, D is going to adhere to it. It works, everyone understands
> it, and the existing backends are carefully tuned to match it.
> ...
Ok.
> All my proposal does is disallow pointer subtraction in @safe code. Code
> generation is not affected in any material way.
> ...
The point of contention is really not whether banning a construct will
affect codegen. The actual dependency is:
type checking <- semantics -> codegen
However, if you allow `p-q` in `@safe` code, assuming logical
consistency, we can infer an intent about semantics that will put
certain restrictions on code generation.
> It's the same thing as disallowing p+=1 in @safe code.
Maybe to you this is the same, but to me `p-q` and `p+=1` are materially
different: one yields an integer, the other one yields a potentially
invalid pointer. It is conceivable _in principle_ to have a language
semantics where `p-q` is defined behavior.
> The memory model does not change.
> ...
That's fine, but to allow `p-q` in `@safe` code with C semantics is
inconsistent with _the definition of `@safe`_. And now you are saying
that this is the _current behavior_. It seems something is broken, and
fixing it is a _design problem_.
There are two different ways to fix it:
- Make cross-memory-object `p-q` implementation-defined (as you claimed
in your OP was already the case), differing from C.
- Make cross-memory-object `p-q` UB (as you are claiming now is already
the case), then ban `p-q` from `@safe` code.
You can't ignore the intended semantics of your programming constructs
when deciding if they can be `@safe`, even if changing the type checker
to consider something `@safe` or not does not have a material effect on
code generation by itself.
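As a rough illustration of what the first option could look like, here is a C sketch (the helper name `addr_diff` is hypothetical, and this is not a proposal for how D should spell it). It computes the distance on integers instead of on pointers. The pointer-to-integer conversions are implementation-defined and the unsigned subtraction wraps, so there is no UB even when the two pointers refer to different memory objects.

```c
#include <stdint.h>

/* Hypothetical helper: byte distance between two arbitrary pointers,
   computed on integers rather than on pointers. The conversions to
   uintptr_t are implementation-defined, and unsigned subtraction
   wraps around instead of being undefined, so this is not UB even
   for pointers into different memory objects. */
static uintptr_t addr_diff(const void *p, const void *q)
{
    return (uintptr_t)((uintptr_t)p - (uintptr_t)q);
}
```

For two pointers into the same array this yields the byte distance on common flat-address platforms (an assumption about the implementation). For unrelated objects it yields some implementation-dependent value, but the program remains well-defined, which is the property `@safe` needs.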
>
>> I just think the overall effect of this will be to cause confusion
>> about what is allowed among all parties involved. I think it's better
>> to stick to banning language constructs from `@safe` if they can
>> actually exhibit UB.
>
> Isn't that what I proposed?
> ...
I am not able to tell, which is the problem. You are saying
contradictory things.
So far, you have made all of these claims:
- cross-memory-object `p-q` is implementation-defined in D
- `p-q` in D is like in C
- cross-memory-object `p-q` is UB in C.
One of these three statements must be false. I think the last one is
correct.
>
>> And yet it seems for `p-q` you differed.
>
> How did I differ? I am confused.
`p-q` is sometimes UB in C and hence not memory safe. You said `p-q` is
memory safe in D. Hence it would have to be different.
There is no such thing as "UB yet memory safe".