Contracts, Undefined Behavior, and Defensive,Programming

Sat Jun 13 15:38:03 UTC 2020

Am Sat, 13 Jun 2020 14:41:19 +0000 schrieb Jesse Phillips:

> On Saturday, 13 June 2020 at 08:52:18 UTC, Johannes Pfau wrote:
>>
>>
>> So using definsive programming we turned soft UB into hard UB which
>> will probably crash the program. It could also cause memory corruption,
>> run into an infinite loop, .... And please note that all this even
>> happens in a @safe function.
>>
>> So because of this, I think it's a really bad idea to conflate assume
>> and assert meanings.
> 
> The article talks about this. It states that soft undefined behavior can
> lead to hard UB.
> 
> I don't think it tries to argue this is OK. 

I think you're refering to the strlen example on page 4? It is true that 
soft UB **may** lead to hard UB in general. If this is caused by 
application code, as in that example, it is something @safe has to (and 
can) take care of.

In application code, soft UB may be limited to local effects. Consider an 
online banking application showing the current date somewhere: If some 
code manages to assign 25 to the hours part of the date, that certainly 
violates some semantic guarantees. But if the date is only shown to the 
user, the worst thing that'll happen is that the user may see an 
incorrect date. And if all code is @safe, we're still guaranteed not to 
get memory corruption (hard UB).

But now if I add an assert(day <= 24) somewhere with best defensive 
programming intentions, the compiler may turn that into hard UB now. And 
hard UB could even include memory corruption setting your account balance 
to a negative value.

As assert "[...] checks (along with any action taken should a check fail) 
are, however,never a part of a function’s contract, cannot be relied upon 
by the clients..." they subvert @safe checking when used as assumes:

An example:

void a(int n) @safe
{
    assert (n > 0);
}

void b() @safe
{
    a(-1);
}

The compiler can not prevent the a(-1) call in b, as it does not know the 
semantic invariant demanded by a(). However, in a(), the assert (n > 0) 
get's ransformed to assume(n > 0) in release mode, which will lead to 
safe code causing hard UB.

Without assume, points where soft UB can turn into hard UB should all be 
covered by @safe (at least for memory corruption).

To summarize and rephrase this: the encapsulation unit of @safe is the 
function boundary. The parameter definitions (including this pointer) and 
return type and their modifiers are the information the compiler uses to 
implement it's safety checks. For all possible combinations of these 
input values, the @safe function may not cause memory corruption.

asserts then narrow the function contract by describing semantic 
limitations and @safe is not able to enforce this, as this information is 
not part of the function signature. This is fine, as @safe will still 
prevent soft UB from escalating to hard UB. But if you now use assert() 
as assume(), you introduce hard UB for some possible input combinations, 
subverting @safe guarantees.

> [...]
I agree with everything else you said ;-)

-- 
Johannes