Contracts, Undefined Behavior, and Defensive,Programming

Sat Jun 13 08:52:18 UTC 2020

Am Fri, 12 Jun 2020 23:46:38 -0400 schrieb Andrei Alexandrescu:

> A short and sweet paper:
> 
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1743r0.pdf
> 
> It echoes Walter's opinions on contracts vs. runtime checking, to a tee.
> He has been derided on occasion in this group for such views. The paper
> is a good rationale from an independent source.

A good article. Although there have sometimes been discussions about it, 
I think the difference between input validation and contract checking is 
now well understood by most people participating in discussions here.

However, this article also gives some vocabulary which might allow us to 
discuss the assert/assume conflict, where there is no uniform position in 
the community, in a clearer and more precise way. Consider this example:

------------------------------------------------------------------------
/**
 * returns (1 + 2 + ... + n) / n 
 */
double sumDiv(ubyte n) @safe
{
    double accum = 0;

    for (size_t i = 0; i <= n; i++)
    {
        accum += i;
        if (i == n)
            return accum / i;
    }
    // DMD requires this to compile the code, cause it does not detect  
    // the above loop covers all possible inputs.
    // GDC and LDC optimizers remove this assert.
    assert(0);
}
------------------------------------------------------------------------

For n = 0, this function causes soft (library) undefined behavior 
according to the article: 0.0/0 is defined to give NaN, but it's not a 
valid result for the algorithm above. Please note that the if block in 
the loop will return for all possible input values. The assert(0) is only 
needed because of compiler limitations, but never executed.

So we apply defensive programming and add this contract:
----------------------------
double sumDiv(ubyte n)
in (n > 0)
----------------------------

This now nicely detects the error in debug mode. However, the D spec says 
"The compiler is free to assume the assert expression is true and optimize 
subsequent code accordingly." and "Undefined  Behavior: The subsequent 
execution of the program after an assert contract is false.": https://
dlang.org/spec/contracts.html
I didn't remember that the spec was that upfront about this. Let's see 
how this affects our example:

* If the compiler interprets in(n > 0) as assume(n > 0), this means that 
the first iteration of the for loop with i = 0 has no effect: The if 
condition is always false (n != 0) and the accum += 0 has no effect.
* So it may rewrite the loop bounds to start at i = 1
* If this optimized function is called with n = 0, the loop body and the 
return is never executed
* Remember that the assert(0) at the end could already be removed by the 
optimizer
* I guess all bets are off what happens now

So using definsive programming we turned soft UB into hard UB which will 
probably crash the program. It could also cause memory corruption, run 
into an infinite loop, .... And please note that all this even happens in 
a @safe function.

So because of this, I think it's a really bad idea to conflate assume and 
assert meanings.

-- 
Johannes