Contracts, Undefined Behavior, and Defensive,Programming

Sat Jun 13 15:58:49 UTC 2020

Am Sat, 13 Jun 2020 15:04:45 +0000 schrieb Stanislav Blinov:

> On Saturday, 13 June 2020 at 08:52:18 UTC, Johannes Pfau wrote:
> 
>> 
------------------------------------------------------------------------
>> /**
>>  * returns (1 + 2 + ... + n) / n */
>> double sumDiv(ubyte n) @safe {
>>     double accum = 0;
>>
>>     for (size_t i = 0; i <= n; i++)
>>     {
>>         accum += i;
>>         if (i == n)
>>             return accum / i;
>>     }
>>     // DMD requires this to compile the code, cause it does not
>> detect
>>     // the above loop covers all possible inputs.
>>     // GDC and LDC optimizers remove this assert.
>>     assert(0);
>> }
> 
>> For n = 0, this function causes soft (library) undefined behavior
>> according to the article: 0.0/0 is defined to give NaN, but it's not a
>> valid result for the algorithm above.
> 
> According to the algorithm, as presented, NaN is a valid result, and
> this so-called "soft" UB does not exist.
> It is the documentation of `sumDiv` that is incorrect, or rather

Well, then the documentation could just state that 0 is not a valid 
input. This is just for illustration, the point is that the algorithm 
does not want to make any assumption about the result for n=0, so it 
should not be called with n=0.

> If `sumDiv` expects *the callers* to *never* pass 0, it must either:
> 
> - throw an Exception (which its interface allows)
> - throw an Error (and become nothrow)
> - profess so via its interface:

If you can encode such information in types, it has the benefit of the 
compiler being able to check it statically. But this misses the point, 
the article is about semantic constraints which are not encoded in types.

>> So we apply defensive programming and add this contract:
>> ----------------------------
>> double sumDiv(ubyte n)
>> in (n > 0)
> 
> Dropping @safe? Then yes, there may be UB. But if @safe is to be kept,
> there's a case to be made to error out on that contract as being an
> invalid contract.

Just a copy and paste error. Keep the @safe, it compiles and can cause 
hard UB.

What is your definition for an invalid contract? Even if you explicitly 
did if (n == 0) return; after the assert, the compiler would just remove 
it. The assert is perfectly fine if the algorithm is specified to not be 
defined for n=0. The problem is using assert() like assume() to optimize: 
assume(x) is unsafe when x is a function parameter with no additional 
restrictions. How could the compiler even detect or flag such an 
'invalid' contract?

> 
> There's a "but" though, which concerns that stipulation on `assert`
> which you quoted from the spec. But that "but" has to do with semantics
> of `sumDiv` being available during compilation, which is not always
> possible because compilers cop out to "opaque" object files too early
> (i.e. in case if it's a library function, compiled into a separate
> object file, with only the declaration present during compilation).
> However, since, according to Walter, I "misunderstand" what linkers do
> (nevermind the fact that it's very much still about compilation and not
> linking), I'll say no more on that particular subject. Suffice it to say
> that in such a case, indeed, conflating `assert` and `assume` is just
> wrong.

I'm not sure what you're arguing for? My whole point is that the spec 
("the compiler is free to assume the assert expression is true and 
optimize 
subsequent code accordingly.") is unsound and subverts @safe code. The 
example just shows how that can happen.

It is true that this argument can be generalized to be explained by the 
fact that @safe enforces it's guarantees based on the function interface 
(which is available in .di), whereas semantic contracts (asserts in 
function bodys) may not be available. See my  reply to Jesse for some 
more thoughts about that.

-- 
Johannes