Contracts, Undefined Behavior, and Defensive,Programming

Sat Jun 13 15:04:45 UTC 2020

On Saturday, 13 June 2020 at 08:52:18 UTC, Johannes Pfau wrote:

> ------------------------------------------------------------------------
> /**
>  * returns (1 + 2 + ... + n) / n
>  */
> double sumDiv(ubyte n) @safe
> {
>     double accum = 0;
>
>     for (size_t i = 0; i <= n; i++)
>     {
>         accum += i;
>         if (i == n)
>             return accum / i;
>     }
>     // DMD requires this to compile the code, cause it does not 
> detect
>     // the above loop covers all possible inputs.
>     // GDC and LDC optimizers remove this assert.
>     assert(0);
> }

> For n = 0, this function causes soft (library) undefined 
> behavior according to the article: 0.0/0 is defined to give 
> NaN, but it's not a valid result for the algorithm above.

According to the algorithm, as presented, NaN is a valid result, 
and this so-called "soft" UB does not exist. It is the 
documentation of `sumDiv` that is incorrect, or rather, 
incomplete - it presupposes `n` being non-zero, but otherwise 
does not make any statement wrt the interface (i.e. usage) of the 
function, so we shall turn to the interface. The function is 
@safe, so it promises to handle all safe inputs, which certainly 
include 0, without causing UB, and only return safe values, which 
NaN is. Optimizers, therefore, must not "optimize" this function 
into causing "hard" UB (such as removing the 0th iteration 
without inserting a `ret`).

If we only saw the declaration, without definition, i.e.:

/**
  * returns (1 + 2 + ... + n) / n
  */
double sumDiv(ubyte n) @safe;

Nowhere is it stated what happens if input is 0. But still, it 
has a safe interface, it is @safe, and so must return a safe 
value, which NaN is. No UB, "soft" or "hard".

If `sumDiv` expects *the callers* to *never* pass 0, it must 
either:

- throw an Exception (which its interface allows)
- throw an Error (and become nothrow)
- profess so via its interface:

> double sumDiv(NonZero!ubyte n) @safe @nogc nothrow;
> //or, if it must never return a NaN:
> NeverNaN!double sumDiv(NonZero!ubyte n) @safe @nogc 
> /+/*perhaps*/nothrow+/;

...given hypothetical `NonZero` and `NeverNaN` types. Then the 
responsibility of checking for 0 is delegated away from the 
function body, and optimizers may indeed @safe-ly remove the 0th 
iteration.

> So we apply defensive programming and add this contract:
> ----------------------------
> double sumDiv(ubyte n)
> in (n > 0)

Dropping @safe? Then yes, there may be UB. But if @safe is to be 
kept, there's a case to be made to error out on that contract as 
being an invalid contract.

There's a "but" though, which concerns that stipulation on 
`assert` which you quoted from the spec. But that "but" has to do 
with semantics of `sumDiv` being available during compilation, 
which is not always possible because compilers cop out to 
"opaque" object files too early (i.e. in case if it's a library 
function, compiled into a separate object file, with only the 
declaration present during compilation). However, since, 
according to Walter, I "misunderstand" what linkers do (nevermind 
the fact that it's very much still about compilation and not 
linking), I'll say no more on that particular subject. Suffice it 
to say that in such a case, indeed, conflating `assert` and 
`assume` is just wrong.