@trusted assumptions about @safe code

Wed May 27 12:48:46 UTC 2020

On 5/27/20 2:36 AM, ag0aep6g wrote:
> On 27.05.20 02:50, Paul Backus wrote:
>> On Tuesday, 26 May 2020 at 22:52:09 UTC, ag0aep6g wrote:
>>> But yeah, the @trusted function might rely on the @safe function 
>>> returning 42. And when it suddenly returns 43, all hell breaks loose. 
>>> There doesn't need to be any monkey business with unsafe aliasing or 
>>> such. Just an @safe function returning an unexpected value.
>>>
>>> I suppose the only ways to catch that kind of thing would be to 
>>> forbid calling @safe (and other @trusted?) functions from @trusted 
>>> (and @system?) code, or to mandate that the exact behavior of @safe 
>>> functions (including their return values) cannot be relied upon for 
>>> safety. Those would be really, really tough sells.
>>
>> All that's necessary is to have the @trusted function check that the 
>> assumption it's relying on is actually true:
>>
>> @safe int foo() { ... }
>>
>> @trusted void bar() {
>>      int fooResult = foo();
>>      assert(fooResult == 42);
>>      // proceed accordingly
>> }
>>
>> If the assumption is violated, the program will crash at runtime 
>> rather than potentially corrupt memory.
> 
> I.e., "the exact behavior of @safe functions (including their return 
> values) cannot be relied upon for safety". I think it's going to be hard 
> selling that to users. Especially, because there is no such requirement 
> when calling @system functions.
> 
> Say you have this code:
> 
>      void f() @trusted
>      {
>          import core.stdc.string: strlen;
>          import std.stdio: writeln;
>          char[5] buf = "foo\0\0";
>          char last_char = buf.ptr[strlen(buf.ptr) - 1];
>          writeln(last_char);
>      }
> 
> That's ok, right? `f` doesn't have to verify that `strlen` returns a 
> value that is in bounds. It's allowed to assume that `strlen` counts 
> until the first null byte.
> 
> Now you realize that you can calculate the length of the string more 
> safely than C's strlen does, so you change the code to:
> 
>      size_t my_strlen(ref char[5] buf) @safe
>      {
>          foreach (i; 0 .. buf.length) if (buf[i] == '\0') return i;
>          return buf.length;
>      }
>      void f() @trusted
>      {
>          import std.stdio: writeln;
>          char[5] buf = "foo\0\0";
>          char last_char = buf.ptr[my_strlen(buf) - 1];
>          writeln(last_char);
>      }
> 
> Nice. Now you're safe even if you forget to put a null-terminator into 
> the buffer.
> 
> But oh no, `my_strlen` is @safe. That means `f` cannot assume that the 
> returned value is in bounds. It now has to verify that. Somehow, it's 
> harder to call the @safe function correctly than the @system one.

I don't think this is the right way to view it. @safe code should still 
do what it's supposed to do. It's not any harder *or any easier* to call 
@safe code.

You are going to have to trust that whatever functions you call 
(@trusted, @safe, or @system) are following their spec. @safe has 
additional restrictions, so you can assume more. But the semantic 
meaning of things cannot be checked by the compiler, and those are the 
interesting things that cause bugs.
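
As a rough sketch of what I mean (typoStrlen is a made-up variant of the 
my_strlen above, with a one-character typo that I'm assuming purely for 
illustration): this compiles as @safe, because nothing in it can corrupt 
memory, yet it does not do what its name promises.

```d
// Hypothetical variant of my_strlen: the author meant '\0' (NUL) but
// typed '0' (the digit). The compiler has no way to flag this.
size_t typoStrlen(ref char[5] buf) @safe
{
    foreach (i; 0 .. buf.length)
        if (buf[i] == '0')   // semantic bug: compares against the digit zero
            return i;
    return buf.length;       // reached for "foo\0\0", which contains no '0'
}
```

For a buffer holding "foo\0\0" this returns 5 where the spec says 3. 
Whether that turns into a memory-safety problem depends entirely on what 
the @trusted caller does with the number.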

A more realistic example, and one that gets me all the time, is something 
like indexOf. Does it return input.length or -1 if the item isn't found? 
Using the wrong expectation can lead to bad consequences. So it's 
important, no matter the safety of the indexOf function, to know what 
it's supposed to do, and base your review of @trusted code on that 
knowledge.

Of course, with something like that, one could be extra cautious, and 
assert the value is within bounds if it's not the sentinel. You could be 
even more cautious and check that the index found has the sought-after 
element. And that's probably the right defensive way to do this. But 
who's going to do that? Most people will work under the assumption that 
indexOf does what it says it's going to do, and not worry about 
unit testing it on every call.
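
For what it's worth, the extra-cautious version might look roughly like 
the sketch below. charAfter is a hypothetical helper I'm making up for 
illustration, and I'm assuming std.string.indexOf with its documented -1 
"not found" result and an ASCII search character. The checks use 
assert(0) rather than a plain assert so they are not compiled out with 
-release.

```d
import std.string : indexOf;

// Hypothetical helper, not from the thread: returns the character right
// after the first occurrence of `c` in `s`, or '\0' if there is none.
// Assumes `c` is ASCII, so a single code unit in `s` corresponds to `c`.
char charAfter(const(char)[] s, char c) @trusted
{
    // Documented behavior of std.string.indexOf: -1 when `c` is not found.
    immutable idx = s.indexOf(c);
    if (idx < 0)
        return '\0';

    // Extra caution: the index must be in bounds and must point at `c`.
    // assert(0) is kept even in -release builds.
    immutable pos = cast(size_t) idx;
    if (pos >= s.length || s.ptr[pos] != c)
        assert(0, "indexOf did not behave as documented");

    // `c` was the last character, so nothing follows it.
    if (pos + 1 >= s.length)
        return '\0';

    // Unchecked read, justified by the tests above.
    return s.ptr[pos + 1];
}
```

With this, charAfter("key=value", '=') gives 'v', and both a missing 
character and a match at the very end give '\0'. That is three checks 
wrapped around a single read, which is exactly why most people won't 
bother.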

-Steve