Why does nobody seem to think that `null` is a serious problem in D?

Tue Nov 20 18:04:08 UTC 2018

On Tuesday, 20 November 2018 at 03:38:14 UTC, Jonathan M Davis 
wrote:
>
> For @safe to function properly, dereferencing null _must_ be 
> guaranteed to be memory safe, and for dmd it is, since it will 
> always segfault. Unfortunately, as I understand it, it is 
> currently possible with ldc's optimizer to run into trouble, 
> since it'll do things like see that something must be null and 
> therefore assume that it must never be dereferenced, since it 
> would clearly be wrong to dereference it. And then when the 
> code hits a point where it _does_ try to dereference it, you 
> get undefined behavior. It's something that needs to be fixed 
> in ldc, but based on discussions I had with Johan at dconf this 
> year about the issue, I suspect that the spec is going to have 
> to be updated to be very clear on how dereferencing null has to 
> be handled before the ldc guys do anything about it. As long as 
> the optimizer doesn't get involved everything is fine, but as 
> great as optimizers can be at making code faster, they aren't 
> really written with stuff like @safe in mind.

One big problem is the way people talk and write about this 
issue. There is a difference between "dereferencing" in the 
language and the CPU reading from a memory address.
Confusing language semantics with what the CPU is doing happens 
often in the D community and does not help these debates.

D proclaims that dereferencing `null` must segfault, but none of 
the compilers implement that. It would require inserting a null 
check at every dereference. (This may not be as slow as you might 
think, but it would probably not make code run faster.)
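To make "a null check at every dereference" concrete, here is a rough hand-written sketch of what such a check could amount to. The helper `checked` is hypothetical (it is not a druntime or Phobos function), and it throws an `Exception` only so that the demo can keep running; a check emitted by the compiler would presumably halt the program instead.

```d
import std.stdio;

class A {
    int i;
    final void foo() { i = 5; }
}

// Hypothetical helper: stands in for a check the compiler would have to
// emit itself before every dereference of a class reference.
T checked(T)(T obj, string file = __FILE__, size_t line = __LINE__)
    if (is(T == class))
{
    if (obj is null)
        throw new Exception("null dereference", file, line);
    return obj;
}

void main() {
    A a; // class references default to null
    try {
        checked(a).foo(); // the check fires before foo is entered
    } catch (Exception e) {
        writeln("caught: ", e.msg);
    }

    a = new A;
    checked(a).foo(); // non-null: one compare and branch, then the call
    writeln(a.i);
}
```

The cost per dereference is one comparison and one branch, which is the overhead referred to above.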

An example:
```
class A {
     int i;
     final void foo() {
         import std.stdio; writeln(__LINE__);
         // i = 5;
     }
}

void main() {
     A a;
     a.foo();
}
```

In this case, the actual null dereference happens on the last 
line of main, at the call `a.foo()`. However, since dlang 2.077 
the program runs fine, because `foo` never reads or writes 
through `this`. When `foo` is modified so that it writes to the 
member field `i`, the program does segfault (it writes through 
the null `this`, to an address close to 0).
D does not make dereferencing on class objects explicit, which 
makes it harder to see where the dereference is happening.
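A rough way to picture the difference between the language-level dereference and the CPU-level access is to write the final member function as a free function that takes the object as an ordinary argument. The names `fooLowered` and `fieldOffset` below are made up for illustration, and this is only a sketch of the idea, not the code any compiler actually generates.

```d
import std.stdio;

class A {
    int i;
}

// Hypothetical stand-in for `final void foo()`: `self` plays the role
// of the hidden `this` argument. Nothing here reads or writes through
// `self`, so the CPU never touches the address it holds.
void fooLowered(A self) {
    writeln("self is null: ", self is null);
}

// Offset of the field `i` inside an instance of A. A write to `i`
// through a null reference would target this small address.
size_t fieldOffset() {
    auto obj = new A;
    return obj.i.offsetof;
}

void main() {
    A a;           // null
    fooLowered(a); // only passes a null pointer along, no memory access
    // The hidden __vptr and __monitor fields come first in a D class,
    // so the offset is normally 2 * size_t.sizeof.
    writeln("offset of i: ", fieldOffset());
}
```

Only a variant that actually touches `self.i` would make the CPU access memory near address 0, and that is the point where the segfault shows up.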

So I think that no compiler implementation is spec compliant on 
this point.
I think most people believe that compliance is too costly for the 
kind of software one wants to write in D; the issue is similar to 
array bounds checking, which people explicitly disable or work 
around.
For compliance we would need to change the compiler to emit null 
checks on all @safe dereferences (the opposite direction was 
chosen in 2.077). It'd be interesting to do the experiment.

-Johan