Memory safe in D

Thu Mar 14 18:31:38 UTC 2024

On Wednesday, 13 March 2024 at 06:05:35 UTC, Walter Bright wrote:
> In general, we cannot guarantee that assert()'s won't trip at 
> runtime, nor buffer overflow exceptions. All we can do is say 
> we stop the program when it happens.

An assert is explicit in the source code, so it's visible to the 
programmer. A null dereference can easily happen without anything 
in the code to suggest that the program might abort. This is 
because a pointer/reference is quite often never null in a 
particular scope (e.g. function parameters), so programmers stop 
worrying about null, and sometimes they also stop worrying in 
places where the pointer actually can be null. What is needed is:

1. A way of distinguishing between never-null pointers and 
sometimes-null pointers.
2. A syntactic, opt-in way of force-unwrapping a nullable 
pointer. This can even be a no-op so you still get the hardware 
checking efficiency.
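
A rough library-level sketch of those two points in today's D. The names `NonNull`, `Maybe` and `unwrap` are made up for illustration; they are not existing language or Phobos features, and a real design would presumably live in the type system rather than in wrapper structs:

```d
import std.stdio : writeln;

class A
{
    int bar() { return 42; }
}

/// Hypothetical wrapper: a class reference that is never null.
struct NonNull(T) if (is(T == class))
{
    private T payload;

    @disable this();   // no default (null) state, so it must be initialized

    this(T value)
    {
        // Compiled out in release builds, leaving only the hardware check.
        assert(value !is null, "NonNull constructed from null");
        payload = value;
    }

    T get() { return payload; }
    alias get this;    // forward member access to the wrapped object
}

/// Hypothetical wrapper: a class reference that may be null.
struct Maybe(T) if (is(T == class))
{
    private T payload; // null means "absent"

    this(T value) { payload = value; }

    bool isNull() const { return payload is null; }

    /// The explicit opt-in: the only way to get at the object.
    NonNull!T unwrap() { return NonNull!T(payload); }
}

// A NonNull must be initialized at its declaration.
static assert(!__traits(compiles, { NonNull!A a; }));
// A Maybe cannot be used as if it were the object itself.
static assert(!__traits(compiles, { Maybe!A m; m.bar(); }));

int useA(NonNull!A a)
{
    return a.bar();    // the callee never has to think about null
}

void main()
{
    auto present = Maybe!A(new A);
    Maybe!A absent;    // default state is null

    writeln(useA(present.unwrap));
    assert(absent.isNull);
}
```

The visible `.unwrap` is the point: the possible abort now appears in the source, just like an assert does.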

> Actual memory corruption is infinitely worse than promptly 
> stopping the program when it detects an internal bug.

Absolutely, though most modern languages seem to have some 
support for null safety too.

> Consider the following:
> ```
> class A { void bar(); }
>
> void foo(int i) {
>     A a;
>     if (i) a = new A();
>     ...
>     if (i) a.bar();
> }
> ```
> What happens if we apply data flow analysis to determine the 
> state of `a` when it calls `bar()`? It will determine that `a` 
> has the possible values (`null`, `new A()`). Hence, it will give 
> an error that `a` is possibly null at that point.

I don't think you need DFA, at least as I understand it:

>    if (i) a = new A();

If `a` is non-nullable, this line could be made an error because 
there is no `else` branch that also initializes `a`. This is what 
cppfront does. Non-nullable types need initialization.

>    if (i) a.bar();

If `a` is nullable, this line is an error because you are calling 
a method that needs an A when (from the DFA-less compiler's point 
of view) you might only have a null pointer.

To handle that, you can do one of the following:
* `assert(a)` before `a.bar()`. The compiler then assumes `a` is 
not null, and in release mode there is only a hardware null check.
* Call a function to force-unwrap the nullable type to a non-null 
type, e.g. `a.unwrap.bar()`. This function can be a no-op in 
terms of hardware, but having to call it makes the programmer 
aware that a null dereference (at least from the compiler's POV) 
may occur.
* Rewrite the if statement to `if (a)`. That is actually better 
code, because otherwise the reader has to check that `i` hasn't 
changed between the two if statements, which might be far apart, 
to understand the code.
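
Here is how the three options look side by side, written against plain class references in current D, so the compiler does not enforce any of it. The free function `unwrap` is a hypothetical helper, not something in Phobos; it only marks the spot where a nullable reference is treated as non-null:

```d
class A
{
    int bar() { return 1; }
}

/// Hypothetical helper: makes the "I claim this is not null" step visible.
T unwrap(T)(T value) if (is(T == class))
{
    assert(value !is null, "unwrap called on null");
    return value;
}

int foo(int i)
{
    A a;                      // class references default to null in D
    if (i) a = new A();

    int total = 0;

    // Option 1: state the assumption with an assert.
    if (i)
    {
        assert(a !is null);
        total += a.bar();
    }

    // Option 2: force-unwrap at the call site.
    if (i) total += a.unwrap.bar();

    // Option 3: test the reference itself instead of `i`.
    if (a) total += a.bar();

    return total;
}

void main()
{
    assert(foo(1) == 3);      // all three calls run
    assert(foo(0) == 0);      // `a` stays null and is never dereferenced
}
```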

I hope you agree that at least some of this is workable and 
beneficial.
Now, as regards D, I'm not sure of the best way to add 
non-nullable types to the language. Even editions may not be 
enough; perhaps it would take a compiler switch to enable null 
checking.

Non-nullable types are also very useful for API documentation. 
It's common for docs to forget to say "don't pass null", so the 
user has to check the source code, if available. The API can't be 
misused that way when the type system actually carries the 
information about whether the pointer actually exists or not 
(info that should be absolutely fundamental to a good static type 
system).