Why does nobody seem to think that `null` is a serious problem in D?

Steven Schveighoffer schveiguy at gmail.com
Thu Nov 22 15:19:49 UTC 2018


On 11/20/18 6:14 PM, Johan Engelen wrote:
> On Tuesday, 20 November 2018 at 19:11:46 UTC, Steven Schveighoffer wrote:
>> On 11/20/18 1:04 PM, Johan Engelen wrote:
>>>
>>> D does not make dereferencing on class objects explicit, which makes 
>>> it harder to see where the dereference is happening.
>>
>> Again, the terms are confusing. You just said the dereference happens 
>> at a.foo(), right? I would consider the dereference to happen when the 
>> object's data is used. i.e. when you read or write what the pointer 
>> points at.
> 
> But `a.foo()` is already using the object's data: it is accessing a 
> function of the object and calling it. Whether it is a virtual function 
> or a final function shouldn't matter. There are different ways of 
> implementing class function calls, but people here often seem to pin 
> things down to one specific way. I feel I stand alone in the D community 
> in treating the language in this abstract sense (as C and C++ do; I 
> don't know about other languages). It's similar to how people think that 
> local variables and the function return address are put on a stack, even 
> though that is just an implementation detail that is free to be changed 
> (and often does change: local variables are regularly _not_ stored on 
> the stack [*]).
> 
> Optimization isn't allowed to change the behavior of a program, yet even 
> simple dead-code elimination would do so if a null dereference were not 
> treated as UB or guarded by a null check. Here is an example of code 
> that also does what you call a "dereference" (it reads an object data 
> member):
> ```
> class A {
>      int i;
>      final void foo() {
>          int a = i; // no crash with -O
>      }
> }
> 
> void main() {
>      A a;
>      a.foo();  // dereference happens
> }
> ```

I get what you are saying. But in terms of memory safety *both results* 
are safe. The one where the code is eliminated is safe, and the one 
where the segfault happens is safe.

This is a tricky area, because D depends on a hardware feature for 
language correctness. In other words, it's perfectly possible for a null 
read or write to not result in a segfault, which would make D's 
allowance of dereferencing a null object without a null check actually 
unsafe (at that point it's just another dangling pointer).
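
To make that concrete: the MMU only traps accesses to the pages the OS 
leaves unmapped near address zero, so a field at a large enough offset 
from a null base can land in mapped memory. Here is a minimal sketch of 
that hazard (the class, field names, and sizes are made up for 
illustration; whether it actually faults depends on the platform's 
memory layout):

```
class Big
{
    // ~1 MB of padding pushes `x` far past the object base, so a null
    // `this` plus x's field offset may point into mapped memory.
    ubyte[1_048_576] pad;
    int x;
}

void main()
{
    Big b;    // class references default to null
    b.x = 42; // null + ~1 MB offset: may scribble on memory
              // instead of segfaulting
}
```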

In terms of language semantics, I don't know what the right answer is. 
If we want to say that any code whose behavior an optimizer can change 
must be UB, then this would have to be UB.

But I would prefer saying something like: if a segfault occurs and the 
program continues, the system is in UB-land; otherwise, it's fine. If 
this means an optimized program runs while a non-optimized one crashes, 
then that's what it means. I'd be OK with that result. It's like 
Schrödinger's segfault!

I don't know what it means in terms of compiler assumptions, so that's 
where my ignorance will likely get me in trouble :)
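
To illustrate the kind of assumption I mean (a sketch only, reusing 
Johan's class A; this is how C and C++ optimizers reason about UB, and 
I'm assuming a D optimizer could do the same if null dereference were 
declared UB):

```
void bar(A a)
{
    int x = a.i;   // if reading through null is UB, the compiler may
                   // conclude from this line that `a` is not null...
    if (a is null) // ...and delete this entire branch as dead code
        assert(0, "a was null");
}
```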

> These discussions are hard to do on a mailinglist, so I'll stop here. 
> Until next time at DConf, I suppose... ;-)

Maybe that is a good time to discuss it and learn how things work. But 
clearly people would like to at least have a say here.

I still feel like using the hardware to deal with null access is OK, and 
a hard crash is the best result for something that would otherwise 
clearly be UB.

-Steve

