Introducing Nullable Reference Types in C#. Is there hope for D, too?

Timon Gehr timon.gehr at gmx.ch
Sun Nov 19 19:36:09 UTC 2017


On 19.11.2017 05:04, Walter Bright wrote:
> On 11/18/2017 6:25 PM, Timon Gehr wrote:
>> I.e., baseClass should have type Nullable!ClassDeclaration. This does 
>> not in any form imply that ClassDeclaration itself needs to have a 
>> null value.
> 
> Converting back and forth between the two types doesn't sound appealing.
> ...

I can't see the problem. You go from nullable to non-nullable by 
checking for null, and the other direction happens implicitly.
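
To make the two directions concrete, here is a minimal sketch in today's D.
std.typecons.Nullable is real; the NonNull wrapper is something I made up
just for this example, not an existing type or a proposal:

    import std.typecons : Nullable, nullable;

    class ClassDeclaration { string name; this(string n) { name = n; } }

    // Hypothetical wrapper standing in for a non-nullable reference type.
    struct NonNull(T) if (is(T == class))
    {
        private T payload;
        this(T value) { assert(value !is null); payload = value; }
        T get() { return payload; }
        alias get this; // the direction back to a plain reference is implicit
    }

    // Nullable -> non-nullable requires checking for null ...
    NonNull!ClassDeclaration requireBase(Nullable!ClassDeclaration baseClass)
    {
        assert(!baseClass.isNull);
        return NonNull!ClassDeclaration(baseClass.get);
    }

    void main()
    {
        auto c = new ClassDeclaration("Object");
        auto nn = requireBase(nullable(c));
        ClassDeclaration plain = nn; // ... and this direction just works
        assert(plain is c && plain.name == "Object");
    }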

> 
>>> What should the default initializer for a type do?
>> There should be none for non-nullable types.
> 
> I suspect you'd wind up needing to create an "empty" object just to 
> satisfy that requirement. Such as for arrays of objects, or objects with 
> a cyclic graph.
> ...

Again, just use a nullable reference if you need null. The C# language 
change makes the type system strictly more expressive: everything that 
was possible before the change is still possible after it, but the 
language now allows you to document and verify intent better.
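
For example, slots that may legitimately be empty are simply declared
nullable, and no artificial "empty" object is needed. A small sketch in
today's D with std.typecons.Nullable (the Symbol class is just an
illustration):

    import std.typecons : Nullable, nullable;

    class Symbol { string name; this(string n) { name = n; } }

    void main()
    {
        // An array whose slots may legitimately be empty: declare the
        // element type nullable instead of inventing an "empty" Symbol.
        auto table = new Nullable!Symbol[](4);
        table[0] = nullable(new Symbol("x"));

        foreach (slot; table)
            if (!slot.isNull)             // the null check is explicit
                assert(slot.get.name == "x");
    }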

> Interestingly, `int` isn't nullable, and we routinely use rather ugly 
> hacks to fake it being nullable, like reserving a bit pattern like 0, -1 
> or 0xDEADBEEF and calling it INVALID_VALUE, or carrying around some 
> other separate flag that says if it is valid or not. These are often 
> rich sources of bugs.
> 
> As you can guess, I happen to like null, because there are no hidden 
> bugs from pretending it is a valid value - you get an immediate program 
> halt - rather than subtly corrupted results.
> ...

Making null explicit in the type system is compatible with liking null. 
(In fact, it is an endorsement of null; there are other ways a language 
could accommodate optional values.)

> Yes, my own code has produced seg faults from erroneously assuming a 
> value was not null. But it wouldn't have been better with non-nullable 
> types, since the logic error would have been hidden

It was your own design decision to hide the error. This is not something 
that a null-aware type system promotes, and I doubt this is what you 
would be promoting if mainstream type systems had gone that route earlier.

> and may have been 
> much, much harder to recognize and track down.

No, it would have been better: you would have been used to the more 
explicit system from the start, you would have written essentially the 
same code with a few more compiler checks where they apply, and perhaps 
you would have suffered a handful fewer null dereferences. Being able to 
document intent across programmers in a compiler-checked way is also 
useful, even if one manages to remember all the assumptions that hold 
for one's own code. Note that the set of valid assumptions may change as 
the code base evolves.

The point of types is to classify values into categories such that 
values in the same category support the same operations. It is not very 
clean to have a special null value in all those types that does not 
support any of the operations that references are supposed to support. 
Decoupling the two concepts into references and optionality gets rid of 
this issue, cleaning up both concepts.

> I wish there was a null 
> for int types.

AFAIU, C# already has 'int?' (Nullable<int>) for value types.
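
D can express much the same thing today with std.typecons.Nullable, 
instead of reserving -1 or 0xDEADBEEF as a sentinel. A small sketch 
(findIndex is just an example name):

    import std.typecons : Nullable;

    // "No result" is distinct from every valid int; no sentinel needed.
    Nullable!int findIndex(int[] haystack, int needle)
    {
        foreach (i, x; haystack)
            if (x == needle)
                return Nullable!int(cast(int) i);
        return Nullable!int.init;
    }

    void main()
    {
        auto r = findIndex([10, 20, 30], 20);
        assert(!r.isNull && r.get == 1);
        assert(findIndex([10, 20, 30], 99).isNull);
    }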

> At least we sort of have one for char types, 0xFF. And 
> there's that lovely NaN for floating point! Too bad it's woefully 
> underused.
> ...

It can also be pretty annoying; it really depends on the use case. Also, 
this is in direct contradiction with your earlier point: NaNs don't 
usually blow up, they propagate silently instead of halting the program.

> 
>>> I found this out when testing my DFA (data flow analysis) algorithms.
>>>
>>>    void test(int i) {
>>>      int* p = null;
>>>      if (i) p = &i;
>>>      ...
>>>      if (i) *p = 3;
>>>      ...
>>>    }
>>>
>>> Note that the code is correct, but DFA says the *p could be a null 
>>> dereference. (Replace (i) with any complex condition that there's no 
>>> way the DFA can prove always produces the same result the second time.)
>>
>> Yes, there is a way. Put in an assertion. Of course, at that point you 
>> are giving up, but this is not the common case.
> 
> An assertion can work, but doesn't it seem odd to require adding a 
> runtime check in order to get the code to compile?
> ...

Not really. The runtime check is otherwise just implicit in every 
pointer dereference (though here there happens to be hardware support 
for that check).
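
Concretely, for the reduced example, the assertion would look like this 
(a sketch; the null-aware checker that would consume it is of course 
hypothetical in D today, so it is just an ordinary runtime assertion):

    void test(int i)
    {
        int* p = null;
        if (i) p = &i;
        // ...
        if (i)
        {
            assert(p !is null); // explicit evidence for the analysis
            *p = 3;
        }
        // ...
    }

    void main() { test(1); test(0); }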

> (This is subtly different from the current use of assert(0) to flag 
> unreachable code.)
> ...

It's adding a runtime check in order to get the code to compile. ;)

> 
>> Also, you can often just write the code in a way that the DFA will 
>> understand. We are doing this all the time in statically-typed 
>> programming languages.
> 
> I didn't invent this case. I found it in real code; it happens often 
> enough. The cases are usually much more complex, I just posted the 
> simplest reduction. I was not in a position to tell the customer to 
> restructure his code, though :-)
			
I don't doubt that this happens. I'm just saying that often enough it 
does not. (Especially if the check is in the compiler.)
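
For the record, one way to restructure the reduced example so that even 
a simple analysis can follow it is to test the pointer itself the second 
time (a sketch, of course; the real cases you mention may not allow 
this):

    void test(int i)
    {
        int* p = null;
        if (i) p = &i;
        // ...
        if (p !is null) *p = 3; // the guard now mentions p directly
        // ...
    }

    void main() { test(1); test(0); }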

I'm not fighting for explicit nullable in D by the way. I'm mostly 
trying to dispel wrong notions of what it is.

