Null references (oh no, not again!)

Wed Mar 4 05:09:24 PST 2009

On 2009-03-04 06:04:33 -0500, Walter Bright <newshound1 at digitalmars.com> said:

> Daniel Keep wrote:
>> I need to know when that null gets stored, not when my code trips over
>> it and explodes later down the line.
> 
> Ok, I see the difference, but I've rarely had any trouble finding out 
> where the assignment happened. In fact, I can't remember ever having a 
> problem finding that.

While I can't contradict your personal experience,

>> Non-nullable types (or proxy struct or whatever) means the code won't
>> even compile if there's an untested path.  And if we do try to assign a
>> null, we get an exception at THAT moment, so we can trace back to find
>> out where it came from.
> 
> Yes, I understand that detecting bugs at compile time is better. But 
> there's a downside to this. Every reference type will have two subtypes 
> - a nullable and a non-nullable. We already have const, immutable and 
> shared. Throwing another attribute into the mix is not insignificant. 
> Each one exponentially increases the combinations of types, their 
> conversions from one to the other, overloading rules, etc.
> 
> Andrei suggests making a library type work for this rather than a 
> language attribute, but it's still an extra thing that will have to be 
> specified everywhere where used.

Well, if you care about the extra work for users of the language who
would have to specify whether each pointer can be null or not, I disagree
that it's extra work. When you design an API, you have to specify to
users of that API whether you accept null pointers or not. You should
specify it in the documentation ("this argument must not be null",
"this struct member must not be null", etc.), and add contracts in the
code to enforce that in debug builds. That's a lot more work than
adding an attribute to the pointer, where it'll be available both to
the compiler and the documentation.
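
To make the comparison concrete, here is a minimal C++ sketch of the two
styles. It is only an illustration: `Widget`, `NonNull` and both
functions are hypothetical names, not an existing library and not the
proposed D feature. The wrapper is the "proxy struct" flavour of the
idea, so a null is rejected at run time at the moment it is stored; a
language-level attribute could reject it at compile time instead.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

struct Widget { std::string name; };

// Style 1: nullability lives in a comment plus a debug-only contract.
// Documentation: "w must not be null."
std::size_t name_length_documented(const Widget* w) {
    assert(w != nullptr);  // compiled out when NDEBUG is defined
    return w->name.size();
}

// Style 2: nullability lives in the type.
// NonNull is a hypothetical wrapper that refuses a null when it is stored.
template <typename T>
class NonNull {
public:
    explicit NonNull(T* p) : ptr_(p) {
        if (p == nullptr) throw std::invalid_argument("null stored in NonNull");
    }
    T* operator->() const { return ptr_; }
    T& operator*() const { return *ptr_; }
private:
    T* ptr_;  // invariant: never null after construction
};

// The signature itself documents the requirement; no check in the body.
std::size_t name_length_typed(NonNull<const Widget> w) {
    return w->name.size();
}
```

In the second style the signature is the documentation, and a failure
points at the place where the null was stored rather than where it was
later dereferenced.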

Where I work, we do C++ programming. All the time we use things like
std::auto_ptr and boost::scoped_ptr to enforce proper ownership and
deletion of everything (boost::shared_ptr and intrusive_ptr also help
for shared pointers). We only rarely use raw pointers. The reason for
this? Because using the more verbose version ensures correctness and
expresses the intent of how that pointer is to be used.

It's true that auto_ptr and co. in C++ prevent a more dangerous
problem than null dereferences: they make sure the memory isn't
deallocated prematurely or never deallocated at all, preventing
corruption and leaks...

> There are a lot of optional attributes that can be applied to reference 
> types. At what point is the additional complexity not worth the gain?

Pretty good question. You are the judge of that, and apparently you
don't like adding complexity for the user. Well, I'm with you on that.

The thing is I think we're simplifying things by adding non-nullable 
types. With nullability annotations, you always know when you have to 
check for null or not (normally, you keep track of that in your mind 
anyway). And with static enforcement of nullability checks prior to a
dereference, you don't have to be extra careful before dereferencing:
the compiler will tell you if you've forgotten something, so you can 
free your mind of these details and concentrate on the task at hand.
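
As a rough illustration of a check enforced before a dereference, here
is a C++ sketch of a nullable reference whose pointer is never exposed
directly. The `Nullable` class, its `match` member and the `greet`
function are hypothetical names made up for this example; a language
feature would do the same job with far less ceremony.

```cpp
#include <cassert>
#include <string>

// Hypothetical nullable reference: the raw pointer stays private, so the
// only way to reach the object is to supply a handler for both cases.
template <typename T>
class Nullable {
public:
    Nullable() : ptr_(nullptr) {}
    explicit Nullable(T& ref) : ptr_(&ref) {}

    // Calls on_value with the object if present, on_null otherwise.
    template <typename OnValue, typename OnNull>
    auto match(OnValue on_value, OnNull on_null) const -> decltype(on_null()) {
        if (ptr_ != nullptr) return on_value(*ptr_);
        return on_null();
    }

private:
    T* ptr_;  // may be null; never handed out directly
};

// Forgetting the null case here is a compile error, not a crash later.
std::string greet(const Nullable<const std::string>& name) {
    return name.match(
        [](const std::string& n) { return "Hello, " + n; },
        []() { return std::string("Hello, stranger"); });
}
```

Leaving out the second handler fails to compile, which is the "compiler
will tell you" part of the argument.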

The cost: you must annotate all your nullable pointers. But considering 
the study Andrei has dug out, most pointers shouldn't be nullable. And 
as I pointed out above, annotating is something you should do anyway
in documentation and contracts. And I think that having non-nullable by 
default would make that cost negative: you gain static null dereference 
checks everywhere, and for each of these pointers you have less 
documentation and contracts to write, and instead only one third of 
these need to be annotated as nullable (hopefully with just one 
character to type).

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/