Fixing C's Biggest Mistake

Timon Gehr timon.gehr at gmx.ch
Fri Dec 30 04:01:09 UTC 2022


On 12/30/22 03:03, Walter Bright wrote:
> On 12/29/2022 12:45 PM, Adam D Ruppe wrote:
>> The alternative is the language could have prevented this state from 
>> being unanticipated at all, e.g. nullable vs not null types.
> 
> It can't really prevent it. What happens is people assign a value, any 
> value, just to get it to compile.

No, if they want a special state, they just declare that special state 
as part of the type. Then the type system makes sure they don't 
dereference it.

> I've seen it enough to not encourage that practice.
> ...

There is ample experience with that programming model and languages are 
generally moving in the direction of not allowing null dereferences. 
This is because it works. You can claim otherwise, but you are simply wrong.

> If there are no null pointers, what happens to designate a leaf node in 
> a tree?

E.g.:

struct Node { Node[] children; }

A leaf is simply a node whose children array is empty; no null pointer is 
needed to mark it.

> An equivalent "null" object is invented.

No, probably it would be a "leaf" object.

E.g.:

data BinaryTree = Inner BinaryTree BinaryTree | Leaf

Now neither of the two cases is special. You can pattern match on a 
BinaryTree to figure out whether it is an inner node or a leaf. The 
compiler checks that you cover all cases. This is not complicated.

size :: BinaryTree -> Int
size tree = case tree of
    Inner t1 t2 -> size t1 + size t2 + 1
    Leaf -> 1

No null was necessary.

 > size (Inner Leaf (Inner Leaf Leaf))
5

> Nothing is really gained.
> ...

Nonsense. Compile-time checking is really gained. This is just a 
question of type safety.

> Null pointers are an excellent debugging tool. When a seg fault happens, 
> it leads directly to the mistake with a backtrace. The "go directly to 
> jail, do not pass go, do not collect $200" nature of what happens is 
> good. *Hiding* those errors happens with non-null pointers.
> ...

Not at all. You simply get those errors at compile time. As you say, 
it's an excellent debugging tool.

> Initialization with garbage is terrible.

Of course.

> I've spent days trying to find 
> the source of those bugs.
> 
> Null pointer seg faults are as useful as array bounds overflow exceptions.
> ...

Even array bounds overflow exceptions would be better as compile-time 
errors. If you don't consider that practical, that's fine; I guess it 
will take a couple of decades before people accept that it is a good 
idea. But it is certainly practical today for null dereferences.
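
Just to illustrate that this is possible in principle (not a proposal for 
D), here is a minimal sketch using length-indexed vectors in Haskell; 
Nat, Vec and vhead are illustrative names:

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

-- The length of the vector is part of its type.
data Vec (n :: Nat) a where
    VNil  :: Vec 'Z a
    VCons :: a -> Vec n a -> Vec ('S n) a

-- vhead only accepts non-empty vectors, so there is no runtime bounds
-- check and no possible out-of-bounds crash.
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x

ok :: Int
ok = vhead (VCons 1 VNil)   -- compiles

-- bad = vhead VNil         -- rejected at compile time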

> NaNs are another excellent tool. They enable, for example, dealing with 
> a data set that may have unknown values in it from bad sensors. 
> Replacing that missing data with "0.0" is a very bad idea.

This is simply about writing code that does not lie.

Current way:

Object obj; // <- this is _not actually_ an Object

Much better:

Object? obj; // <- Object or null made explicit

if(obj){
     static assert(is(typeof(obj)==Object)); // ok, checked
     // can dereference obj here
}

obj.member(); // error, obj could be null

The same is true for floats. It would in principle make sense to have an 
additional floating-point type that does not allow NaN. This is simply a 
question of type system expressiveness: you can still do everything you 
were able to do before, but the type system can catch your mistakes 
early, because you are making your expectations explicit across function 
call boundaries.
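
For the sensor-data case, a minimal sketch with option types (Haskell 
again; readings and meanKnown are made-up names): a missing sample is 
explicit in the type instead of hiding behind NaN or 0.0, and "no valid 
data at all" is an explicit result as well:

import Data.Maybe (catMaybes)

-- Nothing marks a bad or missing sample explicitly.
readings :: [Maybe Double]
readings = [Just 2.0, Nothing, Just 4.0]

-- Average over the samples that are actually present.
meanKnown :: [Maybe Double] -> Maybe Double
meanKnown xs = case catMaybes xs of
    [] -> Nothing   -- no valid data at all
    ys -> Just (sum ys / fromIntegral (length ys))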

It just makes no sense to add an additional invalid state to every type 
and to defer the check to runtime, where it may or may not crash, when 
the compiler could instead have given a type error.

