[RFC] Throwing an exception with null pointers

Richard (Rikki) Andrew Cattermole richard at cattermole.co.nz
Wed Apr 16 19:56:05 UTC 2025


On 17/04/2025 6:18 AM, Walter Bright wrote:
> I confess I don't understand the fear behind a null pointer.
> 
> A null pointer is a NaN (Not a Number) value for a pointer. It's similar 
> (but not exactly the same behavior) as 0xFF is a NaN value for a 
> character and NaN is a NaN value for a floating point value.

Agreed.

But unlike floating point, pointer issues kill the process.

They invalidate the task at hand.
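
The asymmetry in miniature (a minimal, self-contained example):

```d
import std.stdio;

void main()
{
    double d;          // default-initialized to NaN
    writeln(d + 1.0);  // prints "nan": the NaN propagates quietly on

    int* p;            // default-initialized to null
    // writeln(*p);    // uncommenting this kills the whole process
}
```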

> It means the pointer is not pointing to a valid object. Therefore, it 
> should not be dereferenced.

If you write purely @safe code, that isn't possible.

Just as .NET guarantees.
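
A minimal example of what is left over (the null itself is the only 
hazard remaining):

```d
// @safe rules out wild and dangling pointers; null is the one invalid
// value that remains, and dereferencing it is caught by the hardware.
@safe void use(int* p)
{
    int x = *p;  // fine if p is valid; kills the process if p is null
}

@safe void main()
{
    int* p;   // null by default
    use(p);   // compiles without complaint, faults at runtime
}
```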

> To dereference a null pointer is:
> 
> A BUG IN THE PROGRAM

Agreed; the task has no way to continue and must stop.

A task is not the same thing as a process.

> When a bug in the program is detected, the only correct course of action 
> is:
> 
> GO DIRECTLY TO JAIL, DO NOT PASS GO, DO NOT COLLECT $200
> 
> It's the same thing as `assert(condition)`. When the condition evaluates 
> to `false`, there's a bug in the program.

You are not going to like what the unittest runner is doing then.

https://github.com/dlang/dmd/blob/d6602a6b0f658e8ec24005dc7f4bf51f037c2b18/druntime/src/core/runtime.d#L561
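
Roughly, what it does per test amounts to this (a paraphrase, not the 
actual druntime code):

```d
// The failed assert is caught, the failure is recorded, and execution
// moves on to the next test rather than terminating the process.
bool runOneTest(void function() test)
{
    try
    {
        test();
        return true;
    }
    catch (Throwable t)  // AssertError derives from Throwable
    {
        return false;    // report it, but keep running other tests
    }
}
```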

> A bug in the program means the program has entered an unanticipated 
> state. The notion that one can recover from this and continue running 
> the program is only for toy programs. There is NO WAY to determine if 
> continuing to run the program is safe or not.

On the contrary, determining that is certainly possible in a lot of 
cases.

We are in total agreement that the default should always be to kill the 
process.

The problem lies in a very specific scenario where @safe is being used 
heavily, where logic errors are extremely common but memory errors are not.

I want us to be 100% certain that a read barrier cannot function as a 
backup plan to DFA language features. If it can, it will give a better 
user experience than DFA alone. We've seen what happens when you try 
to solve these kinds of problems exclusively with DFA: it shows up as 
DIP1000 not being able to be turned on by default.

If the end result is that we have to recommend the slow DFA 
exclusively for production code, then so be it. I want us to be 
certain that we have no other options.
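
For concreteness, the barrier in question is small (checkedDeref is an 
illustrative name; the point is that the compiler would insert it, not 
the user):

```d
// What a compiler-inserted read barrier could lower a dereference
// into: the task gets a catchable error instead of a process-killing
// hardware fault.
ref T checkedDeref(T)(T* p, string file = __FILE__,
                      size_t line = __LINE__)
{
    if (p is null)
        throw new Error("null pointer dereference", file, line);
    return *p;
}
```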

> I did a lot of programming on MS-DOS. There is no memory protection 
> there. Writing through a null pointer would scramble the operating 
> system tables, which meant the operating system would do something 
> terrible. There were many times when it literally scrambled my hard 
> disk. (I made lots of backups.)

As you know I'm into retro computers, so yeah I'm familiar with not 
having memory protection and the consequences thereof.

> If you haven't had this pleasure, it may be hard to realize what a 
> godsend protected memory is. A null pointer no longer requires 
> reinstalling the operating system. Your program simply quits with a 
> stack trace.
> 
> With the advent of protected mode, I immediately ceased all program 
> development in real mode DOS. Instead, I'd fully debug it in protected 
> mode, and then as the very last step I'd test it in real mode.

I've read your story on this in the past and believed you the first time.

> Protected mode is the greatest invention ever for computer programs. 
> When the hardware detects a null pointer dereference, it produces a seg 
> fault, the program stops running and you get a stack trace which gives 
> you the best chance ever of finding the cause of the seg fault.

You don't always get a stack trace.

Nor does a bare segfault let you fully report what went wrong to a 
reporting daemon for diagnostics.

What Windows does instead of a signal is throw an exception that then 
gets caught right at the top. That triggers the reporting daemon. It 
allows catching, filtering, and adding more information to the report. 
Naturally we can't support it, because of exceptions...

At the OS level things have progressed from simply segfaulting out, even 
in the native world.

https://learn.microsoft.com/en-us/windows/win32/api/werapi/nf-werapi-werregisterruntimeexceptionmodule
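
A sketch of that flow, using the simpler classic hook rather than the 
WER module registration at the link (Windows only; details may vary by 
druntime version):

```d
version (Windows)
{
    import core.sys.windows.windows;

    // The access violation arrives as a structured exception; a filter
    // can inspect it and add diagnostics before WER takes over.
    extern (Windows) LONG filter(EXCEPTION_POINTERS* info)
    {
        if (info.ExceptionRecord.ExceptionCode
            == EXCEPTION_ACCESS_VIOLATION)
        {
            // attach task state / extra context for the report here
        }
        return EXCEPTION_CONTINUE_SEARCH;  // hand off to the daemon
    }

    void install()
    {
        SetUnhandledExceptionFilter(&filter);
    }
}
```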

> A lovely characteristic of seg faults is they come FOR FREE! There is 
> zero cost to them. They don't slow your program down at all. They do not 
> add bloat. It's all under the hood.
> 
> The idea that a null pointer is a billion dollar mistake is just 
> ludicrous to me. The real mistake is having unchecked arrays, which 
> don't get hardware protection, and are the #1 source of malware 
> injection problems.

I don't agree that it was a mistake (token values are just as bad), 
and in any case that is his name for it, not mine.

I view it the same way as I view coroutine coloring.

It's a feature to keep operating environments sane. But in doing so it 
causes pain and forces you to deal with the problem rather than 
letting it go unnoticed.

Have a read of the show notes: 
https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare/

"27:40 This led me to suggest that the null value is a member of every 
type, and a null check is required on every use of that reference 
variable, and it may be perhaps a billion dollar mistake."

None of this is new! :)

> Being unhappy about a null pointer seg fault is like complaining that 
> the seatbelt left a bruise on your body as it saved you from your body 
> being broken (this has happened to me, I always always wear that 
> seatbelt!).

Never happened to me, and I still wear it.

That doesn't mean I want to be in a car driven with hard stops that 
are within the driver's control to avoid.

> Of course, it is better to detect a seg fault at compile time. Data Flow 
> Analysis can help:
> 
> ```d
> int x = 1;
> void main()
> {
>      int* p;
>      if (x) *p = 3;
> }
> ```
> Compiling with `-O`, which enables Data Flow Analysis:
> ```
> dmd -O test.d
> Error: null dereference in function _Dmain
> ```

Right, local information only.

It turns out even the C++ folks are messing around with frontend DFA 
for this :/ with interprocedural information in the AST.
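
The limitation in one example: the same bug as above, split across a 
function boundary, and `dmd -O` can no longer see it:

```d
int* make() { return null; }  // the null now escapes local analysis

void main()
{
    int* p = make();
    *p = 3;  // no compile-time error; faults at runtime instead
}
```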

> Unfortunately, DFA has its limitations that nobody has managed to solve 
> (the halting problem), hence the need for runtime checks, which the 
> hardware does nicely for you.
> 
> Fortunately, D is powerful enough so you can make a non-nullable type.

I've considered the possibility of explicit boxing.

With and without compiler forcing it (by disallowing raw pointers and 
slices).

Everything we can do with boxing using library types can be done 
better by the compiler, including making sure that it actually 
happens.
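
For reference, the kind of library boxing I mean (NotNull is an 
illustrative name, not a proposal):

```d
// The null check happens once, at construction; every later
// dereference is check-free. A compiler doing the same could also
// guarantee that nobody bypasses the box.
struct NotNull(T)
{
    private T* ptr;

    @disable this();  // no default construction, so ptr is never null

    this(T* p)
    {
        assert(p !is null, "NotNull constructed from null");
        ptr = p;
    }

    ref T get() { return *ptr; }
    alias get this;  // usable like the pointer it wraps
}
```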

I've seen in my own code what happens if we force boxing rather than 
doing something in the language. The number of errors I have had with 
my @mustuse error type is staggering. We have got to get a -betterC 
compatible solution to exceptions that isn't heap allocated or reliant 
on unwinding tables etc.
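
The pattern in question, sketched (Result and its error codes are 
illustrative):

```d
import core.attribute : mustuse;

// No heap allocation and no unwinding tables, so it works under
// -betterC, but every call site must check and propagate it by hand.
@mustuse struct Result(T)
{
    T value;
    int error;  // 0 means success

    bool opCast(T2 : bool)() const { return error == 0; }
}

Result!int parsePort(const(char)[] s)
{
    if (s.length == 0)
        return Result!int(0, 1);  // error, without throwing
    // ... actual parsing elided ...
    return Result!int(8080, 0);
}
```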

It would be absolutely poor engineering to try to convince anyone to 
box raw pointers, let alone make it the recommended or required 
solution as part of PhobosV3. There has to be a better way.

> In summary, the notion that one can recover from an unanticipated null 
> pointer dereference and continue running the program is a seriously bad 
> idea. There are far better ways to make failsafe systems. Complaining 
> about a seg fault is like complaining that a seatbelt left a bruise 
> while saving you from being maimed.

Program != task.

No one wants the task to continue after a null dereference occurs; we 
are not in disagreement. It must attempt to clean up and die (if the 
segfault handler fires, then straight to death the process goes).
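
Sketched, the boundary being argued for (runTask is hypothetical, not 
an existing scheduler API):

```d
// The failing task cleans up and dies; the process and its other
// tasks are untouched. If the fault arrives as a hardware segfault
// instead, the whole process goes straight to death, as above.
void runTask(void delegate() task) nothrow
{
    try
        task();
    catch (Throwable t)
    {
        // report to the diagnostics daemon, release task-owned
        // resources, then let this one task die
    }
}
```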

We are not as far off as it might appear.



