[Not really OT] Crowdstrike Analysis: It was a NULL pointer from the memory unsafe C++ language.
Richard (Rikki) Andrew Cattermole
richard at cattermole.co.nz
Sat Jul 27 22:49:34 UTC 2024
On 28/07/2024 9:19 AM, Walter Bright wrote:
> On 7/27/2024 12:00 PM, Richard (Rikki) Andrew Cattermole wrote:
>> 1. You made the decision. You had to consider what would happen in
>> this situation. The question was asked: what happens should this be
>> null? If you want to assert, that's fine. What is not fine is making an
>> assumption and never stating it, never having it proven to be true.
>
> ?? Dereferencing a null pointer is always a bug, whether you decided to
> check for it or not.
The point is to make it not possible for you to dereference a null
pointer to begin with.
The compiler won't let you do it without dropping to inline assembly or
doing some unsafe casts.
If the compiler forces a check to occur, and the pointer still turns out
to be null afterwards, something very wrong is going on. Likely stack
corruption. At which point a segfault is absolutely the right tool for
the job!
```d
import std.stdio;

void func(int* ptr) {
    if (ptr !is null) {
        writeln(*ptr); // ok, pointer is known to be good
        writeln(*ptr); // IF ptr is null here this needs to segfault!!! STACK CORRUPTION???
    }
    writeln(*ptr); // Error: ptr could be null!!! This should not compile
}
```
>> 2. It throws an exception (in D), which can be caught safely. It is
>> only when exception chaining occurs that the state may not have been
>> cleaned up correctly.
>
> Exceptions in D are the same as the ones used for seg faults (except for
> on Win64, where I couldn't figure out how the system exceptions worked).
Unless I can do:
```d
try {
    ...
} catch (NullPointerError) {
    ...
}
```
It is not the same system from an "I have to write code to handle it"
standpoint.
>> An even better solution to using an assert, is to use an if statement
>> instead.
>>
>> ```d
>> if (int* ptr = var.field) {
>>     // success
>> } else {
>>     // fail, you can gracefully degrade/log here
>> }
>> ```
>
> The reason exceptions were invented was because such code for every
> pointer dereference made reasonable code look quite ugly.
>
> Of course, you can still write such code if you like.
It can be improved greatly by using a ``?.`` operator for long chains.
Yes, exceptions are how application VM languages do this, and that's a
good solution for them.
But we can't: we can't set up the signal handler to throw the exception,
and we won't use a read barrier.
So we have to force the check upon the user at CT (compile time).
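To make the chain case concrete, here is a minimal sketch of what such code looks like today; the `Config`/`Server` types and `printPort` are hypothetical, and the ``?.`` form in the comment is proposed syntax, not existing D:
```d
import std.stdio;

struct Server { int port; }
struct Config { Server* server; }

void printPort(Config* config) {
    // Today every link in the chain needs its own explicit null test...
    if (config !is null && config.server !is null) {
        writeln(config.server.port);
    }
    // ...where a hypothetical `config?.server?.port` would express the
    // same guarded chain in a single expression.
}
```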
>> Which means this:
>>
>> ```d
>> void func(int** ptr) {
>>     assert(*ptr !is null);
>>
>>     int i = **ptr; // ERROR: `**ptr` is in an unknown type state, it could be null
>> }
>> ```
>
> Which means the language will require you to manually insert assert()s
> everywhere.
Have another look at the example. The assert is useless and that is the
whole point of that example.
You should be using if statements more often than not, AND you should
only need to do this for variables that are loaded from an external source.
```d
import std.stdio;

int* global;

void func() {
    if (int* p = global) {
        // ok, a check has occurred; p is known non-null here
    }
    writeln(*global); // Error: no check has occurred
    assert(global !is null); // Useless, it could change before the next statement (not temporally safe).
    writeln(*global); // Error: no check has occurred
}
```
Asserts are not as useful as one may think. They can only check
variables in a function body. You have to perform a load into a variable
before you can do the test, and an if statement is far better at that.
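As a minimal sketch of that difference (the function names here are mine, purely for illustration, and `global` mirrors the example above):
```d
import std.stdio;

int* global;

void withAssert() {
    int* p = global;    // load into a local first
    assert(p !is null); // only checks this one snapshot, at this moment
    writeln(*p);
}

void withIf() {
    if (int* p = global) { // load and test in a single statement
        writeln(*p);       // p is known to be non-null in this scope
    }
}
```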
>> For this reason, bounds checks do not need CT analysis.
>
> Whether it is needed or not, the compiler can't do it.
For what we'd want, agreed.
However, there is DFA (data flow analysis) to do this verification, so I
thought it was worth mentioning.
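As a rough illustration of the kind of verification such a DFA could do (the function here is mine, not an existing checker):
```d
void lookup(int[] arr, size_t i) {
    if (i < arr.length) {
        int x = arr[i]; // a DFA can prove this access is in bounds
    }
    int y = arr[i]; // no dominating check; a DFA would flag this one
}
```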
>> They do not have to bring down an application if caught.
>
> Array overflows are fatal programming bugs.
>
> A huge discussion about this raged in the n.g. many years ago. There was
> a camp that maintained that assert failures should be recoverable to the
> point that the program could continue.
>
> The other camp maintained that when a program enters a state
> unanticipated by the programmer, then the program is not recoverable,
> because one no longer has any idea what will happen. The only path
> forward is to stop the program, gracefully or not.
>
> Obviously, I'm in the latter camp. Obviously, one can write programs in
> the first camp (D lets you do whatever you want) but I cannot endorse it.
There is a third camp. This camp is the one that I and a few others are
in, and it is the one used by application VM languages.
It is also the only camp that is practical for long-lived applications.
You need to distinguish between "cannot clean up" and "cannot continue
the request".
If you cannot clean up, then we align: shut down the process. There is
no way to know what state the process is in, or what it could become. It
could infect other processes and, with that, the user.
A good example of this is the stack corruption above: you load a pointer
into a variable, check that it is not null, and the check passes. But
then when you go to dereference it, you find that it is null.
Absolutely segfault out. You have stack corruption.
Same situation with unmapped memory. That should not be possible, so it
is crash time!
On the other hand, a bounds check failing just means your business logic
is probably faulty. You cannot fulfill the request. BUT the data itself
could still be cleaned up; it hasn't been corrupted.
This is where exception chaining comes in: if you attempt cleanup and it
turns out to throw another exception that isn't caught, that means
cleanup failed. The program is once again in an unknown state and should
crash.
```d
struct Foo {
    ~this() {
        throw new Exception("cleanup failed");
    }
}

void callee() {
    try {
        called();
    } catch (Exception e) {
        // e.next is set: the destructor threw during unwinding,
        // i.e. a chained exception, oh noes cleanup failed!!!!
    }
}

void called() {
    Foo foo; // destructor throws while unwinding, chaining onto the first exception
    throw new Exception("request failed");
}
```
It is important to note that requests have fixed, known entry points
(i.e. they resume execution of a coroutine). This isn't some random
place in the code base. If you threw in the event loop itself you'd
still crash out.
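A minimal sketch of that shape, where `Request` and `eventLoop` are hypothetical names rather than an existing API:
```d
import std.stdio;

alias Request = void delegate();

void eventLoop(Request[] requests) {
    foreach (request; requests) {
        try {
            // The fixed, known entry point of a request.
            request();
        } catch (Exception e) {
            // "Cannot continue the request": cleanup succeeded,
            // so log it and keep serving.
            writeln("request failed: ", e.msg);
        }
        // Anything thrown out here, in the loop itself, still crashes the process.
    }
}
```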
>> On the other hand, null dereference requires signal handling to catch.
>> If you don't own thread and process you cannot use that to throw an
>> exception. For this reason it's not comparable and requires CT
>> analysis to prevent bringing down the process.
>
> Whoever catches the signal will then stop the buggy, broken program from
> doing more damage. It's kinda the point of a computer with memory
> protection.
Agreed, that is the behavior you want.
HOWEVER, if you need that behavior, it had better be because the
compiler could not have helped you in any way to not trigger it.
This should not compile:
```d
func(null); // should be rejected at the call site: null passed to a ?nonnull parameter

void func(?nonnull int* ptr) {
    int val = *ptr;
}
```
This sort of thing is the norm in application VM languages now. It isn't
a hypothetical improvement.
https://kotlinlang.org/docs/generics.html#definitely-non-nullable-types
https://kotlinlang.org/docs/null-safety.html
https://developer.apple.com/documentation/swift/designating-nullability-in-objective-c-apis#Annotate-Nullability-of-Individual-Declarations
https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/attributes/nullable-analysis#postconditions-maybenull-and-notnull
>> On the other hand, if you were to propose read barriers that throw
>> exceptions for null dereference, I would agree to the statement that
>> they are comparable.
>
> That'll turn D into a very poorly performing language. Besides, throwing
> an exception is just what seg faults are. It's the same mechanism.
Agreed, I would never propose it; hence the hypothetical "you".
I used the idea as a comparison to bounds checks.
>> Owner: Why did you use D! If you just used <insert mainstream
>> application VM language> we wouldn't have lost millions of dollars!!!
>
> D is often unfairly accused of flaws.
As an issue, this one has had CT analysis in application VM languages
for around 15 years, so the accusation is kinda fair. Especially with
D's Java heritage.