[Not really OT] Crowdstrike Analysis: It was a NULL pointer from the memory unsafe C++ language.

Sun Jul 28 18:21:41 UTC 2024

On 7/28/24 18:12, Don Allen wrote:
> ...
>>
>> Either `@safe` is a feature we want in the language, then it should be 
>> solid, or we'd be better off just removing it. What I am not on board 
>> with is making `@safe` a priority while at the same time abandoning 
>> memory safety in `@safe` code. This makes no sense.
> 
> I'm guessing that your issue with all of this talk about memory safety 
> is that you have no sense of a coherent plan.

It's more that I do have a sense of a coherent plan, but the way Walter 
abuses `@trusted` undercuts that plan as well as all existing 
documentation and articles.

> If I'm right, then I share 
> your concern. This is not intended as a criticism of Walter, because I 
> see defining and enforcing memory safety in D as a very hard problem, so 
> it doesn't surprise me that a clear plan isn't apparent.
> ...

Well, I think it's pretty clear what the story is:

- `@safe` means memory safe.
- `@system` means potentially not memory safe.
- `@trusted` means memory safe, but the compiler is not asked to check it.

`@safe` is unfinished and still leaves some holes open. DIP1000 
addresses most of these, but is also still a work in progress.

The main reason why `@safe` is hard is that quite a few people, most 
importantly Walter, want it to be expressive. DIP1000 is already harder 
than `@safe` needs to be. The simplest possible definition of `@safe` 
is: just use the GC. This was Robert's point at last year's DConf: "Why 
are you even taking references to stack memory in @safe?"

`@safe` D is not hard because it has to be, it is hard because of 
concerns about expressiveness.

An issue with `@safe`/`@system`/`@trusted` is that it is not 
fine-grained enough and does not give the programmer enough control over 
safety checks. This is one of the reasons why people are abusing `@trusted`.

> I think the difficulty is that D has its roots in C and C++, both 
> notably memory-unsafe. D adds the garbage collector to the mix of stack- 
> and manually-heap-allocated memory. Trying to provide compiler-enforced 
> guarantees in the face of these disparate options is not a simple 
> matter. Reading the "Function Safety" section of the language reference 
> made my head spin. I think its complexity flows directly from the 
> memory-management options D provides.
> ...

Well, it is possible to simply disallow some of those options in `@safe` 
code.

> Rust avoids this problem by providing a single memory-management 
> methodology that is imposed on the user. There are no choices. You do it 
> the Rust way or your code won't compile.
> ...

I mean, not really. You can manipulate raw pointers to stack-allocated 
memory in Rust too, it just will not be safe.
https://doc.rust-lang.org/std/ptr/index.html
https://doc.rust-lang.org/reference/unsafe-keyword.html
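
A minimal sketch of what that looks like (the function names are made 
up for illustration, they are not from any library). Creating the raw 
pointer is allowed in safe code; only the dereference needs `unsafe`, 
and at that point the compiler no longer checks that the pointee is 
still alive:

```rust
// Reads a stack-allocated value through a raw pointer.
fn read_through_raw_pointer() -> i32 {
    let x: i32 = 42;        // lives in this function's stack frame
    let p: *const i32 = &x; // creating a raw pointer is safe
    // The dereference is unchecked, so it must be wrapped in `unsafe`.
    // It is sound here only because `x` is still alive.
    unsafe { *p }
}

// Writes to a stack-allocated value through a raw pointer.
fn bump_through_raw_pointer() -> i32 {
    let mut x: i32 = 41;
    let p: *mut i32 = &mut x;
    // Again unchecked: the programmer vouches that `p` is valid.
    unsafe {
        *p += 1;
    }
    x
}

fn main() {
    println!("{}", read_through_raw_pointer());
    println!("{}", bump_through_raw_pointer());
}
```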

It's quite similar really:

- `@safe` corresponds to a Rust function that is not unsafe and has no 
unsafe blocks in it.
- `@trusted` corresponds to a Rust function that is not unsafe and has a 
single big unsafe block in it.
- `@system` corresponds to a Rust function that is unsafe and has a 
single big unsafe block in it.
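
On the Rust side, the three cases can be sketched roughly like this. 
The functions are hypothetical examples, chosen only to show where the 
`unsafe` keyword ends up; the comments name the D attribute each one is 
meant to mirror:

```rust
// Analogue of `@safe`: no `unsafe` anywhere, so the compiler checks
// the entire body.
fn first_checked(xs: &[i32]) -> Option<i32> {
    xs.first().copied()
}

// Analogue of `@trusted`: the signature is safe, but the body is one
// big `unsafe` block. The author, not the compiler, vouches that no
// caller can trigger a memory error (here: via the emptiness check).
fn first_or_zero(xs: &[i32]) -> i32 {
    unsafe {
        if xs.is_empty() {
            0
        } else {
            *xs.get_unchecked(0)
        }
    }
}

// Analogue of `@system`: the function itself is `unsafe`, so every
// caller must uphold the precondition that `xs` is non-empty.
unsafe fn first_unchecked(xs: &[i32]) -> i32 {
    unsafe { *xs.get_unchecked(0) }
}

fn main() {
    let xs = [7, 8, 9];
    println!("{:?}", first_checked(&xs));
    println!("{}", first_or_zero(&xs));
    // Calling an `unsafe fn` requires an `unsafe` block at the call site.
    println!("{}", unsafe { first_unchecked(&xs) });
}
```

If `first_or_zero` dropped its emptiness check, it would still compile 
and still present a safe signature while no longer being memory safe. 
That is the same kind of mislabeling as an incorrect `@trusted`.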

The only question is how the type system and language semantics are 
defined, which will influence what kind of code can be written in the 
safe subset. Rust dedicates a bit more type system real estate to static 
lifetime analysis; D's DIP1000 is more restrictive.

I.e., there is nothing magic about Rust that is not already in D in 
terms of how memory safety works at the very basic level. This includes 
unsoundness bugs in the type checker.

> I personally like that D offers the choices it does and in my own code, 
> I avoid malloc/free and use of pointers as much as possible. I 
> stack-allocate when feasible and GC-allocate when necessary. Talking to 
> C code complicates things, something that bit me when I was first 
> learning D (I passed a GC-allocated string to sqlite as a binding and 
> failed to read the warning in the toStringz documentation about the 
> string vaporizing in a subsequent garbage collection).
> 
> My own preference would be to first focus on improving the documentation 
> of what is already in the language. What types get stack-allocated? 
> GC-allocated? It's not always clear in the current documentation, e.g., 
> I believe static arrays are stack-allocated; where does it say that?

They are allocated wherever you put them, because they are value types:
https://dlang.org/spec/arrays.html#static-arrays

Documentation can always be improved, but the distinction between value 
types and reference types I think is made pretty clear by the 
documentation on structs and classes.

> There are many other examples of this information not present in the 
> documentation. The programmer needs to know this information to avoid 
> unsafe memory-management practices, because what is required of the 
> programmer depends upon the type of memory in question.
> ...

Local variables are usually stack-allocated, except when a closure 
captures them, in which case they are allocated on the GC heap. Explicit 
allocations go wherever the allocator puts them.

> I think one of the great features of D is the ability of D and C code to 
> talk directly without the need for an intermediate interface. The 
> current documentation is a good start, but could be improved with more 
> explanation and examples of how to do this safely.
> 
> I think improved documentation would help to make people happier with D.
> I don't think D needs to be compiler-guaranteed memory safe, which is 
> good because I don't think it's possible.

Compiler guarantees are not really the issue at hand at the moment. The 
issue is that Walter advocated for marking functions memory safe that 
are not memory safe. It's the ultimate middle finger to people who care 
about documentation, especially documentation that is embedded into the 
function signature.

> There may be opportunities for 
> the compiler to provide *some* help in this area that it presently 
> doesn't, but I doubt that D will ever be able to make the assertions 
> about memory-safety that Rust does.

Rust will never be able to make the assertions about memory safety that 
people seem to think Rust makes about memory safety.

Anyway, D already makes the assertion that `@safe` means memory safe, 
and it is in much better shape than Rust a priori in terms of memory 
safety because of the garbage collector.

It is quite annoying to me that people just go: "Memory safe? That must 
mean like Rust." Nope. Why does nobody ever bring up Java?

> Note that Zig provides only stack- 
> and manual heap-allocation. It is not a memory-safe language. But 
> there's a lot of interest in it, despite not being close to release and 
> a growing issue list.

I think they are doing some interesting things, but it is not for me.