Memory safe in D

Petar Petar
Wed Mar 13 16:56:45 UTC 2024


On Wednesday, 13 March 2024 at 06:05:35 UTC, Walter Bright wrote:
> [..]
> Consider the following:
> ```
> class A { void bar(); }
>
> void foo(int i) {
>     A a;
>     if (i) a = new A();
>     ...
>     if (i) a.bar();
> }
> ```
> What happens if we apply data flow analysis to determine the 
> state of `a` when it calls `bar()`? It will determine that `a` 
> has the possible values (`null`, `new A()`). Hence, it will give 
> an error that `a` is possibly null at that point.

Here's how this situation is handled in TypeScript:

```ts
class A { bar() {} }

function foo(i: number) {
     let a: A;
     if (i) a = new A();

     if (i) a.bar(); // Error: Variable 'a' is used before being assigned.
}

function foo2(i: number) {
     let a: A | null = null;
     if (i) a = new A();

     if (i) a.bar(); // Error: 'a' is possibly 'null'
}


function bar(i: number) {
     let a: A;
     if (i) {
       a = new A();
       a.bar(); // No errors.
     }
}

function bar2(i: number) {
     let a: A | null = null;
     if (i) {
       a = new A(); // The declared type of `a` is `A | null`
       a.bar();     // After the assignment, `a` is narrowed to `A`
     }
}
```

> Yet the code is correct, not buggy.
> 
> Yes, the compiler could figure out that `i` is the same, but 
> the conditions can be more complex such that the compiler 
> cannot figure it out (the halting problem).
>
> So that doesn't work.

I agree. However, in my experience (I've been using TypeScript 
professionally since ~2019), it's not a problem for the developer 
to rewrite the code in a way that the compiler can understand; in 
this case, by rewriting `foo` as `bar`. While your example was 
intentionally simple, in practice, restructuring the code so that 
the compiler can understand it often makes it clearer for the 
humans behind the screen as well.
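
To illustrate the kind of restructuring I mean when the condition 
is less trivial, here is a minimal sketch. The names `makeA` and 
`foo3` are hypothetical, made up for this example. The idea is to 
test the variable itself instead of re-testing the original 
condition, so the compiler does not need to prove that two 
conditions are equivalent:

```ts
class A { bar(): string { return "bar called"; } }

// Hypothetical helper: build the object only when the flag is set.
function makeA(i: number): A | null {
    return i ? new A() : null;
}

function foo3(i: number): string {
    const a = makeA(i);

    // Checking `a` directly (rather than `i` again) lets the
    // compiler narrow `A | null` to `A` inside the branch.
    if (a !== null) {
        return a.bar();
    }
    return "skipped";
}
```

Here the null state is carried by `a` alone, so there is no second 
condition that has to stay in sync with the first one.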

> We could lower `a.bar()` to `NullCheck(a).bar()` which throws 
> an exception if `a` is null. But what have we gained there? 
> Nothing. The program still aborts with an exception, just like 
> if the hardware checked. Except we've got this manual check 
> that costs extra code and CPU time.

I agree that simply letting the OS handle the segfault is 
sufficient for 98% of the use cases. For the other 2% (say, 
writing kernel-mode code or targeting microcontrollers without an 
MMU), having a compiler flag to enable rewriting `a.bar()` to 
`assert(a), a.bar()` would be nice.
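
For comparison, this is roughly what such a check looks like when 
written by hand at the library level in TypeScript. It is only an 
analogy to the `NullCheck(a)` lowering quoted above, not a proposal 
for how D should implement it, and `nullCheck` is a hypothetical 
helper name:

```ts
class A { bar(): string { return "ok"; } }

// Hypothetical helper: returns the value unchanged, or throws if
// it is null or undefined. The return type drops the null case.
function nullCheck<T>(value: T | null | undefined): T {
    if (value === null || value === undefined) {
        throw new Error("null dereference");
    }
    return value;
}

function callBar(a: A | null): string {
    // Fails with an exception instead of relying on the hardware.
    return nullCheck(a).bar();
}
```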

> BTW, doing data flow analysis is very expensive in terms of 
> compiler run time. The optimizer does it, but running the 
> optimizer is optional for that reason.

C# has used control-flow analysis for definite assignment since 
its early days (I'm not sure if it was part of the first release, 
or if it was added later). In my experience, C# has always been 
one of the faster languages in terms of compile time.

I'd be very interested to hear what you have to say about their 
[language specification][1] on definite assignment.

That said, TypeScript takes this (colloquially known as [flow 
typing][2]) much further: 
https://www.typescriptlang.org/docs/handbook/2/narrowing.html.
It plays extremely well with their [union types][3].
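
As a small sketch of how narrowing and union types interact (the 
function name `describe` is made up for this example), each check 
removes one member of the union, and the compiler tracks what is 
left in every branch:

```ts
// Hypothetical example: `x` starts as `number | string | null`.
function describe(x: number | string | null): string {
    if (x === null) {
        return "nothing";       // x: null
    }
    if (typeof x === "number") {
        return x.toFixed(1);    // x: number
    }
    return x.toUpperCase();     // x: string (the only case left)
}
```

Calling `x.toUpperCase()` before the two checks would be rejected, 
since at that point `x` could still be a number or `null`.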

P.S. please disregard my previous message. I clicked "Send" by 
mistake.

[1]: 
https://github.com/dotnet/csharpstandard/blob/draft-v9/standard/variables.md#94-definite-assignment
[2]: https://en.wikipedia.org/wiki/Flow-sensitive_typing
[3]: 
https://www.typescriptlang.org/docs/handbook/2/everyday-types.html#union-types

