Type state analysis

Fri Mar 29 18:04:23 UTC 2024

On 30/03/2024 6:10 AM, Walter Bright wrote:
> The @live functions do type state analysis - data flow analysis is used 
> to determine if a variable is 'live' or not.
> 
> It is indeed costly to do dfa in the front end, that's one reason why 
> it's restricted to @live functions.
> 
> The dfa could be extended for null checking, but in practice, null 
> checking is not that effective:
> 
> ```
> class C { void xx(); }
> 
> struct S { C c; }
> 
> C mars(C c) { return null; }
> 
> void phobos(ref S s)
> {
>      C c;
>      c.xx();       // detected
>      mars(c).xx(); // needs whole program DFA to detect
>      s.c.xx();     // cannot be detected
> }
> ```
> 
> This is why @live functions won't work without scope, ref, and return 
> annotations on the functions it interfaces with. @live functions, like 
> Rust, also severely limit the kinds of data structures possible.
> 
> Other type state analysis currently done in D is:
> 
> 1. Value Range Propagation
> 
> 2. whether a field is initialized or not in a constructor is tracked

Yes, as far as I'm aware, typical applications of DFA need to run on the 
whole program to work. ML family compilers do this almost exclusively; 
they cannot do multi-step builds. D, however, is the opposite.

Given that what I'm suggesting is not the normal use case for DFA 
(quite the opposite), and that we already have an approach thanks to 
DIP1000, I am proposing to go function by function.

You must annotate the input and output type states of a function's 
parameters and return value if the defaults guarantee less than you 
want.

The analysis acts purely as verification against these annotations.
Inference only happens within the body of a function; it does not affect 
the signature, including for templates (type states are not part of the 
type system).
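
As a rough sketch of what "verification against the signature" could look 
like: the `TypeState` enum, the `Signature` struct and the `accepts` 
function below are names invented for this illustration, not compiler 
internals, and only the two type states needed for the example are 
modelled.

```d
/// Hypothetical type states, ordered from weaker to stronger guarantee.
enum TypeState { initialized, nonnull }

/// Hypothetical summary of what a signature promises for its return value.
struct Signature {
    string name;
    TypeState returnState;
}

/// A use is accepted when the state a variable is known to be in
/// is at least the state the operation requires.
bool accepts(TypeState actual, TypeState required) @safe pure nothrow @nogc {
    return actual >= required;
}

void main() @safe {
    // makeNull only promises that its result is initialized.
    auto makeNull = Signature("makeNull", TypeState.initialized);

    // Writing through the pointer requires nonnull, so this is rejected.
    assert(!accepts(makeNull.returnState, TypeState.nonnull));

    // Copying the pointer only requires initialized, so this is accepted.
    assert(accepts(makeNull.returnState, TypeState.initialized));
}
```

The caller is checked using nothing but the callee's declared return 
state; the callee's body is never consulted.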

Because of these decisions it can be parallelized without concern 
(unless you need to run semantic on newly inserted statements for cleanup).
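
To illustrate why this parallelizes, here is a minimal sketch using 
`std.parallelism`. The `BodyCheck` struct and `verifyAll` function are 
hypothetical stand-ins for per-function analysis results, not anything 
that exists in the compiler.

```d
import std.parallelism : parallel;

/// Hypothetical type states, ordered from weaker to stronger guarantee.
enum TypeState { initialized, nonnull }

/// Hypothetical result slot for checking one function body.
struct BodyCheck {
    string functionName;
    TypeState known;     // state established inside the body
    TypeState required;  // state the body's operation needs
    bool ok;
}

/// Checks every body independently. Each iteration reads and writes
/// only its own element, so iterations may run on any worker thread.
void verifyAll(BodyCheck[] checks) {
    foreach (ref c; parallel(checks)) {
        c.ok = c.known >= c.required;
    }
}

void main() {
    auto checks = [
        BodyCheck("useNullUnchecked", TypeState.initialized, TypeState.nonnull),
        BodyCheck("useNullChecked", TypeState.nonnull, TypeState.nonnull),
    ];

    verifyAll(checks);

    assert(!checks[0].ok); // write without a null check is rejected
    assert(checks[1].ok);  // write after a null check is accepted
}
```

No check depends on the result of another, which is the property the 
function-by-function design is meant to preserve.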

See my recently added example that Rust cannot check since it hasn't got 
type state analysis in production.

https://gist.github.com/rikkimax/eed86a7061445a93f214e41fb6445e40

```d
T* makeNull(T)() @safe {
     return null;
}

void useNull() @safe {
     int* var = makeNull!int();
     // var is in type state initialized as per makeNull return state

     *var = 42;
     // segfault due to var being null
}
```

What we want to happen instead:

```d
T* makeNull(T)(/* return'initialized */) @safe {
     return null;
     // type state default is more than the type state initialized
     // so it is accepted
}

void useNull() @safe {
     int* var = makeNull!int();
     // var is in type state initialized as per makeNull return state

     // perform a write via the var variable
     // this will error because initialized is less than the nonnull type state
     // Error: Variable var is in type state initialized which could be null, cannot write to it
     *var = 42;
}
```

To fix, simply check for null!

```d
void useNull() @safe {
     int* var = makeNull!int();
     // var is in type state initialized as per makeNull return state

     if (var !is null) {
         // in scope, assume var is in type state nonnull
         *var = 42;
     }
}
```
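
The guarded form compiles and runs with a current D compiler, so it can be 
tried today. Below is a self-contained sketch; `makeValid` and `tryWrite` 
are hypothetical helpers added here only to exercise both branches of the 
null check.

```d
T* makeNull(T)() @safe {
    return null;
}

/// Hypothetical counterpart that returns a valid pointer, for contrast.
T* makeValid(T)() @safe {
    return new T;
}

/// Writes through the pointer only when the null check succeeds.
/// Returns whether the write happened.
bool tryWrite(int* var) @safe {
    if (var !is null) {
        // in this scope var would be in type state nonnull
        *var = 42;
        return true;
    }
    return false;
}

void main() @safe {
    // Null pointer: the guard skips the write instead of segfaulting.
    assert(!tryWrite(makeNull!int()));

    // Valid pointer: the guard lets the write through.
    int* p = makeValid!int();
    assert(tryWrite(p));
    assert(*p == 42);
}
```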