Null-checked reference types

Mon Aug 12 10:02:33 UTC 2024

On Wednesday, 7 August 2024 at 11:30:02 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
>
> On 07/08/2024 11:22 PM, Quirin Schroll wrote:
>> On Wednesday, 7 August 2024 at 01:39:29 UTC, Richard (Rikki) 
>> Andrew Cattermole wrote:
>>> This allows you to do both loads and stores and do something 
>>> if it failed transitively.
>>>
>>> ```d
>>> if (var1.var2?.var3?.field = 3) {
>>>     // success
>>> } else {
>>>     // failure
>>> }
>>> ```
>> 
>> I somehow don’t like `if (… = …)` when it’s not a declaration. 
>> At first sight, I thought you intended `… == 3`.
>
> It's going to be valid regardless, due to AssignExpression.

Currently, an assignment is not accepted where a `bool` condition 
is expected. (``Error: assignment cannot be used as a condition, 
perhaps `==` was meant?``)
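
For reference, a minimal sketch of the forms today’s compiler does accept. The function names are made up for illustration, and the rejected line is left commented out:

```d
// Hypothetical helpers, only to show which forms compile today.
bool storeAndTest(ref int target, int value)
{
    // if (target = value) { }     // rejected: assignment cannot be used as a condition
    return (target = value) != 0;  // an explicit comparison is accepted
}

bool declareAndTest(int* source)
{
    // A declaration in the condition is accepted;
    // the branch is taken when p is non-null.
    if (int* p = source)
        return true;
    return false;
}
```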

>>> > No data flow analysis is proposed. Null checking is local 
>>> > and done by tracking ? and ! by the type system.
>>>
>>> DFA is only required if you want the type state to change as 
>>> the function is interpreted. So that's fine. That is a me 
>>> thing to figure out.
>> 
>> If I understand correctly, by “type state” you mean something 
>> like value range propagation. It basically *is* value range 
>> propagation, however the ranges in question are `null` and all 
>> non-null values. You don’t suggest that the `typeof` type of a 
>> variable or expression changes, correct? (I think that would be 
>> very weird.)
>
> No, I meant type state.
>
> https://en.wikipedia.org/wiki/Typestate_analysis
>
> unreachable < reachable < initialized < default-initialized < 
> non-null < user

I didn’t read the Wikipedia article in detail, but it contains no 
“null,” so I’m wondering how it’s related. A variable of 
non-nullable type must be initialized. In `@system` code, fine, 
it need not be; it could even be void-initialized. 
IIUC, typestate analysis could be used to make void 
initialization `@safe` by proving that a void initialized value 
has definitely been initialized whenever it’s read (i.e. no 
uninitialized read).
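
To make that concrete, here is a small sketch of the pattern such an analysis would have to prove. The function and its parameter names are hypothetical, and no analysis is implemented here; it only shows a void-initialized variable that is assigned on every path before it is read:

```d
// Hypothetical example: p is void-initialized,
// yet every path assigns it before the read.
int* pick(bool useFirst, int* first, int* second) @system
{
    int* p = void;   // no initial value is written
    if (useFirst)
        p = first;
    else
        p = second;
    return p;        // the read happens only after a definite assignment
}
```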

IIUC, what you’re suggesting is to allow variables of non-null 
type to be initialized with `null`, while requiring that they 
have been assigned a non-null value before they are read.

>>> However, you do not need to annotate function body variables 
>>> with this approach.
>>>
>>> Look at the initializer of a function variable declaration, 
>>> it'll tell you if it has the non-null type state.
>>>
>>> ```d
>>> int* ptr1;
>>> int* ptr2 = ptr1;
>>> ```
>> 
>> The only issue is, just because e.g. a pointer is initialized 
>> with something non-null (e.g. the address of a variable), that 
>> doesn’t mean some logic later won’t assign `null` to it.
>
> Right, that would have to be disallowed without DFA, since the 
> type state must not change throughout a function body.

Why wouldn’t it be able to?
It might make sense to the programmer to initialize a variable 
with a definite non-null value, but later, e.g. on some 
error-like case, reassign `null`.

If you use inference, it may (depending on implementation) infer 
a non-nullable type. The right course of action is to use an 
explicit wider type. This is similar to how `auto x = new 
Derived` gives you `x` typed as `Derived`, and that bars you from 
assigning it an object of `Base` or of some other class derived 
from `Base`. The right course of action is to declare `x` via 
`Base x = new Derived`.
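
Spelled out as a sketch (the class and function names are placeholders, and the rejected assignment is left commented out):

```d
class Base
{
    string name() { return "Base"; }
}

class Derived : Base
{
    override string name() { return "Derived"; }
}

string reassign()
{
    auto narrow = new Derived;  // inferred static type: Derived
    // narrow = new Base;       // rejected: Base does not implicitly convert to Derived

    Base wide = new Derived;    // explicit wider type
    wide = new Base;            // fine, the declared type allows it
    return wide.name();
}
```

A nullable/non-nullable pair would behave the same way: inference picks the narrow type, and whoever wants to assign `null` later writes the wider type explicitly.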

>>> However the problem which caused me some problems in the past 
>>> is on tracking variables outside of a function. You cannot do 
>>> it.
>>>
>>> Variables outside a function change type state during their 
>>> lifespan. They have the full life cycle, starting at 
>>> reachable, into non-null and then back to reachable. If you 
>>> tried to force it to be non-null, the language would force 
>>> you to have an .init value that is non-null. This is a known 
>>> issue with classes already. It WILL produce logic errors that 
>>> are undetectable.
>> 
>> I don’t care much about tracking. Probably, with `if (auto) 
>> ...`, you can just rename the variable, but typed non-nullable:
>> 
>> ```d
>> void f(int*? p)
>> {
>>      if (int* q = p) ... else return;
>>      // no error: q isn’t nullable, not by analysis, just by type
>>      int v = *q;
>> }
>> ```
>
> What matters here is that you do not need to add annotation to 
> the type itself. It only needs to exist within the function 
> signature. Anywhere else it's useless information.

I don’t understand. To me, `Object!` and `Object?` are related 
but different types. You can have arrays of them, etc., how else 
would the information of nullableness be retained?
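
As a rough library-level analogy (not the proposed syntax), the sketch below uses a hypothetical `NonNull` wrapper and an `assumeNonNull` helper, both invented here. The point is only that the element type of the array is what carries the non-null guarantee into `sumAll`:

```d
// Hypothetical wrapper; stands in for a non-nullable pointer type.
struct NonNull(T)
{
    private T* ptr;

    @disable this();                 // no default-constructed (null) state
    private this(T* p) { ptr = p; }

    ref T get() { return *ptr; }
}

// Hypothetical checked entry point: the only way to obtain a NonNull.
NonNull!T assumeNonNull(T)(T* p)
{
    assert(p !is null, "null passed where a non-null pointer is required");
    return NonNull!T(p);
}

// No null checks needed here: the element type records the guarantee.
int sumAll(NonNull!int[] items)
{
    int total = 0;
    foreach (item; items)
        total += item.get();
    return total;
}
```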

Maybe I need some info dump on type state analysis and what you 
mean exactly, because as I understand, TSA would only give you an 
implicit cast from `T?` to `T!` in some cases, similar to how 
uniqueness gives you an implicit cast from `T` to `immutable(T)` 
in some cases.
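
For comparison, this is the uniqueness case I have in mind, as a small sketch with made-up function names. The result of a `pure` function whose parameters carry no mutable indirections is known to be unique, so it converts to `immutable` without a cast:

```d
// Pure, and the only parameter is a value type,
// so the returned array cannot alias anything else.
int[] makeSquares(size_t n) pure
{
    auto result = new int[n];
    foreach (i, ref e; result)
        e = cast(int)(i * i);
    return result;
}

immutable(int)[] frozenSquares(size_t n)
{
    // Implicit conversion from int[] to immutable(int)[]:
    // allowed because the result is unique.
    immutable(int)[] frozen = makeSquares(n);
    return frozen;
}
```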