Why use a DFA instead of DIP1000?

Tue Sep 16 14:21:07 UTC 2025

On Tuesday, 16 September 2025 at 08:50:00 UTC, Dennis wrote:
> On Saturday, 13 September 2025 at 16:12:06 UTC, Richard (Rikki) 
> Andrew Cattermole wrote:
>> The fact that D is not architectured to be able to stop these 
>> false positives is an intended bug in our design. Not all PL's 
>> work like this the ML family certainly don't. They tie very 
>> advanced analysis engines including DFA into their type system.
>
> So the answer to my question is: this is not motivated by 
> reported issues from D users, but by other languages doing 
> this. If the only example for D you can give is an 
> "unrealistic" one, can you please give a realistic example from 
> another language then?

I am not trying to improve D based upon what other languages can 
do. They can only be an inspiration. For instance, I take note of 
how people view other languages' and compilers' solutions, both 
historically and in real time.

>> Just because people don't understand that this is possible, 
>> doesn't mean it hasn't been an issue. I see it as a case that 
>> people don't know what they don't know. So they don't complain.
>> This is a hole that the C family relies upon having due to 
>> historical reasons and so hasn't reached common knowledge.
>
> I understand DFA is an interesting research topic, and if your 
> claim was that it might lead to interesting discoveries or new 
> paradigms I fully agree. But when you say it's necessary to fix 
> DIP1000 specifically, I need some evidence to believe it. 
> There's been over 100 DIP1000-related bug reports with simple 
> non-DFA solutions, and I'm skeptical that people 'don't know' 
> to report DFA related issues. Several users reported issues 
> about Value Range Propagation (VRP) not working across 
> statements. Even when they don't explicitly know DFA, they 
> still experience a problem and can intuit a solution being 
> available.

In my last post, which was a giant wall of text, I explained why 
this isn't the case, noting that templates exhibit some of the 
same underlying issues that DIP1000 has. 
https://forum.dlang.org/post/10abme4$2ekj$1@digitalmars.com

Both Walter and I seem to have failed to convey these failures 
in dmd's architecture.

I will be ignoring the features missing from DIP1000's 
attribution capabilities. This is purely about the design of the 
engine's implementation.

Here is some code similar to what I've sent Walter over the past 
year, showing what I want to work by default for escape analysis 
in D. These examples are backed by years of debugging and helping 
other people. I expect this will give the best experience 
overall.

```d
extern(C) int printf(const char*, ...);

void myFunc1(scope const char* str) {
     printf("Text: %s\n", str); // ok
}
```

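This works by acknowledging that variable state isn't just "set" 
or "not-set" for any given property; it's "set", "not-set", or 
"unknown" at a minimum. Since ``printf`` is not attributed, 
whether ``str`` escapes through it is unknown, so no error is 
reported.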

My fast DFA engine does this; you can see it in code that looks 
like:

```d
void aPrototype(ref bool);

void myFunc2() {
     bool b;
     aPrototype(b);
     assert(b); // ok
}
```

The truthiness of ``b`` went from false to unknown, because the 
called function is not attributed.
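
To make the three-state idea concrete, here is a minimal sketch of 
how such a state could be modelled. It is not taken from my engine 
or from dmd; ``Tri``, ``join``, ``afterRefCall`` and 
``shouldReport`` are hypothetical names made up for illustration, 
and a real analysis tracks far more than one property.

```d
/// Hypothetical three-valued state for one tracked property
/// (for example "is true" or "escapes"). Illustration only.
enum Tri { notSet, set, unknown }

/// Merge the states arriving from two control-flow paths:
/// agreement is kept, disagreement degrades to unknown.
Tri join(Tri a, Tri b) pure nothrow @nogc @safe
{
    return a == b ? a : Tri.unknown;
}

/// State of a variable after it is passed by `ref` to a callee.
/// An unattributed prototype tells us nothing about what it does,
/// so the result is unknown rather than a guessed set/not-set.
Tri afterRefCall(Tri before, bool calleeMayWrite) pure nothrow @nogc @safe
{
    return calleeMayWrite ? Tri.unknown : before;
}

/// Report only when the analysis is certain; unknown stays silent.
bool shouldReport(Tri state, Tri bad) pure nothrow @nogc @safe
{
    return state == bad;
}

void main()
{
    // Mirrors myFunc2: `bool b;` starts out known-false ...
    auto b = Tri.notSet;
    // ... then aPrototype(b) may write through the ref.
    b = afterRefCall(b, true);
    assert(b == Tri.unknown);
    // assert(b) is only flagged when b is known to be false.
    assert(!shouldReport(b, Tri.notSet));
}
```

The design choice that matters here is that "unknown" never 
produces a report, which is what keeps false positives out of the 
default experience.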

Escape into a global variable:

```d
int* global;

void myFunc3(scope int* ptr) {
     global = ptr; // Error: ptr going into global escapes
}
```

Might not error (depends on how it plays out):

```d
int* global;

struct Foo {
     int* field;

     void method() {
         global = field;
     }
}

void myFunc4(scope int* ptr) {
     Foo foo;
     foo.field = ptr;
     foo.method;
}
```

The point of this set of tunings isn't that I want to have 
opinions. The point is to get a subset of escape analysis turned 
on by default that the community will find value in, without 
hitting any of the many issues that they have perceived with 
DIP1000.

To implement this, I did review DIP1000 and concluded that it 
would take a rewrite. It simply isn't designed to handle these 
kinds of properties. At that point you might as well have a DFA 
for the escape analysis to live in, and raise the ceiling of 
capabilities.

At this month's monthly meeting I asked if people would find 
value in the ``myFunc3`` error, and hands did go up. I also raised 
the double duty of ``scope`` with respect to stack allocation.

The fact is that people find value in a subset of errors from escape 
analysis, but it isn't the same subset that Atila is proposing 
with his "no inferred scope variable reports" approach. If I'm 
not convincing for any reason, I suggest doing a survey to see 
how it plays out. I expect you will get four groups:

1. No protection (@system only), tiny minority
2. False positive heavy, as much protection as possible by 
default, tiny minority
3. False positive heavy, as much protection as possible as 
opt-in, small minority
4. A subset of protection (@safe and the above examples), the 
majority

These are based upon my previous survey.

If you can do this without a full rewrite of DIP1000 I will be 
impressed. However, you will still need to solve borrowing so 
that I can get reference counting.