Why use a DFA instead of DIP1000?
Richard Andrew Cattermole (Rikki)
richard at cattermole.co.nz
Tue Sep 16 14:21:07 UTC 2025
On Tuesday, 16 September 2025 at 08:50:00 UTC, Dennis wrote:
> On Saturday, 13 September 2025 at 16:12:06 UTC, Richard (Rikki)
> Andrew Cattermole wrote:
>> The fact that D is not architectured to be able to stop these
>> false positives is an intended bug in our design. Not all PL's
>> work like this the ML family certainly don't. They tie very
>> advanced analysis engines including DFA into their type system.
>
> So the answer to my question is: this it not motivated by
> reported issues from D users, but by other languages doing
> this. If the only example for D you can give is an
> "unrealistic" one, can you please give a realistic example from
> another language then?
I am not trying to improve D based upon what other languages can
do. They can only be an inspiration. For instance, I take note of
how people view other languages' and compilers' solutions, both
historically and in real time.
>> Just because people don't understand that this is possible,
>> doesn't mean it hasn't been an issue. I see it as a case that
>> people don't know what they don't know. So they don't complain.
>> This is a hole that the C family relies upon having due to
>> historical reasons and so hasn't reached common knowledge.
>
> I understand DFA is an interesting research topic, and if your
> claim was that it might lead to interesting discoveries or new
> paradigms I fully agree. But when you say it's necessary to fix
> DIP1000 specifically, I need some evidence to believe it.
> There's been over 100 DIP1000-related bug reports with simple
> non-DFA solutions, and I'm skeptical that people 'don't know'
> to report DFA related issues. Several users reported issues
> about Value Range Propagation (VRP) not working across
> statements. Even when they don't explicitly know DFA, they
> still experience a problem and can intuit a solution being
> available.
In my last post, which was a giant wall of text, I explained how
this isn't the case, noting that templates exhibit some of the
same underlying issues that DIP1000 has.
https://forum.dlang.org/post/10abme4$2ekj$1@digitalmars.com
Both Walter and I seem to have failed to convey these failures
in dmd's architecture.
I will be ignoring the lacking features in DIP1000's attribution
capabilities. This is purely engine implementation design stuff.
Here is some code similar to what I've sent Walter over the past
year, showing what I want to work by default for escape analysis
in D. These examples are backed up by all the debugging and
helping of other people I've done over the years. I expect this
will give the best experience overall.
```d
extern(C) int printf(const char*, ...);

void myFunc1(scope const char* str) {
    printf("Text: %s\n", str); // ok
}
```
This works by acknowledging that variable state isn't just "set"
or "not-set" for any given property; it's "set", "not-set", or
"unknown" at a minimum.
My fast DFA engine does this, you can see it in code that looks
like:
```d
void aPrototype(ref bool);

void myFunc2() {
    bool b;
    aPrototype(b);
    assert(b); // ok
}
```
The truthiness of ``b`` went from false to unknown, because the
function being called is not attributed.
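
As a rough sketch of that three-state idea (this is not code from
dmd or from my engine; ``Tri``, ``join``, ``afterRefCall`` and
``reportFailingAssert`` are names made up for illustration, and it
assumes an assert is only reported when the value is proven false):

```d
// Hypothetical three-state lattice for one property of one variable.
enum Tri { no, yes, unknown }

// Merge the states arriving from two control-flow paths:
// agreement is kept, disagreement degrades to unknown.
Tri join(Tri a, Tri b) pure nothrow @nogc @safe
{
    return a == b ? a : Tri.unknown;
}

// State after the variable is passed by mutable `ref` to a callee.
// `calleeMayWrite` is an assumed summary flag; it is true when the
// callee is an unattributed prototype we know nothing about.
Tri afterRefCall(Tri before, bool calleeMayWrite) pure nothrow @nogc @safe
{
    return calleeMayWrite ? Tri.unknown : before;
}

// Assumption: only a proven `no` is worth reporting for `assert(x)`.
bool reportFailingAssert(Tri truthiness) pure nothrow @nogc @safe
{
    return truthiness == Tri.no;
}

unittest
{
    Tri b = Tri.no;                  // bool b; starts out false
    assert(reportFailingAssert(b));  // assert(b) here would be flagged
    b = afterRefCall(b, true);       // aPrototype(b); nothing known
    assert(b == Tri.unknown);
    assert(!reportFailingAssert(b)); // assert(b) is now accepted
    assert(join(Tri.yes, Tri.no) == Tri.unknown);
}
```

With only "set" and "not-set" available, the call would have to
leave ``b`` as false or force it to true, and either choice gives a
wrong report somewhere.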
Escape into a global variable:
```d
int* global;

void myFunc3(scope int* ptr) {
    global = ptr; // Error: ptr going into global escapes
}
```
Might not error (depends on how it plays out):
```d
int* global;

struct Foo {
    int* field;

    void method() {
        global = field;
    }
}

void myFunc4(scope int* ptr) {
    Foo foo;
    foo.field = ptr;
    foo.method;
}
```
The point of this set of tunings isn't that I want to have
opinions.
The point is to get a subset of escape analysis turned on by
default that the community will find value in, without hitting
any of the many issues that they have perceived with DIP1000.
To implement this, I did review DIP1000 and concluded that it
would be a rewrite. It simply isn't designed to handle these kinds
of properties. At which point you might as well have a DFA for
the escape analysis to live in, and raise the ceiling of
capabilities.
At this month's monthly meeting I asked if people would find value
in the ``myFunc3`` error, and hands did go up. I also raised the
double duty of ``scope`` wrt. stack allocation.
Fact is people find value in a subset of errors from escape
analysis, but it isn't the same subset that Atila is proposing
with his "no inferred scope variable reports" approach. If I'm
not convincing for any reason, I suggest doing a survey to see
how it plays out. I expect you will get four groups:
1. No protection (@system only), tiny minority
2. False positive heavy, as much protection as possible by default,
tiny minority
3. False positive heavy, as much protection as possible as opt-in,
small minority
4. A subset of protection (@safe and the above examples), the
majority
These are based upon my previous survey.
If you can do this without a full rewrite of DIP1000 I will be
impressed. However, you will still need to solve borrowing so
that I can get reference counting.