Why use a DFA instead of DIP1000?
Richard (Rikki) Andrew Cattermole
richard at cattermole.co.nz
Tue Sep 16 10:47:03 UTC 2025
On 16/09/2025 9:59 PM, Dennis wrote:
> On Monday, 15 September 2025 at 23:02:25 UTC, Adam Wilson wrote:
>> Here are examples of C#'s DFA in action:
>>
>> (...)
>>
>> There is no rule that states that DFA must be slow as there exist
>> examples disproving the assertion.
>
> When Walter talks about DFA, he refers to the specific algorithm he
> learned in the 1980s and implemented in his backend optimizer, which
> solves data-flow equations in a loop until it converges on a fixed
> point. This algorithm is superlinear, so it is considered 'too slow' for
> the frontend, which aims to analyse n statements in O(n) time.
>
> What Rikki implements is a 4th semantic pass, which analyzes the code
> after the existing 3 linear semantic passes have completed. I don't know
> exactly how it works, but from his communication I gather it's a subset
> of 'full' DFA which trades a bit of accuracy for compile speed.
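For reference, the classical scheme described above re-runs per-block transfer functions, joining predecessor states, until nothing changes. A minimal sketch in D, with hypothetical names rather than anything from dmd's backend:

```d
// Toy fixed-point solver: "facts" here are just "variable may be null" flags.
alias Facts = bool[string];

Facts join(Facts a, Facts b)
{
    Facts r = a.dup;
    foreach (k, v; b)
        r[k] = r.get(k, false) || v;
    return r;
}

struct Block { size_t[] preds; Facts delegate(Facts) transfer; }

Facts[] solve(Block[] blocks)
{
    auto state = new Facts[blocks.length];
    bool changed = true;
    while (changed)   // superlinear: blocks may be revisited many times
    {
        changed = false;
        foreach (i, b; blocks)
        {
            Facts input;
            foreach (p; b.preds)
                input = join(input, state[p]);
            auto output = b.transfer(input);
            if (output != state[i]) { state[i] = output; changed = true; }
        }
    }
    return state;
}
```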
You haven't got my stuff entirely right.
The fast DFA, which is what I'm working on right now, runs in semantic 3.
It's very similar to DIP1000 in that it does a single forward pass with
the statement and expression walkers.
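As a rough illustration (a toy model, not dmd's actual walker API), a single forward pass keeps a per-variable lattice value and updates it as each statement is visited, with no iteration back to a fixed point:

```d
import std.stdio;

enum Nullness { maybeNull, nonNull, unknown }
enum StmtKind { assign, deref }

// Hypothetical toy IR; real statement/expression walkers visit the AST.
struct Stmt { StmtKind kind; string var; Nullness value; }

struct ForwardWalker
{
    Nullness[string] states;   // per-variable lattice value

    void walkStatement(Stmt s)
    {
        final switch (s.kind)
        {
            case StmtKind.assign:
                states[s.var] = s.value;   // an expression walker would compute this
                break;
            case StmtKind.deref:
                if (states.get(s.var, Nullness.unknown) == Nullness.maybeNull)
                    writeln("possible null dereference of ", s.var);
                break;
        }
    }
}

void main()
{
    auto walker = ForwardWalker();
    // One pass, top to bottom.
    foreach (s; [Stmt(StmtKind.assign, "p", Nullness.maybeNull),
                 Stmt(StmtKind.deref, "p", Nullness.unknown)])
        walker.walkStatement(s);
}
```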
The slow DFA, which I want to work on eventually, is indeed more like the
classical DFAs of the '80s and would run in semantic 4. Since it is a
known quantity, and it isn't the default experience that all these
features hinge on having a solution to, it isn't a priority.
Like classical DFAs, the fast DFA does indeed use lattices with join and
meet operators. But I have extra information available, such as depth
checks, write counts, and scope modelling, that classical DFAs do not have.
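For instance, a join at a control-flow merge over a small pointer lattice might look like this (the names are assumptions, not Rikki's actual types):

```d
enum PtrState { bottom, isNull, nonNull, unknown }

// Least upper bound of two facts arriving from different branches.
PtrState join(PtrState a, PtrState b)
{
    if (a == b) return a;
    if (a == PtrState.bottom) return b;
    if (b == PtrState.bottom) return a;
    return PtrState.unknown;   // conflicting facts meet at the top element
}

unittest
{
    // After `if (cond) p = null; else p = &x;` the two branch states join:
    assert(join(PtrState.isNull, PtrState.nonNull) == PtrState.unknown);
}
```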
The fast DFA does something quite interesting: rather than modelling what
is modellable, it models what it can't model, and whatever is left over
is modellable. You need to have an unknown state, with ways of putting
variables into it. This allows me to cull as many false and
perceived-false positives as possible.
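For example (hypothetical user code, my illustration rather than the engine's actual behaviour): once a variable escapes to something the engine can't see into, it is demoted to the unknown state instead of being reported on.

```d
void mystery(ref int* p);   // declaration only; the body isn't visible here

void example()
{
    int* p = null;
    mystery(p);        // p leaves the modellable set: its state becomes unknown
    if (p !is null)
        *p = 1;        // no diagnostic: p is unknown, not provably null
}
```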
Data flow analysis is a derivative of proof assistants. They work on
properties, and sometimes, as in the case of VRP (value range
propagation), they integrate values into their calculations.
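D's frontend already does a little of this: value range propagation lets an int expression narrow to a smaller type when its range provably fits, for example:

```d
void vrpExample(int x)
{
    ubyte b = x & 0xFF;   // accepted: VRP proves the result is in 0 .. 255
    // ubyte c = x;       // rejected: the full int range does not fit in ubyte
}
```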
If it weren't for my zero false & perceived-false positives rule, the
fast DFA engine would be almost as good as a full DFA engine. It's
missing complete interprocedural analysis and single-function cyclicity
handling.
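To make those two gaps concrete, here is a hedged sketch (hypothetical code, my framing) of the cases where such an engine has to stay conservative:

```d
void makeNull(ref int* p) { p = null; }

void needsInterprocedural()
{
    int x;
    int* p = &x;
    makeNull(p);   // without interprocedural analysis the callee's effect on p
    *p = 1;        // is invisible, so this can't be proven to be a null dereference
}

void needsCyclicity(int n)
{
    int* p = new int;
    foreach (i; 0 .. n)
    {
        if (p is null)
            return;   // a single forward pass only sees p as non-null here;
        p = null;     // the fact from this later statement reaches the test
    }                 // above only by iterating the loop body to a fixed point
}
```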
> The C# examples of unused symbols are, from my brief experimentation on
> https://dotnetfiddle.net, even simpler than that. They don't take
> control flow into account at all:
>
> ```C#
> public class Program
> {
>     static int y;
>
>     public static void Main()
>     {
>         int x = 0;
>         if (y < 0)
>             if (y > 0) // this branch is dead code
>                 x++;   // but C# doesn't mark this line as unreachable, or x as unused
>     }
> }
> ```
It does actually.
Godbolt has a control flow graph display for C#.
https://csharp.godbolt.org/z/nMjPqYens
```
Program:Main() (FullOpts):
xor edi, edi
tail.jmp [System.Console:WriteLine(int)]
```
It silently culled the branches. Exactly what DFA should do in this
situation.
> That's not to say it's not useful; I think dmd should compute this
> information similarly and expose it through an LSP interface. However,
> this can be implemented in the existing linear semantic3 pass with a
> simple usage counter/boolean per declaration.
And that would be a rewrite, and I don't mean just of DIP1000.
Doing this stuff as part of the type system isn't the normal way to
handle it, and for good reason. Attaching a DFA engine directly into the
type system today to resolve this is, quite frankly, too late. From what
I've seen that would be a bad direction to go in, and I have no desire
to watch it being tried.