Why we should not enable a slow DFA by default
Richard Andrew Cattermole (Rikki)
richard at cattermole.co.nz
Fri Dec 19 06:38:21 UTC 2025
Today I had a question: what if I made the fast DFA engine
support multiple passes, to see if additional executions would
catch more interprocedural issues? And what would the cost of
doing so be?
For reference, please see my post from last month on [Type System
Analysis
Algorithms](https://forum.dlang.org/thread/waerqsesrohtokjctnyj@forum.dlang.org) for definitions.
The fast DFA engine is a single forward pass, more commonly known
as a structural definition algorithm. This makes its cost O(1) in
terms of passes over your code.
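To make that concrete, here is a minimal sketch of a single
forward pass (my own illustration, not code from dmd): each
statement of a function is visited exactly once, in order, and
facts only ever flow forward.

```d
import std.stdio;

// Illustrative lattice value: what we currently know about a variable.
enum Fact { unknown, nonNull, maybeNull }

// Illustrative statement: a single assignment to a variable.
struct Stmt
{
    string target;      // variable written by this statement
    bool mayBeNull;     // whether the assigned value may be null
}

// One forward pass: each statement is visited exactly once, in order,
// so the cost is linear in the number of statements, with no revisiting.
Fact[string] forwardPass(const Stmt[] stmts)
{
    Fact[string] facts;
    foreach (s; stmts)
        facts[s.target] = s.mayBeNull ? Fact.maybeNull : Fact.nonNull;
    return facts;
}

void main()
{
    writeln(forwardPass([Stmt("p", true), Stmt("p", false), Stmt("q", true)]));
}
```

Because nothing is ever fed back into an earlier statement,
nothing needs to be revisited, which is where the single-pass cost
characteristic comes from.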
## Methodology
The codebase I was running on is the big 100k LOC one that I have
been using to verify the fast DFA engine against; it normally
builds in 4.2-4.3s when dmd is built with ldc with optimisations.
It has 25643 functions that the fast DFA is allowed to analyse.
The multi-pass call occurred after all semantic analysis phases
had completed. I have been referring to this approach as
"semantic 4".
## Results
Firstly, the results of the test: only one additional pass made
any difference; after that, it completed without any new
information to take into account.
Not all functions in the codebase benefit from the multi-pass
design; 19048 were fully analysed in the original execution
during semantic 3.
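In other words, the multi-pass driver behaves roughly like the
following sketch (hypothetical names, not dmd's actual code):
functions already completed in semantic 3 are skipped, the rest
are re-run, and iteration stops once a pass completes nothing new.

```d
import std.stdio;

struct Func
{
    string name;
    bool fullyAnalysed;   // set once the fast DFA has everything it needs
}

// Stand-in for one execution of the fast DFA over a single function;
// returns true if the function could be completed this time around.
alias Analyse = bool delegate(ref Func f);

void multiPass(Func[] funcs, Analyse analyse)
{
    for (size_t pass = 1; ; pass++)
    {
        size_t completed;
        foreach (ref f; funcs)
        {
            if (f.fullyAnalysed)
                continue;           // already done in an earlier pass
            if (analyse(f))
            {
                f.fullyAnalysed = true;
                completed++;
            }
        }
        writefln("pass %s completed %s functions", pass, completed);
        if (completed == 0)         // no new work was done, stop iterating
            break;
    }
}

void main()
{
    auto funcs = [Func("a"), Func("b"), Func("c")];
    size_t calls;
    // Dummy analyser that fails on its second call only, so one
    // function spills over into the second pass.
    multiPass(funcs, (ref Func f) => calls++ != 1);
}
```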
The first pass took 156ms to complete 6595 functions.
The second pass took 30ms to complete 1103 functions.
After the second pass, the iterator stopped because no further
work was being completed.
So the second pass handled roughly 17% as many functions as the
first, and scaling the first pass's time by that percentage
predicts around 27ms, which is in line with the measured 30ms.
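For what it's worth, the arithmetic behind that prediction is just
the following (my own calculation from the figures above, not an
additional measurement):

```d
import std.stdio;

void main()
{
    // 1103 of the 6595 first-pass functions needed a second pass.
    const ratio = 1103.0 / 6595.0;                            // ≈ 0.17, i.e. ~17%
    writefln("predicted second pass: %.0f ms", ratio * 156);  // ≈ 26-27ms
}
```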
No reports were generated in any execution of the fast DFA engine
for this codebase.
No reports were fixed as a result of the multi-pass design.
### Compared to a slow DFA
Slower DFAs, such as those that use chaotic iteration for their
TSAA, are known to be exponential in cost.
This is due to a foundational principle of feeding outputs back
into inputs, such as with loops, until they converge to a fixed
point that no longer changes.
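For contrast with the single forward pass above, here is a
minimal sketch of that scheme, a textbook worklist/fixed-point
iteration (again my own illustration, not anyone's actual
implementation): block outputs are fed back into their successors'
inputs, and blocks are requeued until nothing changes.

```d
import std.stdio;

struct Block
{
    size_t[] succs;   // successor block indices; a loop feeds back here
    uint gen;         // facts this block generates (as a bitset)
}

// Chaotic/worklist iteration: keep pushing facts around the graph
// until a whole sweep produces no change, i.e. a fixed point.
uint[] fixedPoint(Block[] blocks)
{
    auto facts = new uint[blocks.length];   // "in" facts per block
    size_t[] work;
    foreach (i; 0 .. blocks.length)
        work ~= i;

    while (work.length)
    {
        const b = work[0];
        work = work[1 .. $];

        // Transfer function: whatever flows in, plus what we generate.
        const outFacts = facts[b] | blocks[b].gen;
        foreach (s; blocks[b].succs)
        {
            const merged = facts[s] | outFacts;
            if (merged != facts[s])   // output changed, feed it back in
            {
                facts[s] = merged;
                work ~= s;            // successor must be revisited
            }
        }
    }
    return facts;
}

void main()
{
    // A two-block loop: 0 -> 1 -> 0. Iteration only stops once the
    // facts of both blocks stop changing.
    auto blocks = [Block([1], 0b01), Block([0], 0b10)];
    writeln(fixedPoint(blocks));   // [3, 3]: both converge to the union
}
```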
For this reason, we can make a rough guesstimate of how long it
could take for those 6595 functions.
Assuming the cost is ``O(n²)`` per function, that 156ms would
become around 2.4 seconds.
In practice, a single loop should only be considered ``O(n²)``;
however, for each additional level of loop nesting, we should add
``+ 1`` to that exponent. So as guesstimates go, it isn't going
to be too far off.
## Conclusion
At this point, I am unconvinced that adding a second pass in the
form of semantic 4 would be worth my time to implement. It may
yet become necessary to support a borrow checker, which would
need much stronger guarantees than escape analysis provides.
What this post shows, in part, is why I am working on the fast
DFA engine rather than cracking open [Principles of Abstract
Interpretation](https://www.amazon.com/gp/product/0262044900) and
going to town to get us a provably sound static analyser. Even
once it is built, there will be plenty of times when running it
won't be desirable. Making it opt-in would be a better default.