Type System Analysis Algorithms
Richard Andrew Cattermole (Rikki)
richard at cattermole.co.nz
Sun Nov 30 07:46:58 UTC 2025
This post may come across as a bit out of left field and may
not appear to be all that relevant to everyone in the D
community, but I would suggest that everyone read it and
try to fully understand what it is talking about. It deals with
how D is analysed in a very fundamental way, and why that
analysis doesn't always work the way one would expect.
## Structural Definition
To start off with, let's define what a type system analysis
algorithm, or TSAA for short, is. A TSAA is an algorithm that
dictates the ordering of symbol analysis and what to do when
cyclicity is encountered.
An example of a TSAA is a simple foreach loop over the members
defined in source order.
```d
void caller() {
    callee();
}

void callee() {
}
```
The ``caller`` function will be analysed before ``callee``. This
does not limit us if we can construct the information
progressively, over multiple passes. Let's call the pass that
constructs the function declaration information pass 2, and the
pass that validates the function bodies pass 3.
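To make that concrete, here is a minimal sketch of such a
two-pass, source-order TSAA; the symbol table and pass
bookkeeping are illustrative assumptions, not dmd's actual data
structures:
```d
// A minimal sketch, assuming a toy symbol table; not dmd's
// actual code, just the shape of the algorithm.
struct Symbol {
    string name;
    string[] callees;   // names referenced by the body, in order
    bool declared;      // produced by pass 2
    bool bodyAnalysed;  // produced by pass 3
}

void analyse(Symbol[] symbols) {
    // Pass 2: construct declaration information in source order.
    foreach (ref s; symbols)
        s.declared = true;

    // Pass 3: validate bodies in source order. A callee defined
    // later has finished pass 2 but not pass 3, so anything
    // pass 3 would infer about it is unavailable here.
    foreach (ref s; symbols) {
        foreach (name; s.callees) {
            // Look up the callee: only its pass 2 (declaration)
            // information is guaranteed to be available here.
        }
        s.bodyAnalysed = true;
    }
}
```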
But it does run into problem cases: it is order dependent, it
cannot handle cyclicity, and if you want templates, they have to
be greedy in their analysis.
This is the algorithm dmd uses for analysis. The static analysis
tools DIP1000, @live and the fast DFA all live within pass 3, and
do not always have access to the information required to operate
with a complete declaration. To demonstrate this:
```d
void caller() @safe {
    scope int* ptr = new int;
    // Error: assigning scope variable `ptr` to non-scope
    // parameter `ptr` calling `callee` is not allowed in a
    // `@safe` function
    callee(ptr);
}

void callee(int* ptr) @safe {
    int val = *ptr;
}
```
Thanks to the TSAA, this example will _never_ compile with
DIP1000 enabled, even if inference was working for the parameter
of ``callee``.
There is a known workaround for this; however, it has the
unfortunate side effect that cyclic function calls will no
longer compile. That is a rather massive sacrifice, and one that
should not be acceptable to anyone.
This workaround requires that each function call be greedy,
forcing the analysis of the callee to complete; when that fails,
it errors, just like how templates error with cycles, except
this is far worse because it affects ordinary functions.
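To illustrate the sacrifice, under such a greedy scheme,
ordinary mutually recursive functions like these would stop
compiling:
```d
// Under the greedy workaround, this ordinary mutual recursion
// would be rejected with a cycle error, as templates are today.
void even(int n) { if (n > 0) odd(n - 1); }
void odd(int n)  { if (n > 0) even(n - 1); }
```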
As an outcome, attributes are not inferred for non-templated
functions; doing so would put the compiler into an inconsistent
state, producing confusing conditional errors.
The name for this algorithm, "structural definition", can be
found in [Principles of Abstract
Interpretation](https://www.amazon.com/gp/product/0262044900),
page 181.
## Chaotic Iterations
A chaotic iteration algorithm is an alternative to the
structural definition algorithm for a TSAA, and it encompasses
the worklist algorithms of traditional data flow analysis. It
can be found on page 253 of Principles of Abstract
Interpretation.
The quality that makes it chaotic iteration is that it is an
iterative algorithm that does not evolve every aspect of a
program simultaneously; some aspects get left out as it performs
the analysis. This works with interprocedural analysis and can
solve the cyclicity problem.
For a worklist algorithm, it operates on blocks of abstract
instructions (an Intermediate Representation, aka IR), and
solves their equations until they are all complete, with each
iteration evolving the knowledge of that block. This is used for
@live, but only within it; its presence does not aid in
interprocedural analysis, which remains limited by the pass 2
and pass 3 structural definition analysis.
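As a rough illustration, here is a minimal sketch of a forward
worklist algorithm over basic blocks; the lattice, blocks, and
transfer function are stand-ins, not dmd's actual
representations:
```d
// A minimal sketch, assuming a forward analysis whose lattice
// join is set union over a bitset; names are illustrative only.
import std.container : DList;

struct Block {
    size_t[] succ; // indices of successor blocks
    size_t[] pred; // indices of predecessor blocks
}

alias State = uint; // stand-in lattice: a small bitset

State[] solve(Block[] blocks, State delegate(size_t, State) transfer) {
    auto outState = new State[blocks.length];
    auto work = DList!size_t();
    foreach (i; 0 .. blocks.length)
        work.insertBack(i);

    while (!work.empty) {
        size_t b = work.front;
        work.removeFront();

        // Join the states flowing in from all predecessors.
        State input = 0;
        foreach (p; blocks[b].pred)
            input |= outState[p];

        // Evolve this block's knowledge; if it changed, its
        // successors must be revisited. The revisit order is
        // whatever the worklist yields, hence "chaotic".
        State output = transfer(b, input);
        if (output != outState[b]) {
            outState[b] = output;
            foreach (s; blocks[b].succ)
                work.insertBack(s);
        }
    }
    return outState;
}
```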
A lecture on chaotic iterations being applied to the worklist
algorithm approach may be found in LNCS volume 1256, [From
chaotic iteration to constraint
propagation](https://link.springer.com/chapter/10.1007/3-540-63165-8_163), with an application of propagating behaviour from dependent blocks to later ones.
This algorithm, put differently, can be found in other books
such as [Advanced Compiler Design and
Implementation](https://www.amazon.com/Advanced-Compiler-Design-Implementation-Muchnick/dp/1558603204), page 231, and [Compilers: Principles, Techniques, and Tools](https://www.amazon.com/Compilers-Principles-Techniques-Tools-2nd/dp/0321486811), page 624 (the page number may not match the edition linked).
Improvements to the overall analysis algorithm exist for a given
metric. An example of this is [Memory-Efficient Fixpoint
Computation](https://arxiv.org/abs/2009.05865), which uses the
iteration strategy from Bourdoncle's [Efficient chaotic iteration
strategies with
widenings](https://link.springer.com/chapter/10.1007/BFb0039704).
By constructing a weak topological ordering, it is possible to
use memory more efficiently during analysis.
As the standard iteration strategy for abstract interpretation,
Bourdoncle's algorithm has been studied for improvements; one
such improvement, [Deterministic Parallel Fixpoint
Computation](https://arxiv.org/abs/1909.05951), saw 1-3x the
performance (with a lot of the results centred on 3x) by going
concurrent.
One of the first algorithms described for abstract
interpretation in the form of compiler optimisations may be
found in the paper [A unified approach to global program
optimization](https://dl.acm.org/doi/10.1145/512927.512945).
This algorithm is a chaotic iteration algorithm and the
precursor to the current worklist algorithms.
Fundamentally, what a chaotic iteration does is take the output
state of a set of statements, expressions, or symbols, then feed
the lattices back into the inputs of other statements,
expressions, or symbols. It uses multiple passes to try to reach
a fixed point where changes no longer take place, resulting in a
convergence of state.
It is the most accurate approach and can be implemented outside
of pass 2 and pass 3. It may be implemented symbolically as a
pass 4 in the compiler; this is notable because pass 4 would not
be dependent upon the order of the earlier passes. As an
example, consider:
```d
foreach(i; 0 .. 5) {
    assert(i < 4);
}
```
By feeding forward the value of ``i ∈ { 0 .. 5 }`` it is
possible to test the assert for each value and see that when ``i
∈ { 4 }`` it will fail.
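A minimal sketch of that feed-forward, assuming a toy interval
type (these names are illustrative, not the compiler's):
```d
// A toy interval standing in for the lattice value of `i`.
struct Interval { long min, max; } // represents min <= i < max

// `assert(i < bound)` holds for every value in the interval
// only if the (exclusive) max is <= bound.
bool assertAlwaysHolds(Interval i, long bound) {
    return i.max <= bound;
}

void example() {
    auto i = Interval(0, 5);          // from `foreach(i; 0 .. 5)`
    assert(!assertAlwaysHolds(i, 4)); // fails when i == 4
}
```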
There are, of course, other TSAAs that may be used in place of
these more well-understood and accepted solutions. An example of
this is [Type Universes as Kripke
Worlds](https://dl.acm.org/doi/10.1145/3747532). The multiple
pass design is a simple tree walk, but unlike the other chaotic
iteration approaches, it uses numbering on types to determine
which variables to leave out, and produces the state to finally
converge upon iteratively. It too can handle cyclicity, but
unlike the other solutions it has a known fixed cost for a given
set of functions, which makes it fairly unique.
## Structural Definition Fast DFA vs DIP1000
Both the fast DFA engine and DIP1000 operate as a single (with
some exceptions) forward walk over the AST as part of pass 3.
Apart from the architectural differences (which are significant),
there is a simple but important tuning difference: what they do
with incomplete information.
As mentioned in Structural Definition, but not defined there:
incomplete information is when a function has not completed pass
3, but another function that references it is in pass 3 and
needs to know about the former. If both have started pass 3 but
neither has completed it, this is a cyclic error.
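A minimal case where this occurs:
```d
// Each function references the other, so whichever enters
// pass 3 first needs information the other has not produced
// yet; if both are mid-pass-3, that is the cyclic case.
void f() { g(); }
void g() { f(); }
```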
Pass 3 performs inference to acquire annotations for ``scope``,
``return``, and ``return ref``. The incomplete information
representation is simply whatever the user wrote.
So in the case:
```d
void caller() @safe {
    scope int* ptr = new int;
    // Error: assigning scope variable `ptr` to non-scope
    // parameter `ptr` calling `callee` is not allowed in a
    // `@safe` function
    callee(ptr);
}

void callee(int* ptr) @safe {
    int val = *ptr;
}
```
The function ``caller`` will see ``callee`` exactly as it is
written, as far as DIP1000 is concerned, and will hence produce
the error when calling it, even though the parameter should be
inferred as ``scope``, and would be if the function were
templated.
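For comparison, making ``callee`` a template (note the extra
``()``) opts it into greedy analysis and inference, so this
sketch of the same code does compile under DIP1000:
```d
void caller() @safe {
    scope int* ptr = new int;
    callee(ptr); // OK: `callee` is a template, so `scope` is inferred
}

// The empty `()` makes this a function template, opting it
// into attribute and `scope` inference.
void callee()(int* ptr) @safe {
    int val = *ptr;
}
```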
Due to the analysis order supplied by structural definitions,
simply turning on inference for all functions would not be
beneficial; instead it would make it very hard to understand why
the errors are conditional.
Alternatively, the fast DFA engine models the unknown state for
a given property, and when that state is seen, it will not
result in an error being emitted. For instance:
```d
void caller() {
    int* ptr = null;
    callee(ptr);
}

void callee(int* ptr) {
    int val = *ptr;
}
```
This will not error, because there is no information about
``callee`` at the time ``caller`` is analysed. But if we
annotate it explicitly (not implemented yet):
```d
void caller() {
    int* ptr = null;
    callee(ptr);
}

void callee(?nonnull int* ptr) {
    int val = *ptr;
}
```
Now ``caller`` will have access to the user-written annotation
for non-null and will error out, as the annotation does not
depend upon pass 3 to exist.
This makes the fast DFA engine progressive; the more information
you give it, the more it can do. Otherwise, it will just try its
best to catch the really stupid stuff.
However, the end result for DIP1000 is that it operates on the
assumption that it always has complete information, and errors
quite happily even when it has incomplete information. This is a
fundamentally flawed design for the implementation, and one
without a fix.
## Chaotic Iterations Slow DFA vs @live
As of now, the slow DFA has not been implemented, and the exact
implementation details have not been decided upon, nor an exact
feature set, so this section is going to be a lot smaller than
the others.
First, to understand @live: yes, it is implemented using chaotic
iteration, but it runs in pass 3, making it, for all intents and
purposes, structural definition. This alone isn't its
fundamental flaw. Its fundamental flaw is that it depends upon
DIP1000, which is itself structural definition, and with that a
fundamentally flawed solution that hasn't got a fix.
It is important to note that while the current @live
implementation depends upon DIP1000 in its exact details, there
is nothing tying it to pass 3 or to DIP1000 as its basis for
annotations. So it may be fixable with some tweaks and minor
reworking, should another escape analysis solution become
available.
As for the slow DFA, all of my current work aligns with it
living in pass 4. I am aware that Walter has some ideas in this
direction also.
In pass 4, a TSAA of the chaotic iteration variety would be used
for determining the relationships between statements,
expressions, and symbols. It would support full interprocedural
analysis. A word of warning about such a static analysis tool:
it's going to create massive graphs of equations, the kind that
fill books for complex functions. It will be slow, memory-heavy,
and will only work on complete information. Due to this, it will
likely error when indirection is involved rather than putting
properties into the unknown state. This makes it undesirable as
an on-by-default solution, but it will find many people
interested in its strong guarantees.
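To give a flavour of those equation graphs, here is a toy sketch
of two mutually dependent function summaries being solved by
chaotic iteration; the lattice and transfer functions are
hypothetical stand-ins:
```d
// A toy illustration of the kind of equation graph pass 4
// would build: two mutually recursive functions whose
// summaries depend on each other. All names are hypothetical.
alias Summary = uint; // stand-in lattice, e.g. escaping params

Summary transferF(Summary g) { return g | 0b01; }
Summary transferG(Summary f) { return f | 0b10; }

void solveCycle() {
    Summary f = 0, g = 0;
    bool changed = true;
    while (changed) { // chaotic iteration over the two equations
        changed = false;
        auto nf = transferF(g);
        auto ng = transferG(f);
        changed = (nf != f) || (ng != g);
        f = nf;
        g = ng;
    }
    assert(f == 0b11 && g == 0b11); // converged fixed point
}
```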