Type System Analysis Algorithms
Richard Andrew Cattermole (Rikki)
richard at cattermole.co.nz
Sun Nov 30 07:46:58 UTC 2025
This post may come across as a bit out of left field and may
not appear to be all that relevant to everyone in the D
community, but I would suggest that everyone read it and
try to fully understand what it is talking about. It deals with
how D is analysed in a very fundamental way, and why that
analysis doesn't always work the way one would expect.
## Structural Definition
To start off with, let's define what a type system analysis
algorithm, or TSAA for short, is. A TSAA is an algorithm that
dictates the ordering of symbol analysis and what to do when
cyclicity is encountered.
An example of a TSAA is a simple foreach loop over the members
defined in source order.
```d
void caller() {
    callee();
}

void callee() {
}
```
The ``caller`` function will be analysed before ``callee``. This
does not limit us if we can construct the information
progressively, over multiple passes. Let's call the pass that
constructs the function declaration information pass 2, and the
pass that validates the function bodies pass 3.
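To make that concrete, here is a minimal sketch of such a
two-pass, source-order TSAA; the symbol table and pass
bookkeeping are illustrative assumptions, not dmd's actual data
structures:
```d
// A minimal sketch, assuming a toy symbol table; not dmd's
// actual code, just the shape of the algorithm.
struct Symbol {
    string name;
    string[] callees;   // names referenced by the body, in order
    bool declared;      // produced by pass 2
    bool bodyAnalysed;  // produced by pass 3
}

void analyse(Symbol[] symbols) {
    // Pass 2: construct declaration information in source order.
    foreach (ref s; symbols)
        s.declared = true;

    // Pass 3: validate bodies in source order. A callee defined
    // later has finished pass 2 but not pass 3, so anything
    // pass 3 would infer about it is unavailable here.
    foreach (ref s; symbols) {
        foreach (name; s.callees) {
            // Look up the callee: only its pass 2 (declaration)
            // information is guaranteed to be available here.
        }
        s.bodyAnalysed = true;
    }
}
```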
But it does run into problem cases: it is order dependent, it
cannot handle cyclicity, and if you want templates, they have to
be greedy in their analysis.
This is the algorithm dmd uses for analysis. The static analysis
tools DIP1000, @live and the fast DFA all live within pass 3, and
do not always have access to the information required to operate
with a complete declaration. To demonstrate this:
```d
void caller() @safe {
    scope int* ptr = new int;
    // Error: assigning scope variable `ptr` to non-scope
    // parameter `ptr` calling `callee` is not allowed in a
    // `@safe` function
    callee(ptr);
}

void callee(int* ptr) @safe {
    int val = *ptr;
}
```
Thanks to the TSAA, this example will _never_ compile with
DIP1000 enabled, even if inference was working for the parameter
of ``callee``.
There is a known workaround for this; however, it has the
unfortunate side effect that cyclic function calls will no
longer compile. That is a rather massive sacrifice, and one that
should not be acceptable to anyone.
This workaround requires that each function call be greedy,
forcing the analysis of the callee to complete; when that fails,
it errors, just like how templates error with cycles, except
this is far worse because it affects ordinary functions.
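To illustrate the sacrifice, under such a greedy scheme,
ordinary mutually recursive functions like these would stop
compiling:
```d
// Under the greedy workaround, this ordinary mutual recursion
// would be rejected with a cycle error, as templates are today.
void even(int n) { if (n > 0) odd(n - 1); }
void odd(int n)  { if (n > 0) even(n - 1); }
```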
As an outcome, attributes are not inferred for non-templated
functions; doing so would put the compiler into an inconsistent
state, producing confusing conditional errors.
The name for this algorithm, "structural definition", can be
found in [Principles of Abstract
Interpretation](https://www.amazon.com/gp/product/0262044900),
page 181.
## Chaotic Iterations
A chaotic iteration algorithm is an alternative to the
structural definition algorithm for a TSAA, and it encompasses
the worklist algorithms of traditional data flow analysis. It
can be found on page 253 of Principles of Abstract
Interpretation.
The quality that makes it chaotic iteration is that it is an
iterative algorithm that does not evolve every aspect of a
program simultaneously; some aspects get left out as it performs
the analysis. This works with interprocedural analysis and can
solve the cyclicity problem.
For a worklist algorithm, it operates on blocks of abstract
instructions (an Intermediate Representation, aka IR), and
solves their equations until they are all complete, with each
iteration evolving the knowledge of that block. This is used for
@live, but only within it; its presence does not aid in
interprocedural analysis, which remains limited by the pass 2
and pass 3 structural definition analysis.
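As a rough illustration, here is a minimal sketch of a forward
worklist algorithm over basic blocks; the lattice, blocks, and
transfer function are stand-ins, not dmd's actual
representations:
```d
// A minimal sketch, assuming a forward analysis whose lattice
// join is set union over a bitset; names are illustrative only.
import std.container : DList;

struct Block {
    size_t[] succ; // indices of successor blocks
    size_t[] pred; // indices of predecessor blocks
}

alias State = uint; // stand-in lattice: a small bitset

State[] solve(Block[] blocks, State delegate(size_t, State) transfer) {
    auto outState = new State[blocks.length];
    auto work = DList!size_t();
    foreach (i; 0 .. blocks.length)
        work.insertBack(i);

    while (!work.empty) {
        size_t b = work.front;
        work.removeFront();

        // Join the states flowing in from all predecessors.
        State input = 0;
        foreach (p; blocks[b].pred)
            input |= outState[p];

        // Evolve this block's knowledge; if it changed, its
        // successors must be revisited. The revisit order is
        // whatever the worklist yields, hence "chaotic".
        State output = transfer(b, input);
        if (output != outState[b]) {
            outState[b] = output;
            foreach (s; blocks[b].succ)
                work.insertBack(s);
        }
    }
    return outState;
}
```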
A lecture on chaotic iterations being applied to the worklist
algorithm approach may be found in LNCS volume 1256, [From
chaotic iteration to constraint
propagation](https://link.springer.com/chapter/10.1007/3-540-63165-8_163), with an application of propagating behaviour from dependent blocks to later ones.
This algorithm, put differently, can be found in other books
such as [Advanced Compiler Design and
Implementation](https://www.amazon.com/Advanced-Compiler-Design-Implementation-Muchnick/dp/1558603204), page 231, and [Compilers: Principles, Techniques, and Tools](https://www.amazon.com/Compilers-Principles-Techniques-Tools-2nd/dp/0321486811), page 624 (the page number may not match the edition linked).
Improvements to the overall analysis algorithm exist for a given
metric. An example of this is [Memory-Efficient Fixpoint
Computation](https://arxiv.org/abs/2009.05865), which uses the
iteration strategy from Bourdoncle's [Efficient chaotic iteration
strategies with
widenings](https://link.springer.com/chapter/10.1007/BFb0039704).
By constructing a weak topological ordering, it is possible to
use memory more efficiently during analysis.
As the standard iteration strategy for abstract interpretation,
Bourdoncle's algorithm has been studied for improvements; one
such improvement, [Deterministic Parallel Fixpoint
Computation](https://arxiv.org/abs/1909.05951), saw 1-3x the
performance (with a lot of the results centred on 3x) by going
concurrent.
One of the first algorithms described for abstract
interpretation in the form of compiler optimisations may be
found in the paper [A unified approach to global program
optimization](https://dl.acm.org/doi/10.1145/512927.512945).
This algorithm is a chaotic iteration algorithm and the
precursor to the current worklist algorithms.
Fundamentally, what a chaotic iteration does is take the output
state of a set of statements, expressions, or symbols, then feed
the lattices back into the inputs of other statements,
expressions, or symbols. It uses multiple passes to try to reach
a fixed point where changes no longer take place, resulting in a
convergence of state.
It is the most accurate approach and can be implemented outside
of pass 2 and pass 3. It may be implemented symbolically as a
pass 4 in the compiler; this is notable because pass 4 would not
be dependent upon the order of the earlier passes. As an
example, consider:
```d
foreach(i; 0 .. 5) {
    assert(i < 4);
}
```
By feeding forward the value of ``i ∈ { 0 .. 5 }`` it is
possible to test the assert for each value and see that when ``i
∈ { 4 }`` it will fail.
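A minimal sketch of that feed-forward, assuming a toy interval
type (these names are illustrative, not the compiler's):
```d
// A toy interval standing in for the lattice value of `i`.
struct Interval { long min, max; } // represents min <= i < max

// `assert(i < bound)` holds for every value in the interval
// only if the (exclusive) max is <= bound.
bool assertAlwaysHolds(Interval i, long bound) {
    return i.max <= bound;
}

void example() {
    auto i = Interval(0, 5);          // from `foreach(i; 0 .. 5)`
    assert(!assertAlwaysHolds(i, 4)); // fails when i == 4
}
```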
There are, of course, other TSAAs that may be used in place of
these more well-understood and accepted solutions. An example of
this is [Type Universes as Kripke
Worlds](https://dl.acm.org/doi/10.1145/3747532). The multiple
pass design is a simple tree walk, but unlike the other chaotic
iteration approaches, it uses numbering on types to determine
which variables to leave out, and produces the state to finally
converge upon iteratively. It too can handle cyclicity, but
unlike the other solutions it has a known fixed cost for a given
set of functions, which makes it fairly unique.
## Structural Definition Fast DFA vs DIP1000
Both the fast DFA engine and DIP1000 operate as a single (with
some exceptions) forward walk over the AST as part of pass 3.
Apart from the architectural differences (which are significant),
there is a simple but important tuning difference: what they do
with incomplete information.
As mentioned in Structural Definition, but not defined there:
incomplete information is when a function has not completed pass
3, but another function that references it is in pass 3 and
needs to know about the former. If both have started pass 3 but
neither has completed it, this is a cyclic error.
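A minimal case where this occurs:
```d
// Each function references the other, so whichever enters
// pass 3 first needs information the other has not produced
// yet; if both are mid-pass-3, that is the cyclic case.
void f() { g(); }
void g() { f(); }
```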
Pass 3 performs inference to acquire annotations for ``scope``,
``return``, and ``return ref``. The incomplete information
representation is simply whatever the user wrote.
So in the case:
```d
void caller() @safe {
    scope int* ptr = new int;
    // Error: assigning scope variable `ptr` to non-scope
    // parameter `ptr` calling `callee` is not allowed in a
    // `@safe` function
    callee(ptr);
}

void callee(int* ptr) @safe {
    int val = *ptr;
}
```
The function ``caller`` will see ``callee`` exactly as it is
written, as far as DIP1000 is concerned, and will hence produce
the error when calling it, even though the parameter should be
inferred as ``scope``, and would be if the function were
templated.
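For comparison, making ``callee`` a template (note the extra
``()``) opts it into greedy analysis and inference, so this
sketch of the same code does compile under DIP1000:
```d
void caller() @safe {
    scope int* ptr = new int;
    callee(ptr); // OK: `callee` is a template, so `scope` is inferred
}

// The empty `()` makes this a function template, opting it
// into attribute and `scope` inference.
void callee()(int* ptr) @safe {
    int val = *ptr;
}
```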
Due to the analysis order supplied by structural definitions,
simply turning on inference for all functions would not be
beneficial; instead it would make it very hard to understand why
the errors are conditional.
Alternatively, the fast DFA engine models the unknown state for
a given property, and when that state is seen, it will not
result in an error being emitted. For instance:
```d
void caller() {
    int* ptr = null;
    callee(ptr);
}

void callee(int* ptr) {
    int val = *ptr;
}
```
This will not error, because there is no information about
``callee`` at the time ``caller`` is analysed. But if we
annotate it explicitly (not implemented yet):
```d
void caller() {
    int* ptr = null;
    callee(ptr);
}

void callee(?nonnull int* ptr) {
    int val = *ptr;
}
```
Now ``caller`` will have access to the user-written annotation
for non-null and will error out, as the annotation does not
depend upon pass 3 to exist.
This makes the fast DFA engine progressive; the more information
you give it, the more it can do. Otherwise, it will just try its
best to catch the really stupid stuff.
However, the end result for DIP1000 is that it operates on the
assumption that it always has complete information, and errors
quite happily even when it has incomplete information. This is a
fundamentally flawed design for the implementation, and one
without a fix.
## Chaotic Iterations Slow DFA vs @live
As of now, the slow DFA has not been implemented, and the exact
implementation details have not been decided upon, nor an exact
feature set, so this section is going to be a lot smaller than
the others.
First, to understand @live: yes, it is implemented using chaotic
iteration, but it runs in pass 3, making it, for all intents and
purposes, structural definition. This alone isn't its
fundamental flaw. Its fundamental flaw is that it depends upon
DIP1000, which is itself structural definition, and with that a
fundamentally flawed solution that hasn't got a fix.
It is important to note that while the current @live
implementation depends upon DIP1000 in its exact details, there
is nothing tying it to pass 3 or to DIP1000 as its basis for
annotations. So it may be fixable with some tweaks and minor
reworking, should another escape analysis solution become
available.
As for the slow DFA, all of my current work aligns with it
living in pass 4. I am aware that Walter has some ideas in this
direction also.
In pass 4, a TSAA of the chaotic iteration variety would be used
for determining the relationships between statements,
expressions, and symbols. It would support full interprocedural
analysis. A word of warning about such a static analysis tool:
it's going to create massive graphs of equations, the kind that
fill books for complex functions. It will be slow, memory-heavy,
and will only work on complete information. Due to this, it will
likely error when indirection is involved rather than putting
properties into the unknown state. This makes it undesirable as
an on-by-default solution, but it will find many people
interested in its strong guarantees.
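To give a flavour of those equation graphs, here is a toy sketch
of two mutually dependent function summaries being solved by
chaotic iteration; the lattice and transfer functions are
hypothetical stand-ins:
```d
// A toy illustration of the kind of equation graph pass 4
// would build: two mutually recursive functions whose
// summaries depend on each other. All names are hypothetical.
alias Summary = uint; // stand-in lattice, e.g. escaping params

Summary transferF(Summary g) { return g | 0b01; }
Summary transferG(Summary f) { return f | 0b10; }

void solveCycle() {
    Summary f = 0, g = 0;
    bool changed = true;
    while (changed) { // chaotic iteration over the two equations
        changed = false;
        auto nf = transferF(g);
        auto ng = transferG(f);
        changed = (nf != f) || (ng != g);
        f = nf;
        g = ng;
    }
    assert(f == 0b11 && g == 0b11); // converged fixed point
}
```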