C is Brittle D is Plastic
H. S. Teoh
hsteoh at qfbox.info
Wed Apr 8 20:34:10 UTC 2026
On Wed, Apr 08, 2026 at 01:01:01PM -0700, Walter Bright via Digitalmars-d wrote:
> On 4/7/2026 4:52 PM, Richard (Rikki) Andrew Cattermole wrote:
[...]
> > 2. This is solved, without the use of meet operations. Basically the
> > entire state context gets duplicated on if statement completion, one
> > for each branch, and everything after gets evaluated with both
> > contexts. This appears to be why GCC will only error on one branch
> > with the static analyzer errors. This has a research paper on it,
> > and I was able to come up with it on my own too. I haven't done it
> > in the fast DFA engine because its not exactly fast.
>
> The program's flow depends on its inputs, and a static analyzer
> doesn't know what they are.
[...]
I've dabbled a bit in algorithms of this kind. Generally, you don't
work with concrete values, because then you'll run into performance
issues and the good ole halting problem. Instead, you work at a
slightly more abstract level, such as with value sets (a generalization
of VRP actually). You assume that parameters can take on any value in
their type (and perhaps a fancy analyser will reduce this initial set
according to in-contracts -- but that's only possible for simple
contracts; with D's contracts a halting problem can potentially lurk
here; if the contract is too complex the algorithm can just take the
worst case of taking the full set of values for that type). Then you
iterate over the statements and eliminate portions of this set, e.g.
when the parameter is assigned to, etc., and propagate the set over
assignments.
You also don't do actual flow, because that can be arbitrarily complex
and lead to the halting problem; instead, you do a simplified flow: when
you encounter an if-statement, you step over both branches and compute
the value set for both, then at the end you assume the worst case and
take their union. Loops are handled the same way: you step through the
loop body once, then assume the worst and reduce the set only if you're
able to prove that *every* loop iteration reduces it. If you can't prove
that, assume the worst and take the unreduced set prior to the loop.
There may be some simple loops (e.g. foreach) where more advanced
reasoning can be applied, but basically you apply special cases where
you can to reduce the value set, and where you can't just assume the
worst / take the conservative assumption.
Then at the end, if you're able to reduce the set, then the missing
elements are provably impossible, and you can reason about whether the
remaining values can possibly overrun bounds.
But as Rikki said, all of this won't be fast. And obviously there will
be cases it cannot catch. So it's a toss-up whether such a thing is
worth implementing in a compiler.
T
--
"You know, maybe we don't *need* enemies." "Yeah, best friends are about all I can take." -- Calvin & Hobbes
More information about the Digitalmars-d
mailing list