C++ static analysis

Fri May 6 12:33:45 PDT 2011

> Through Reddit I've seen another document that shows common little bugs in C/C++
code found by a static analysis tool. The bugs shown seem to be real, from real
software:
> http://www.slideshare.net/Andrey_Karpov/add2011-en
>
> From the slides I have selected seven of the bugs, here translated to D2:
>
>
> enum uint FLAG1 = 0b001;
> void main() {
>     int x = 1000;
>     int y = 1;
>     if (!y & x) {} //#1
>
>     uint flags;
>     if (flags && FLAG1) {} //#2
>
>     if (y == 0 || 10 || y) {} //#3
>
>     int i;
>     for (i = 0; i < 10; i++)
>         for (i = 1; i < 5; i++) {} //#4
>
>     if (x || x) {} //#5
>
>     if (x) y = 1; else y = 1; //#6
>
>     auto z = x * y ? 10 : 20; //#7
> }
>
>
> Notes about the bugs:
> - In #1 the programmer meant to write if(!(y & x)). This is often a bug.
Actually

> - In #2 the programmer probably meant to use & instead of && to intersect the flags.
> - In #3 the boolean condition is always true, and y is ignored.
> - In #4 there are two nested loops with the same index variable.
> - In #5 the two conditions in or are the exactly the same. When this happens,
it's probably a bug.
> - In #6 the two branches of the if are exactly the same. When this happens, it's
probably a bug.
> - In #7 the programmer may have misunderstood the precedence levels between ?:
operator and the * (mult) operator.
>
> Notes about D2:
> - Bug #1 is common, easy to catch statically, so I think it's worth catching by
D2 language (we have discussed this bug in past too, but the catching code is not
present yet). I think D2 may require parentheses here.
> - Regarding #2, that static analysis tool catches it too, it seems, but I don't
know how to catch it, or if it's worth catching. But it's probably a not so
uncommon bug.

This is actually #3 in disguise.

> - #3 is an example of always true or always false boolean expression (that's not
0/1 or true/false). It's a case that asks for considerations similar of #5 ones.

The problem is that generated/CTFE'd code might produce that kind of redundancy. I
think it would be possible to just catch plain user-written literals, but then,
where exactly to make the cut, considering cases like #2?

> - Bug #4 is probably easy to find, but in D is less common, because you usually
use foreach, or for with local index variable (that D doesn't allow you to
redefine). Once in > a while that kind of code is correct, maybe.

This is already disallowed (deprecated).

> - Bug #5 and #6 are similar. Simular redundancies often are bugs. But I don't
know if D2 wants to actually turn this correct code into a real compile-time
error. But it's worth thinking more about it.

Proving that two sub-expressions are equivalent is a hard task. Has anyone ever
experienced a bug that was hard to find similar to those toy examples? Well, maybe
if(x||y||z||a||b||x||d){/*do stuff*/}, but who writes such code?

> - Bug #7 is easy to catch, and the fix is to ask for more parentheses. But I
think it's not a so common bug.

You can write (most) code without ?:. But if a programmer wants to use it, he has
to be familiar with its precedence rules (very simple, lower precedence than
anything but = and ,). You cannot "fix" it, because any code relying on the
precedence (about any code that uses ?: without using many excessive parens) will
get broken, which includes most of my programs. It would be possible to disallow
?: outside a (), [], ',' (comma), or ?: expression by just a small change in
grammar, so it would be feasible. I as an excessive user of ?: am very much
against it though.

A little anecdote: Apparently, ?: are not only a source of bugs in user code, but
in compilers as well:

int a,b,c,d,e;
int*p=&(a?b?c:d:e);

See http://d.puremagic.com/issues/show_bug.cgi?id=5799.

>
> So:
> - I suggest to change D2 to catch #1 statically.
I think requiring parentheses (providing undefined or ambiguous precedence rules)
is ugly. D has done it before and it seems reasonable to do so, but still ugly. It
also implies that the precedence rules would be a design mistake if not for
compatibility with C. What is the great catch about having & defined on (bool,
int) anyways?

> - I encourage you to think more about what to do with bugs #3, #5, #6 (there is
another related bug, with class/struct members assignments,
http://d.puremagic.com/issues/show_bug.cgi?id=3878 ).
> - Maybe think about fixing #7 too. ?: operators are a rich source of bugs anyway.
What property makes them so bug-prone? I think they are quite easy to use.

>
> Bye,
> bearophile

Timon