What does Coverity/clang static analysis actually do?
bearophile
bearophileHUGS at lycos.com
Mon Oct 5 17:24:12 PDT 2009
Sorry for the delay in sending this small answer...
Walter Bright:
>5. dead code (code that can never be executed)<
In particular whole unused functions, methods, classes, modules, packages.
Also rather important: unused constants and variables.
GCC has a function attribute named "used" that tells the optimizer to keep a function in the binary even if it appears to be unused (because it may be used in strange ways from asm, etc).
>The =void case is rare enough to be irrelevant.<
In good D code it's very uncommon, but spotting such bugs too is a good thing.
(And eventually D variables may not be initialized anymore?)
>6. Arrays are solidly covered by a runtime check. There is code in the optimizer to detect many cases of overflows at compile time, but the code is currently disabled because the runtime check covers 100% of the cases.<
Spotting such bugs at compile time, where possible, is 200% better (if such a test is reliable).
>8. D2 has acquired some decent checking for this.<
In my opinion it's not enough; some optional runtime tests would be useful too.
>More along these lines will be interesting.<
Sure. There's a lot that can still be done.
Other possible bugs:
- optional runtime errors when a pointer goes out of the allocated memory (C# does something like this).
- safety: people who try to mess with the stack to break the safety of a program. Also stack canaries (now they can be used with LLVM).
- errors in indentation, when an if-then-else has an indentation that's the opposite of its semantics. New GCC can spot this problem.
- finding places where a structure is modified but the change has no effect on the program, because the structure was copied by value. This is tricky, and I've fallen into this trap: for example, when modifying an array of structs, if you don't use "ref" you will not change the original array. Likewise, if inside a function that takes a string as input you change the string length, the change will not be seen outside the function unless you mark the string as "ref", etc.
- Spotting bugs in "switch", for example where a programmer may have forgotten a break. I don't know if this can be done, probably not.
- There are many other possible bugs. I suggest reading online lists of common bugs in C/C++/Java/C# programs, and trying to give the compiler ways to spot some of them. Lints like Splint and gimpel are designed for this, see below.
- disallowing implicit signed=>unsigned conversions.
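To make the last item concrete, here is a minimal C sketch of the signed=>unsigned trap (C rather than D only because the lints mentioned in this post target C). The function names are hypothetical, chosen just for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Silent conversion: a negative int wraps around to a huge unsigned value. */
unsigned int to_count_unchecked(int n)
{
    unsigned int count = n;  /* implicit signed => unsigned conversion */
    return count;
}

/* Hypothetical safer variant: reject negatives, then convert explicitly. */
unsigned int to_count_checked(int n)
{
    assert(n >= 0);          /* runtime guard against negative input */
    return (unsigned int)n;  /* the conversion is now visible in the source */
}
```

Calling to_count_unchecked(-1) gives UINT_MAX with no diagnostic under default compiler settings. As far as I know GCC reports the first function only when -Wsign-conversion is requested; a compiler that disallowed the implicit form would force something like the second.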
You can also take a look at Splint:
http://www.splint.org/
Splint also shows that adding a few annotations to the source code can save you from many bugs. For example, annotations that assert that a pointer is never null, etc.
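A small sketch of what such annotations look like in C. The null annotation is written as a stylized comment in Splint's syntax; the function names are hypothetical, and the comments describe my understanding of Splint's defaults, so check its manual before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The annotation comment on the parameter says s may be NULL, so the
   checker expects a test before s is dereferenced. */
size_t length_or_zero(/*@null@*/ const char *s)
{
    if (s == NULL)
        return 0;
    return strlen(s);
}

/* No annotation: the pointer is assumed to be non-null (to my knowledge
   this is Splint's default). The assert mirrors that promise at run time. */
size_t length_nonnull(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}
```

A normal C compiler ignores the annotation, since it is just a comment, so the same file builds unchanged with or without the lint.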
And gimpel:
http://www.gimpel.com/
The bug of the month now is interactive, so you can use it to test how good this software is:
http://www.gimpel-online.com/bugsLinkPage.html
The very large list of possible bugs/problems spotted by gimpel software:
http://gimpel-online.com/MsgRef.html
The Cyclone language is a safe variant of C; it can give some ideas.
C# 4 too surely has some safety ideas that D is still missing.
For example warning every time a variable is supposed to be converted automatically into another type of variable with some information loss (e.g. double => float, etc).
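To show the kind of information loss meant here, a minimal C sketch (hypothetical function names) that detects when a double => float conversion changes the value:

```c
#include <assert.h>

/* Returns 1 when converting d to float and back changes its value.
   NaN also reports 1, because NaN never compares equal to itself. */
int narrowing_loses_information(double d)
{
    float narrowed = (float)d;
    return (double)narrowed != d;
}

/* Hypothetical checked narrowing: only accepts values float can hold exactly. */
float narrow_exact(double d)
{
    assert(!narrowing_loses_information(d));
    return (float)d;
}
```

For instance 0.5 survives the round trip, while 0.1 does not, because 0.1 is stored with more significant bits in a double than a float can keep. A compiler warning on the implicit form of this conversion would point at exactly these cases without any run-time cost.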
Bye,
bearophile