What does Coverity/clang static analysis actually do?

Thu Oct 1 19:16:02 PDT 2009

Brad Roberts wrote:
> 1) Rich flow control.  They go well beyond what's typically done by 
> compilers during their optimization passes.  They tend to be whole-code 
> in scope and actually DO the parts that are hard, like cross expression 
> variable value tracking similar to a couple examples in this thread.  
> Function boundaries are no obstacle to them.  The only obstacle is where 
> source isn't provided.

Modern compiler optimizers (including dmc and dmd) DO do 
cross-expression variable tracking. They just don't often do it across 
function boundaries (called inter-procedural analysis), and that is 
because of time and memory constraints, not because it is technically 
more difficult.
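
As a rough illustration of cross-expression variable tracking inside a 
single function, here is a minimal sketch of constant propagation over 
straight-line code. The tuple-based mini-IR and the name 
propagate_constants are made up for this example; this is not how dmc or 
dmd represent code, just the idea in its smallest form.

```python
# Toy constant propagation over straight-line code (hypothetical mini-IR).
# Each statement is (target, op, arg1, arg2); args are variable names,
# int literals, or None when unused.

def propagate_constants(stmts):
    """Return a dict of variables whose values are known at the end."""
    known = {}  # variable name -> constant value

    def value_of(arg):
        # An int literal is always known; a variable only if tracked.
        if isinstance(arg, int):
            return arg
        return known.get(arg)

    for target, op, a, b in stmts:
        x, y = value_of(a), value_of(b)
        if op == "copy" and x is not None:
            known[target] = x
        elif op == "add" and x is not None and y is not None:
            known[target] = x + y
        elif op == "mul" and x is not None and y is not None:
            known[target] = x * y
        else:
            # Unknown input (e.g. external data): forget what we knew.
            known.pop(target, None)
    return known


program = [
    ("a", "copy", 3, None),        # a = 3
    ("b", "add", "a", 4),          # b = a + 4, tracked through a
    ("c", "input", None, None),    # c comes from outside, value unknown
    ("d", "mul", "b", 2),          # d = b * 2, tracked through b
    ("e", "add", "c", "d"),        # e depends on c, so it stays unknown
]
result = propagate_constants(program)
print(result)
```

The value of d is derived from a through b, across three separate 
expressions, while e is left unknown because one of its inputs is.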

C and C++ do have some issues with inter-procedural analysis because the 
compiler's view of them is, at heart, a single file at a time. D is much 
more conducive to it because the compiler sees an arbitrarily large part 
of the source code, and can obviously see all of it if the user so 
desires.

> 2) Due to working with whole source bases, the UI for managing the data 
> produced is critical to overall usability.  A lot of time goes into making 
> it easy to manage the output, both for single runs and for cross-run flow 
> of data.  Some examples:
> 
>    * suppression of false positives, 

I'd rather do a better job and not have false positives.

>    * graphing of issue trends

That's a crock <g>.

>    * categorization of issue types

I'm not convinced that is of critical value.

> 3) Rule creation.  The core engine usually generates some digested dataset 
> upon which rules are evaluated.  The systems come with a built-in set that 
> does the 
> sorts of things already talked about.  In addition they come with the 
> ability to develop new rules specific to your application and business 
> needs.  For example:
> 
>    * tracking of taint from user data
>    * what data is acceptable to log to files (for example NOT credit-cards)

There have been several proposals for user-defined attributes for types; 
I think that is better than having some external rule file.
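
To make the contrast concrete, here is a minimal sketch of keeping the 
rule with the type rather than in an external rule file. Python has no 
D-style attributes, so a marker subclass stands in for one, and the check 
happens at run time rather than at compile time. The names Sensitive and 
write_log are hypothetical.

```python
class Sensitive(str):
    """Marker type: data that must never reach a log file."""


def write_log(sink, message):
    # The rule travels with the type instead of living in a rule file.
    if isinstance(message, Sensitive):
        raise TypeError("sensitive data must not be logged")
    sink.append(str(message))


log_lines = []
write_log(log_lines, "user logged in")

try:
    # Placeholder value standing in for a credit card number.
    write_log(log_lines, Sensitive("0000 0000 0000 0000"))
    rejected = False
except TypeError:
    rejected = True

print(log_lines, rejected)
```

With an attribute in the type system, the same rejection could in 
principle be reported by the compiler instead of at run time.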

> 4) They're expected to be slower than compilation, so it's ok to do things 
> that are computationally prohibitive to do during compilation cycles.

I agree.

> ----
> 
> I've seen these tools detect some amazingly subtle bugs in C and C++ code.  
> They're particularly handy in messy code.   They can help find memory 
> leaks where the call graphs are arbitrarily obscure.  They also find sites 
> where NULL pointers are passed into a function that dereferences them 
> without a null check, even when the call graph has many layers.

Once you get the data flow analysis equations right, they'll detect it 
every time regardless of how subtle or layered the call graph is.
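
A minimal sketch of why the depth of the call graph doesn't matter: the 
"may be null" fact is propagated along call edges until nothing changes 
(a fixed point), so a NULL passed in at the top reaches a dereference any 
number of layers down. The call-graph encoding and all function names 
here are invented for the example, and a real analysis would track far 
more than one parameter per function.

```python
def find_null_derefs(forwards, null_call_sites, derefs_unchecked):
    """Report functions that may dereference a NULL parameter.

    forwards: list of (caller, callee) pairs where the caller passes its
        own parameter straight through to the callee.
    null_call_sites: functions called somewhere with a literal NULL.
    derefs_unchecked: functions that dereference their parameter without
        a null check.
    """
    may_be_null = set(null_call_sites)
    changed = True
    while changed:  # iterate the equations to a fixed point
        changed = False
        for caller, callee in forwards:
            if caller in may_be_null and callee not in may_be_null:
                may_be_null.add(callee)
                changed = True
    return sorted(may_be_null & set(derefs_unchecked))


# parse(NULL) is called somewhere; the pointer is forwarded three layers
# down before anything touches it.
forwards = [
    ("parse", "validate"),
    ("validate", "lookup"),
    ("lookup", "read_field"),
    ("render", "draw_field"),  # render is never handed a NULL
]
bugs = find_null_derefs(
    forwards,
    null_call_sites=["parse"],
    derefs_unchecked=["read_field", "draw_field"],
)
print(bugs)
```

Adding more layers between parse and read_field only adds iterations to 
the loop; the result is the same.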

> Yes, rigid contract systems and richer type systems can help reduce the 
> need for some of these sorts of checks, but as we all know, there are 
> tradeoffs.
> 
> 
> That help?

Yes, very much. In particular, I wasn't sure Coverity did 
inter-procedural analysis.