Carmack on static analysis

bearophile bearophileHUGS at lycos.com
Sat Dec 24 04:42:41 PST 2011


A new blog post by the excellent John Carmack; I like how readable it is:
http://altdevblogaday.com/2011/12/24/static-code-analysis/

If you don't know who he is:
http://en.wikipedia.org/wiki/John_Carmack


Besides the article's main point, which is to use static analysis tools more, here are some comments:


>Anything that isn't crystal clear to a static analysis tool probably isn't clear to your fellow programmers, either.<

It seems that elsewhere Carmack has said something related:
>if the compiler can't figure out that it's safe, then it's probably hard for a human to figure that out too, and they are likely to get it wrong.<

But human minds and the logic of lint programs are two cognitive (or pre-cognitive) systems that work in very different ways. Often what's obvious and easy for a mammalian brain is invisible to a lint tool, and what a lint tool spots easily is hard for a human brain to find. So both are needed. The difference in the way computers and people "think" is one of the reasons computers are so useful to our society.

Here are some examples of /analyze behaving badly:
http://randomascii.wordpress.com/2011/09/13/analyze-for-visual-studio-the-ugly-part-5/
The same blog shows many other similar cases.


>NULL pointers are the biggest problem in C/C++, at least in our code.  The dual use of a single value as both a flag and an address causes an incredible number of fatal issues.<

This is a big problem, and one that D doesn't solve either.
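
A minimal sketch of what I mean: in D, class references default to null, and dereferencing one compiles without complaint and fails only at run time, just as in C++:

import std.stdio;

class Widget
{
    int size;
}

void main()
{
    Widget w;             // class references default to null in D
    // writeln(w.size);   // compiles fine, but crashes at run time
    if (w is null)
        writeln("no compile-time protection against this");
}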


>Printf format string errors were the second biggest issue in our codebase, heightened by the fact that passing an idStr instead of idStr::c_str() almost always results in a crash, but annotating all our variadic functions with /analyze annotations so they are properly type checked kills this problem dead.<

A problem that, so far, people haven't wanted to mitigate in D.
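
A small sketch (writefln is D's printf-style function): a format mismatch in D compiles silently, and the error surfaces only at run time as a format exception, instead of being rejected by the compiler:

import std.stdio;

void main()
{
    // The %d specifier doesn't match the string argument,
    // yet this compiles; it throws only when it runs.
    writefln("%d", "not an integer");
}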


>A lot of the serious reported errors are due to modifications of code long after it was written.  An incredibly common error pattern is to have some perfectly good code that checks for NULL before doing an operation, but a later code modification changes it so that the pointer is used again without checking.  Examined in isolation, this is a comment on code path complexity, but when you look back at the history, it is clear that it was more a failure to communicate preconditions clearly to the programmer modifying the code.<

D reserves dedicated syntax for preconditions (the "in" contract), so I think this is a smaller problem in D than in C++.
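
A minimal sketch of that syntax (D2 contracts, checked automatically in non-release builds): the precondition travels with the function, so a programmer modifying the code later still sees it:

int deref(int* p)
in
{
    assert(p !is null, "deref requires a non-null pointer");
}
body
{
    return *p;
}

void main()
{
    int x = 42;
    assert(deref(&x) == 42);
}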


>There was a paper recently that noted that all of the various code quality metrics correlated at least as strongly with code size as error rate, making code size alone give essentially the same error predicting ability.  Shrink your important code.<

In theory a functional style is a good way to shrink code, but in D functional-style code is a jungle of (({}){()}), so it's hard to write and hard to read.
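
A small example of what I mean (current D syntax, with no lambda shorthand available): even a simple filter/map chain drowns in brackets:

import std.algorithm, std.range, std.stdio;

void main()
{
    // Squares of the even numbers in 0..9.
    auto result = map!((int x){ return x * x; })(
                      filter!((int x){ return x % 2 == 0; })(iota(10)));
    writeln(result);  // prints [0, 4, 16, 36, 64]
}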

Bye,
bearophile

