What does Coverity/clang static analysis actually do?

Nick Sabalausky a at a.a
Thu Oct 1 20:08:40 PDT 2009


 "Walter Bright" <newshound1 at digitalmars.com> wrote in message 
news:ha37cf$4l2$1 at digitalmars.com...
> Nick Sabalausky wrote:
>>> 3. use of uninitialized variables (no reaching definition)
>>> 3. Optimizer detects and reports it. Irrelevant for D, though, because 
>>> variables are always initialized. The =void case is rare enough to be 
>>> irrelevant.
>>>
>>
>> D variable default-initialization is absolutely no different from your 
>> scenario of a programmer blindly tossing in =0 to shut up a compiler, 
>> *except* that the programmer is never even given the opportunity to do 
>> the right thing. This is *bad*. I *want* variables that haven't been 
>> manually inited to be statically treated as uninited. C# does this and it 
>> works great.
>
> The difference is the maintenance programmer won't be left puzzling why 
> there is an explicit assignment to the variable that is never used.

Why would there be one? Under a "must assign value before reading" rule, the 
compiler only complains where a variable is actually read. So any assignment 
the programmer adds to satisfy the rule necessarily has a read downstream of 
it; an assignment to a variable that is never used would never have been 
prompted in the first place.
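
To make that concrete, here's a sketch (in D syntax; 'compute' and the 
args-length check are made-up stand-ins for an arbitrary condition) of how a 
C#-style definite-assignment rule plays out:

    import std.stdio;

    int compute() { return 42; }   // hypothetical helper

    void main(string[] args)
    {
        int x;                     // no complaint yet
        if (args.length > 1)
            x = compute();
        // A definite-assignment rule rejects this read, because 'x' is
        // unassigned on the path where the 'if' branch wasn't taken:
        writeln(x);
    }

If the programmer silences that by writing "int x = 0;", the 0 is *not* a 
dead assignment: it's exactly the value writeln sees whenever the branch 
isn't taken. So there's nothing for a maintenance programmer to puzzle over.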

> The point to default initialization is consistency in the resulting 
> behavior.

Yes, I'm aware that D's default-initing came about as a better alternative 
to C's "let's introduce unreliable randomness" approach. But we're way past 
that, and no one's comparing it to C. More importantly, it does nothing to 
address my argument, which compares it to the C# way. I'll explain with a 
scenario you may initially find very familiar:

Johnny is using a language that statically requires variables to have been 
verifiably written to before they can be read. Johnny compiles and gets the 
error "Var 'x' read without having been written to". Johnny jerks his knee 
and blindly throws in "=0". "Bad Johnny. Bad, bad Johnny." scolds Kenneth 
Compiler Writer quite justifiably. Two seconds later, Kenneth changes the 
compiler to blindly throw in "=0". Kenneth is a big hypocrite. Johnny smiles 
because his bad behavior is now automated.
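
In D terms, Kenneth's automation looks like this (a minimal sketch, but to 
the best of my knowledge valid D reflecting its actual defaults):

    import std.stdio;

    void main()
    {
        int x;        // the compiler effectively writes "= 0" for you
        writeln(x);   // prints 0, no complaint
        // (integers default to 0, floats to NaN, char to 0xFF)
    }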

Notice that this story has no relevance to C's way of handling (or rather 
ignoring) uninitialized vars. It's D-style vs C#-style, not D-style vs 
C-style. So can you please address how the current D-style is supposed to be 
better than *C#-style*, rather than once again telling us it's better than 
C-style? We know it's better than C-style. That's not the issue.

> Also, the optimizer will remove nearly all of the default initializers if 
> they are dead assignments.
>
> Anyhow, I think this issue was beaten to death in the previous thread on 
> null dereference. I don't wish to divert this thread into rediscussing it, 
> but rather stick with what other kinds of bug-detecting data flow analyses 
> there are?
>
>
>>> 4. dead assignments (assignment of a value to a variable that is never 
>>> subsequently used)
>>> 4. Dead assignments are automatically detected and removed. I'm not 
>>> convinced this should be reported, as it can legitimately happen when 
>>> generating source code. Generating false positives annoys the heck out of 
>>> users.
>>>
>>
>> I'll agree with you here. But it might be nice to have an option to just 
>> simply report them anyway for when the programmer wants to see if there's 
>> any of these around that he can clean up.
>
> I congenitally dislike optional warnings, as I've pontificated at length 
> about here before <g>. The problem is it makes for a wishy-washy 
> definition of the language, and muddies what is legal versus illegal.

The way you've implemented them (as errors rather than real warnings), yes, 
it absolutely does that, as I've pontificated about at length <g>.

> D needs to advance the state of the art with clear thinking about what is 
> legal and what isn't. Warnings and false positives are failures of 
> language design.
>

If you define a failure of language design to be anything that isn't total 
perfection, then sure.

Warnings and false positives are measures taken to mitigate the problems 
caused by the *inevitable* imperfections. Many of the imperfections in 
language design are like cliffs: they're potentially hazardous, and you 
can't always *realistically* eliminate them. So when we can't realistically 
eliminate them, we swallow our pride and put up a railing: a warning, and 
even the occasional false positive.

In fact, D has *already* conceded to being imperfect and accordingly added 
false positives and... well... something warning-like (but more problematic):

- There *are* situations where an implicit narrowing cast would be perfectly 
OK. But does D reject them all anyway, generating false positives for the 
sake of catching the narrowing casts that aren't OK? Absolutely. And in 
those cases it forces the programmer to cast... but what does 
"cast(byte)someInt" really do if not your "Evil #2" below?

- No return at end of function. It should be pretty clear why a non-void 
function with no return is potentially hazardous. It's also clear that there 
are times when it would happen to work out fine, for instance if "0" or 
something were simply assumed. But, for damn good reason, a warning got 
tossed in, along with its false positives. Would it have been better to just 
make it an error? Maybe. But (probably because of the false positives) we 
couldn't agree on that, so what was deemed the best solution? A warning.
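
Here's the false-positive flavor of that warning (a sketch; I believe DMD's 
-w switch flags it something like this):

    int sign(int x)
    {
        if (x > 0)
            return 1;
        if (x <= 0)
            return -1;
        // Every int takes one of the two branches above, but the
        // compiler can't prove that, so it warns that control may
        // reach the end of this non-void function without a return.
    }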

And OK, maybe you do consider those things evil. But what else are you 
going to do? Let people just fall off the ledge? No. You concede that it's 
not perfect and accept the warnings and false-positives, as you've already 
done.

Or maybe there *is* a better solution to all those cases. OK, so what do you 
do in the meantime before it's implemented? Let people fall off the ledge? 
No. You concede that it's not perfect yet and accept the warning and 
false-positives, as you've already done.

But what you *don't* do in either situation is let people trip off the ledge 
and excuse it with "well, yeah, but a railing would indicate it's not a 
perfect park".

Exceptions are there to help handle runtime errors. But if we didn't write 
programs that had error conditions, we wouldn't need exceptions. So 
exceptions must be indicative of failed program design; let's consider 
exceptions bad and make a fuss every time someone on our project wants a new 
one created.

Trying to minimize warnings and false positives because they are indicative 
of language design flaws is like trying to minimize safety features on a car 
because they're indicative of a potential for danger, or, more starkly, 
like getting rid of all the doctors because they're indicative of illness.


I think part of the problem may be that the ideals of safety, eliminating 
false positives, and a pure "allowed/disallowed" separation have a natural 
tendency to be at odds with each other. Good language design takes 
non-problematic things and allows them, and takes problematic things and 
disallows them. But the problem is, there are very few things, in real life 
or in code, that either always result in a problem or never result in a 
problem. And, even worse, there's a lot that lives somewhere in between. So, 
with only precious few exceptions (if any even exist), every time you allow 
something, you're inherently creating a danger, and every time you disallow 
something, you're inherently creating a false positive. And if you don't do 
either, you get wishy-washiness.

You can't win a game like that, and it's a mistake to expect to. The best 
that can be done is to make sure your decisions match how you weigh those 
values (safety, convenience, uniformity). And personally, I weigh safety the 
highest, since a lack of it tends to have the worst consequences.

> I generally regard as evil:
>
> 1. bugs that have erratic, random, irreproducible symptoms
>

No argument here :)

> 2. source code that doesn't have an obvious purpose (this includes code 
> inserted solely to suppress a false positive or warning)
>

I don't see what's so evil about that, other than that it's connected to 
false positives and warnings, which you seem to already regard as evil.

> I regard as undesirable:
>
> 3. wishy-washy warnings
>
> 4. overzealous compiler messages that are more akin to nagging than 
> finding actual bugs

If you accept the idea of a compiler (like DMD) having rudimentary built-in 
optional versions of normally separate tools like profiling, unit testing, 
doc generation, etc., and you accept that lint tools are perfectly fine to 
use (as I think you do, or am I mistaken?), then I don't see what would make 
lint-style analysis an exception to the "built-ins are OK" attitude 
(especially since a separate tool would require a lot of redundant 
parsing/analysis).
