Clang static analysis results for dmd

bearophile bearophileHUGS at lycos.com
Thu Jul 28 15:27:05 PDT 2011


Walter:

>The signal to noise ratio of this kind of flow analysis is rather poor. While it does find some legitimate bugs, the rate of false positives is far too high to be a standard part of the language.<

I don't fully understand your answers on this. I am a bit confused, but it's not your fault.

In Clang those tests aren't a standard part of the C or C++ languages. They are extra tests, like a lint tool built into the compiler, and they aren't part of normal compilation (if you use --analyze it doesn't produce a compiled binary, but an HTML report of the test results).

I took a look at a random sample from the first groups of results:

---------------------

Some Dead code, Idempotent operation:

uRegmask3 = ASM_GET_uRegmask(popnd3->usFlags);
Value stored to 'uRegmask3' is never read

ty = ta->ty;
Value stored to 'ty' is never read

---------------------

Some Dead code, dead assignment:

uSizemask3 = ASM_GET_uSizemask(popnd3->usFlags);	
Value stored to 'uSizemask3' is never read

s = retregs & mES;	
Value stored to 's' is never read

---------------------

Some Dead code, Dead increment:

offset += vtblInterfaces->dim * (4 * PTRSIZE);
Value stored to 'offset' is never read

flags |= 1; // already deduced, so don't to toHeadMutable()
Value stored to 'flags' is never read

---------------------

Dead store, Dead initialization:

TY tyto = t->toBasetype()->ty;
Value stored to 'tyto' during its initialization is never read

int aimports_dim = aimports.dim;	
Value stored to 'aimports_dim' during its initialization is never read
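
For readers who want to reproduce this class of report, here is a minimal sketch of the three dead-store flavours quoted above (dead initialization, dead assignment, dead increment). The function names and the low_bits() helper are made up for illustration and are not dmd code.

```cpp
#include <cassert>

// Hypothetical helper, standing in for a macro such as ASM_GET_uRegmask().
static int low_bits(int flags) { return flags & 0xFF; }

// Computes (flags >> 8) + 1, with three stores whose values are never read.
int with_dead_stores(int flags)
{
    assert(flags >= 0);           // keep the shift well-defined on every platform
    int mask = low_bits(flags);   // dead initialization: overwritten before any read
    mask = flags >> 8;
    int result = mask + 1;

    int offset = 0;
    offset += 4;                  // dead increment: 'offset' is never read again
    mask = 0;                     // dead assignment: 'mask' is never read again
    return result;
}

// The same computation with the dead stores removed.
int without_dead_stores(int flags)
{
    assert(flags >= 0);
    return (flags >> 8) + 1;
}
```

Assuming the file is saved as dead.cpp, running clang++ --analyze dead.cpp should produce reports of the same kind as the ones quoted above, though the exact wording depends on the Clang version.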

---------------------

Logic error, Assigned value is garbage or undefined:

Parameters *FuncDeclaration::getParameters(int *pvarargs)
{   Parameters *fparameters;
    int fvarargs;

    if (type)                       // [1] Taking false branch
    {
        assert(type->ty == Tfunction);
        TypeFunction *fdtype = (TypeFunction *)type;
        fparameters = fdtype->parameters;
        fvarargs = fdtype->varargs;
    }
    if (pvarargs)                   // [2] Taking true branch
        *pvarargs = fvarargs;       // [3] Assigned value is garbage or undefined
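
The path the analyzer follows here can be reduced to a few lines. The sketch below uses a hypothetical FuncType struct and made-up function names. It only mirrors the shape of the excerpt and says nothing about whether 'type' can really be null in dmd.

```cpp
#include <cassert>

// Hypothetical stand-in for TypeFunction; only the field used here.
struct FuncType { int varargs; };

// Same shape as the excerpt: 'fvarargs' is assigned only when 'type' is non-null.
// With type == nullptr and pvarargs != nullptr the function copies an
// indeterminate value, which is the path the analyzer reports.
void get_varargs_unsafe(const FuncType *type, int *pvarargs)
{
    int fvarargs;
    if (type)
        fvarargs = type->varargs;
    if (pvarargs)
        *pvarargs = fvarargs;
}

// One possible fix: give the local a defined value on every path.
// Assumption: 0 is an acceptable answer when there is no type.
void get_varargs_safe(const FuncType *type, int *pvarargs)
{
    int fvarargs = 0;
    if (type)
        fvarargs = type->varargs;
    if (pvarargs)
        *pvarargs = fvarargs;
}
```

If the null case truly cannot happen, an assert(type) before the first test is another way to document that, at no cost in release builds.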

---------------------

Logic error, Dereference of undefined pointer value:

STATIC void ivfamelems(register Iv *biv,register elem **pn)
{   register unsigned op;
    register tym_t ty,c2ty;
    register famlist *f;
    register elem *n,*n1,*n2;

    assert(pn);
    n = *pn;
    assert(biv && n);
    op = n->Eoper;
    if (OTunary(op))                    // [1] Taking true branch
    {   ivfamelems(biv,&n->E1);
        n1 = n->E1;
    }
    else if (OTbinary(op))
    {   ivfamelems(biv,&n->E1);
        ivfamelems(biv,&n->E2);         /* LTOR or RTOL order is unimportant */
        n1 = n->E1;
        n2 = n->E2;
    }
    else                                /* else leaf elem */
        return;                         /* which can't be in the family */

    if (op == OPmul || op == OPadd || op == OPmin ||
        op == OPneg || op == OPshl)     // [2] Taking true branch
    {   /* Note that we are wimping out and not considering */
        /* LI variables as part of c1 and c2, but only constants. */

        ty = n->Ety;

        ...

        /* If we have (li + var), swap the leaves. */
        if (op == OPadd && isLI(n1) && n1->Eoper == OPvar && n2->Eoper == OPvar)
                                        // [3] Dereference of undefined pointer value
	
--------------------

2819  targ_ldouble el_toldouble(elem *e)
2820  {   targ_ldouble result;
2821
2822      elem_debug(e);
2823      assert(cnst(e));
2824  #if TX86
2825      switch (tybasic(typemask(e)))
          // [1] 'Default' branch taken. Execution continues on line 2860
      ...
2860      return result;
          // [2] Undefined or garbage value returned to caller
2861  }
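
The same pattern in a self-contained form, as a rough sketch: the Kind enum and both function names are hypothetical, and the "checked" variant is only one of several ways to close the default path.

```cpp
#include <cassert>

// Hypothetical type tags; the real code switches on tybasic(typemask(e)).
enum Kind { KIND_LONG, KIND_HALVES, KIND_OTHER };

// Same shape as the excerpt: no case assigns 'result' for KIND_OTHER,
// so the default path returns an indeterminate value.
long double to_ldouble_unsafe(Kind kind, long raw)
{
    long double result;
    switch (kind)
    {
        case KIND_LONG:   result = raw;        break;
        case KIND_HALVES: result = raw * 0.5L; break;
        default:                               break;
    }
    return result;
}

// One possible fix: make the default path loud in debug builds and
// defined in release builds. Assumption: 0 is a tolerable fallback.
long double to_ldouble_checked(Kind kind, long raw)
{
    long double result = 0.0L;
    switch (kind)
    {
        case KIND_LONG:   result = raw;        break;
        case KIND_HALVES: result = raw * 0.5L; break;
        default:          assert(!"unhandled kind"); break;
    }
    return result;
}
```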

-----------------------

Is Clang correct there, or are those false positives? If it's correct, then I'd like the D compiler to report 100% of the cases I have listed here, even if not one of them is a real bug. In some cases you store a value in a variable even though you know you will not use it (example: in the last iteration of a loop, to keep the code simpler and shorter). But I'd like to know every time I do this. I like to write tidy code.
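
The "last iteration of a loop" case can be sketched like this. The function is a made-up illustration and is not taken from dmd.

```cpp
#include <cassert>

// Returns the n-th Fibonacci number (fib(0) == 0, fib(1) == 1).
unsigned long fib(unsigned n)
{
    unsigned long a = 0, b = 1;
    for (unsigned i = 0; i < n; ++i)
    {
        unsigned long next = a + b;
        a = b;
        b = next;   // on the final iteration this value is stored but never read
    }
    return a;
}
```

Avoiding that final store would typically need an extra test inside the loop, so the simpler code keeps the unused store.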

Returning values that can be undefined is less easy to catch in D, because the language initializes variables, and the default initial value is sometimes what the programmer wants.


>I would agree that adding extra conditionals to the source code will both eliminate the false positives and make the code more readable, but those extra conditionals exact a performance penalty and would not be something a high performance coder would want. I originally did have stuff like this in the optimizer, but removed it because the false positive rate was untenable.<

I don't fully understand what you are saying, but is __assume useful here?
http://msdn.microsoft.com/en-us/library/1b3fsfxw%28VS.80%29.aspx
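
To make the question concrete, here is a minimal sketch of how __assume is usually applied to an "impossible" default branch. ASSUME is a hypothetical wrapper name and size_for_kind() is a made-up function. Whether a hint like this would actually quiet Clang's analyzer is not something the sketch establishes.

```cpp
#include <cassert>

// ASSUME is a hypothetical wrapper name. __assume is the MSVC intrinsic from the
// link above; __builtin_unreachable() is the GCC/Clang builtin used to get a
// similar effect. Elsewhere the macro expands to nothing.
#if defined(_MSC_VER)
#  define ASSUME(cond) __assume(cond)
#elif defined(__GNUC__) || defined(__clang__)
#  define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#  define ASSUME(cond) ((void)0)
#endif

// Hypothetical example: callers promise that kind is 0, 1 or 2.
int size_for_kind(int kind)
{
    assert(kind >= 0 && kind <= 2);   // checks the promise in debug builds
    int size;
    switch (kind)
    {
        case 0: size = 1; break;
        case 1: size = 2; break;
        case 2: size = 4; break;
        default:
            ASSUME(0);   // tells the optimizer this path is never taken
            // Reaching this point at run time is undefined behaviour.
    }
    return size;
}
```

The hint costs nothing at run time, unlike an extra conditional, but it is a promise to the optimizer and not a check: if the assumption is ever false, the program has undefined behaviour.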

Bye,
bearophile


More information about the Digitalmars-d mailing list