Documented unittests & code coverage
Atila Neves via Digitalmars-d
digitalmars-d at puremagic.com
Thu Aug 4 12:04:19 PDT 2016
On Thursday, 4 August 2016 at 10:24:39 UTC, Walter Bright wrote:
> On 8/4/2016 1:13 AM, Atila Neves wrote:
>> On Thursday, 28 July 2016 at 23:14:42 UTC, Walter Bright wrote:
>>> On 7/28/2016 3:15 AM, Johannes Pfau wrote:
>>>> And as a philosophical question: Is code coverage in unittests
>>>> even a meaningful measurement?
>>>
>>> Yes. I've read all the arguments against code coverage testing.
>>> But in my usage of it for 30 years, it has been a dramatic and
>>> unqualified success in improving the reliability of shipping code.
>>
>> Have you read this?
>>
>> http://www.linozemtseva.com/research/2014/icse/coverage/
>
> I've seen the reddit discussion of it. I don't really
> understand from reading the paper how they arrived at their
> test suites, but I suspect that may have a lot to do with the
> poor correlations they produced.
I think I read the paper around a year ago, so my memory is fuzzy.
From what I remember, they analysed existing test suites. What I do
remember is coming away with the impression that it was done well.
> Unittests have uncovered lots of bugs for me, and code that was
> unittested had far, far fewer bugs showing up after release.
> <snip>
No argument there; as far as I'm concerned, unit tests = good
thing (TM).
I think measuring unit test code coverage is a good idea, but only
so the results can be looked at to find lines that really should
have been covered but weren't. What I take issue with is two things:
1. Code coverage metric targets (especially if the target is 100%).
This leads to inane behaviours such as "testing" a print function
(which itself was only used in testing) just to meet the target; see
the sketch right after this list. It's busywork that accomplishes
nothing.
2. Using code coverage numbers as a measure of unit test quality.
This always seemed obviously wrong to me; I was glad that the
research I linked to confirmed my opinion, and as far as I know
(I'd be glad to be proven wrong), nobody has published anything to
convince me otherwise.
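
To be concrete about point 1, the kind of thing a 100% target tends
to produce looks roughly like this (the names here are made up for
illustration):

// A helper that is only ever called from tests...
void printDiagnostics(string msg)
{
    import std.stdio : writeln;
    writeln("diag: ", msg);
}

// ...and a "test" whose only purpose is to execute those lines so
// the coverage number hits the target. Nothing is checked.
unittest
{
    printDiagnostics("keep the coverage tool happy");
}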
Code coverage, as a measure of test quality, is fundamentally
broken. It measures coupling between the production code and the
tests, which is never a good idea. Consider:
int div(int i, int j) { return i + j; }
unittest { div(3, 2); }
100% coverage, utterly wrong. Fine, no asserts is "cheating":
int div(int i, int j) { return i / j; }
unittest { assert(div(4, 2) == 2); }
100% coverage. No check for division by 0. Oops.
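
And for contrast, a test that actually earns its keep looks exactly
the same to the coverage tool. A rough sketch, using the same
one-line div as above (so the line coverage is identical):

unittest
{
    assert(div(7, 2) == 3);
    assert(div(-7, 2) == -3);  // D integer division truncates toward zero
    foreach (i; 1 .. 20)
        foreach (j; 1 .. 10)
            assert(div(i, j) * j + i % j == i);  // quotient/remainder law
    // Still says nothing about j == 0; coverage can't tell you that
    // case is missing.
}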
These are obviously silly examples, but the main idea is: coverage
doesn't measure the quality of the sentinel values, i.e. the results
the tests actually assert on. Bad tests serve only as sanity checks,
and the only way I've seen so far to make sure the tests themselves
are good is mutation testing.
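
For what it's worth, the idea behind mutation testing can be
sketched by hand; real tools generate the mutants and re-run the
suite automatically, and the function names below are made up for
illustration:

int divOriginal(int i, int j) { return i / j; }
int divMutant(int i, int j) { return i * j; }  // mutant: '/' replaced by '*'

unittest
{
    // The no-assert test from above, parameterised over the
    // implementation under test: it executes code, checks nothing.
    static bool weakTest(int function(int, int) f)
    {
        f(3, 2);
        return true;
    }

    // A test that actually asserts on a result.
    static bool strongTest(int function(int, int) f)
    {
        return f(4, 2) == 2;
    }

    // The weak test can't tell the mutant from the original, so the
    // mutant survives.
    assert(weakTest(&divOriginal) && weakTest(&divMutant));

    // The strong test kills the mutant: 4 * 2 != 2.
    assert(strongTest(&divOriginal));
    assert(!strongTest(&divMutant));
}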
Atila