Serious Problems with the Test Suite
Walter Bright
newshound2 at digitalmars.com
Wed Jun 17 23:59:52 UTC 2020
A good test suite should:
1. verify that things that are supposed to work do work
2. when things don't verify, point to where the problem is
The D test suite fails miserably at point 2. The only bright spot is the
autotester, where when one of the tests fails it's quick to find the source of the problem.
But I cringe every time something else fails, because then I know I'm in for
hours or even DAYS trying to figure out what and where things went wrong.
For example,
https://github.com/dlang/dmd/pull/11287
has several failures, all of which come with USELESS log files. I have no idea
what went wrong. Some principles for log files:
1. If the log file says ERROR, it should be an ERROR, i.e. the test should fail.
I'm often confronted with log files that list multiple ERRORs, but never mind,
those errors don't cause the test to fail. All benign ERROR messages, all deprecation
messages, all warning messages need to be fixed, so that when the log file says
ERROR, that's why the test failed.
2. The ERROR that causes the test to fail should be the LAST line in the log file,
not 300 lines back.
3. Log files need to contain comment text at each step to SAY WHAT THEY ARE DOING.
4. Makefiles should NEVER, EVER be run in "quiet" mode, for the simple reason
that one has no idea what it was trying to do when it failed.
5. Test files must either include a URL to the bugzilla issue they fix or have
some clue in the comments what they are doing.
6. Running tests multi-process makes them go faster, but since the log files
randomly interleave the output from them, it makes it impossible to figure out
where the failure is.
7. Any test that fails because of a network error, or other environmental error
unrelated to what is being tested, should automatically sleep for a minute or
ten, then try again.
8. Any timeout terminations MUST say which test timed out.
9. Tests should not be Rube Goldberg Machines with layers and layers of
complexity before the actual test is even run. The test harness should be a THIN
layer over the test.
10. Many tests are UTTERLY UNDOCUMENTED. For example,
https://github.com/dlang/dmd/tree/master/test/unit
What is that? What does it do? Is it one test or many tests? Let's look at:
https://github.com/dlang/dmd/blob/master/test/unit/frontend.d
Not a SINGLE COMMENT in it. What it is, what it does, etc., is all left to the
imagination. This is completely unacceptable for production code, and it is also
unacceptable for any code accepted into the D repository.
11. Every time we run into "oh, that's just a heisenbug, try re-running the
test", that is a BUG in the test suite and needs to be fixed. Those are gigantic
wastes of time and resources.
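
To make principles 2, 6, 7 and 8 concrete, here is a minimal sketch of a test runner, written in Python purely for illustration. It is not how the D test suite works. The names run_one and run_all, the exit code 75 as the signal for an environmental failure, and the default timeout and retry delay are all assumptions made for this example. Each test's output is captured separately so parallel runs never interleave, a timeout names the test that timed out, environmental failures are retried after a sleep, and failing tests are printed last so the ERROR line ends the log.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor

# Assumption for this sketch: a test exits with this code when the failure is
# environmental (network down, etc.) rather than a real test failure.
ENV_ERROR_EXIT = 75


def run_one(name, cmd, timeout=60.0, retries=2, retry_delay=60.0):
    """Run one test command; return (name, passed, log) with a per-test log."""
    # Say what is being done, never run quietly.
    log = [f"=== {name}: running {' '.join(cmd)}"]
    for attempt in range(retries + 1):
        try:
            proc = subprocess.run(cmd, capture_output=True, text=True,
                                  timeout=timeout)
        except subprocess.TimeoutExpired:
            # A timeout must say which test timed out.
            log.append(f"ERROR: {name} timed out after {timeout} seconds")
            return name, False, "\n".join(log)
        output = (proc.stdout + proc.stderr).rstrip()
        if output:
            log.append(output)
        if proc.returncode == 0:
            log.append(f"=== {name}: passed")
            return name, True, "\n".join(log)
        if proc.returncode == ENV_ERROR_EXIT and attempt < retries:
            # Environmental failure: sleep, then try again.
            log.append(f"=== {name}: environmental failure, "
                       f"retrying in {retry_delay} seconds")
            time.sleep(retry_delay)
            continue
        # The line that explains the failure is the last line of this log.
        log.append(f"ERROR: {name} failed with exit code {proc.returncode}")
        return name, False, "\n".join(log)


def run_all(tests, jobs=4, **kwargs):
    """Run (name, cmd) pairs in parallel; print each log as one block."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = list(pool.map(lambda t: run_one(t[0], t[1], **kwargs), tests))
    # Passing tests first, failing tests last, so the ERROR ends the output.
    for _, _, log in sorted(results, key=lambda r: r[1], reverse=True):
        print(log)
    return [name for name, passed, _ in results if not passed]
```

Calling run_all with a list of (name, command) pairs returns the names of the failing tests. The tests still run multi-process, but because each log is buffered and printed whole, the speed does not cost any readability.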