Serious Problems with the Test Suite

Wed Jun 17 23:59:52 UTC 2020

A good test suite should:

1. verify that things that are supposed to work do work

2. when things don't verify, point to where the problem is

The D test suite fails miserably at point 2. The only bright spot is the
autotester, where when one of the tests fails it's quick to find the source of
the problem.

But I cringe every time something else fails, because then I know I'm in for 
hours or even DAYS trying to figure out what and where things went wrong.

For example,

https://github.com/dlang/dmd/pull/11287

has several failures, all of which come with USELESS log files. I have no idea
what went wrong. Some principles for log files:

1. If the log file says ERROR, it should be an ERROR, i.e. the test should fail.
I'm often confronted with log files that list multiple ERRORs, but never mind,
those errors don't keep the test from passing. All benign ERROR messages, all
deprecation messages, all warning messages need to be fixed, so that when the
log file says ERROR, that's why the test failed.

2. The ERROR that causes the test to fail should be the LAST line in the log
file, not 300 lines back.

3. Log files need to contain comment text at each step to SAY WHAT THEY ARE DOING.

4. Makefiles should NEVER, EVER be run in "quiet" mode, for the simple reason 
that one has no idea what it was trying to do when it failed.

5. Test files must either include a URL to the bugzilla issue they fix or have 
some clue in the comments what they are doing.

6. Running tests multi-process makes them go faster, but since the log files 
randomly interleave the output from them, it makes it impossible to figure out 
where the failure is.

7. Any test that fails because of a network error, or other environmental error 
unrelated to what is being tested, should automatically sleep for a minute or 
ten, then try again.

8. Any timeout terminations MUST say which test timed out.

9. Tests should not be Rube Goldberg Machines with layers and layers of
complexity before the actual test is even run. The test harness should be a THIN
layer over the thing being tested.

10. Many tests are UTTERLY UNDOCUMENTED. For example,

https://github.com/dlang/dmd/tree/master/test/unit

What is that? What does it do? Is it one test or many tests? Let's look at:

https://github.com/dlang/dmd/blob/master/test/unit/frontend.d

Not a SINGLE COMMENT in it. What it is, what it does, etc., is all left to the
imagination. This is completely unacceptable for production code, and it is just
as unacceptable for any code accepted into the D repository.

11. Every time we run into "oh, that's just a heisenbug, try re-running the
test", that is a BUG in the test suite and needs to be fixed. Those are gigantic
wastes of time and resources.
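
To make principles 2, 6, 7 and 8 concrete, here is a rough sketch of a thin
runner. It is NOT how the D test suite works today; it is Python rather than D,
and every name in it (run_test, run_all, format_report, is_env_error) is made up
for illustration. Each test's output is captured on its own, so parallel runs
never interleave. A timeout names the test that timed out. A failure the caller
classifies as environmental is retried after a delay. Failures are printed last,
so the final line of the log is an ERROR line.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_test(name, cmd, timeout_s=60.0, retries=0, retry_delay_s=60.0,
             is_env_error=None):
    """Run one test command; return (name, status, log) with its own log."""
    log = []
    for attempt in range(retries + 1):
        # Say what is being done before doing it.
        log.append(f"== {name}: attempt {attempt + 1}: {' '.join(cmd)}")
        try:
            proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                                  stderr=subprocess.STDOUT, text=True,
                                  timeout=timeout_s)
        except subprocess.TimeoutExpired as exc:
            partial = exc.output or ""
            if isinstance(partial, bytes):  # some Python versions give bytes
                partial = partial.decode(errors="replace")
            if partial.strip():
                log.append(partial.rstrip())
            # The timeout message names the test that timed out.
            log.append(f"ERROR: {name}: timed out after {timeout_s}s")
            return name, "timeout", "\n".join(log)
        if proc.stdout.strip():
            log.append(proc.stdout.rstrip())
        if proc.returncode == 0:
            log.append(f"== {name}: passed")
            return name, "pass", "\n".join(log)
        # is_env_error is a hypothetical caller-supplied predicate that
        # recognizes network or other environmental failures in the output.
        env = is_env_error is not None and is_env_error(proc.stdout)
        if env and attempt < retries:
            log.append(f"== {name}: environmental failure, "
                       f"retrying in {retry_delay_s}s")
            time.sleep(retry_delay_s)
            continue
        log.append(f"ERROR: {name}: exit code {proc.returncode}")
        return name, "fail", "\n".join(log)


def run_all(tests, jobs=4, **opts):
    """Run (name, cmd) pairs in parallel; return results, failures last."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = list(pool.map(lambda t: run_test(t[0], t[1], **opts), tests))
    # Stable sort: passing tests first, failing tests at the end of the log.
    results.sort(key=lambda r: r[1] != "pass")
    return results


def format_report(results):
    """Join the per-test logs into one report, one contiguous block each."""
    return "\n".join(log for _name, _status, log in results)


if __name__ == "__main__":
    demo = [
        ("ok", [sys.executable, "-c", "print('hello')"]),
        ("bad", [sys.executable, "-c",
                 "import sys; print('boom'); sys.exit(3)"]),
    ]
    print(format_report(run_all(demo, jobs=2)))
```

The tests still run multi-process, so nothing is given up in speed. The only
change is that output is buffered per test and written out in one piece, and
that the word ERROR is reserved for the line that explains why a test failed.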