Summary on unit testing situation

Tue Mar 23 10:25:45 PDT 2010

I have already written about this topic once or twice, but I think summarizing the situation again can't hurt. Feel free to ignore this post.

1) D follows Walter's theory that programmers are often lazy or in a rush, are often not trained to use unit tests (especially if they come from C or C++), and don't like to learn overly complex things. So it's better to put into D the simplest possible means of doing something useful, in this case writing unit tests. I was already "test-infected" before learning D, so I have used unit tests in D from almost day zero, and I have found them very easy to use. There's very little to learn: you just spread some unittest{} blocks through your modules, filled with normal code and asserts, and pass the -unittest argument to the compiler. (Catching expected exceptions and testing expected compile-time errors is less easy: I have written a Throws!() function for the first, and I use is() for the second. I also add a comment that says what I am testing inside each unittest, and every thing gets its own unittest, to keep things a little tidier.) It can't be simpler than this, so I think Walter was right. In the future I hope to see D code in the wild that uses a good amount of unittests (though I think Phobos currently doesn't have enough tests).
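To make point 1 concrete, here is a minimal sketch of that style of testing. The function safeDiv is invented purely for illustration, and assertThrown from Phobos's std.exception stands in for the Throws!() helper mentioned above, whose code is not shown in this post. The last block uses is(typeof(...)) as one possible way to check that an expression does not compile.

```d
// Build and run the tests with: dmd -unittest -run example.d
import std.exception : assertThrown;

/// Hypothetical example function: integer division that rejects a zero divisor.
int safeDiv(int a, int b)
{
    if (b == 0)
        throw new Exception("division by zero");
    return a / b;
}

// Tests the normal results of safeDiv.
unittest
{
    assert(safeDiv(6, 3) == 2);
    assert(safeDiv(7, 2) == 3); // integer division truncates
}

// Tests that a zero divisor throws.
unittest
{
    assertThrown!Exception(safeDiv(1, 0));
}

// Tests an expected compile-time error: a string argument must not compile.
unittest
{
    static assert(!is(typeof(safeDiv("6", 3))));
    static assert(is(typeof(safeDiv(6, 3)) == int));
}

// With -unittest, the unittest blocks run before main is entered.
void main() {}
```

Without -unittest the unittest blocks are not compiled in, so the same file builds as a normal program.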

2) Dynamic languages perform fewer sanity checks on the code, so programmers are trained to write more unit tests. In Python/Ruby you must write unit tests, a good amount of them. In theory, in a statically typed language like D you can avoid some tests because the type system catches some problems for you, saving you the time of writing them. In practice most of the things you have to check in normal D unit tests are not enforced by normal type systems (even one like D2's, which is better than Java's). I have seen that the tests I can leave out of D unit tests (because the type system catches them) are only the very simple ones. All the other, slightly more complex tests must be written in D too, as in Python. So the time saved is not much. In D I write about 2-2.5 lines of tests for every line of code; in Python I write about 2.5-3 lines of tests for every line of code. (But in Python I often use doctests, which are an even faster way to write tests than D unittests.)

3) The problem is that D unit tests are a toy. If you start writing programs composed of many modules you want more flexibility. In past posts I have listed some of the important things missing from D unit testing, and I won't repeat them here; ask me if you want another list. If you take a look at the unit test systems of Java, Python, Ruby or C#, you can see that D unit testing is not enough for professional use. For example, the Python standard library has two different unit test systems (which can be combined), and both are considerably more refined than the D one. And people often use a third, external library that ties things together, like the one called "nose".

How can this situation be solved? There are various possibilities:
I) Remove the built-in unit testing of D, and wait for someone to write an external "professional" unit test system for D. Such an external system may not have nice or clean syntax and semantics.
II) Keep the built-in unit testing of D, but essentially all serious future programmers will ignore it and use an external unit test system. This wastes code in the compiler (and information in the heads of programmers, but not a lot, because the built-in tests are very simple to learn) and it has the disadvantages of solution I too. D newbies will be advised to move away from the built-in unit tests as soon as possible.
III) Keep the built-in unit testing of D, and improve it until it becomes fit for serious usage. This can make the compiler a bit too complex, and Walter already has enough to do with the core of the front-end. Developing and improving a serious unit test system is not too hard, but it's a full-time or almost full-time job. Another drawback is that unit testing is not set in stone: in ten years someone may invent a better way to do it, and at that point it will be hard to change the compiler to support the newer kind of tests.
IV) Keep the built-in unit testing of D, keep it almost as simple as it is now, but somehow add hooks and flexibility that allow external D code to refine it as much as needed, so it can be used in professional situations too. This "external" code can be a Phobos module, or a third-party library written by other people, or it can be born as an external lib and added to Phobos later. This happens often in Python, which is said to have "batteries included", though many of those batteries were not born in the std library. This will increase the complexity of the built-in unit tests, but probably not by much. It can increase the complexity of the compiler a little, but I think this extra complexity (some reflection, maybe) can then be used for other purposes too.
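As one illustration of what a hook for possibility IV could look like, here is a rough sketch that assumes a compiler exposing the unittest blocks of a module through reflection. Later D compilers offer __traits(getUnitTests, ...) and a replaceable test runner in core.runtime, and the sketch relies on both. The module name hooks_sketch, the function twice and the runner runModuleTests are hypothetical names made up for the example; this is not a proposal for the final design.

```d
// Build and run with: dmd -unittest -run hooks_sketch.d
module hooks_sketch;

import core.runtime : Runtime;
import std.stdio : writeln;

// Hypothetical code under test.
int twice(int x) { return 2 * x; }

unittest { assert(twice(2) == 4); }
unittest { assert(twice(-1) == -2); }

// Hypothetical external runner: calls every unittest block of a module
// separately and counts failures instead of stopping at the first one.
size_t runModuleTests(alias mod)()
{
    size_t failures = 0;
    foreach (test; __traits(getUnitTests, mod))
    {
        try
        {
            test();
        }
        catch (Throwable t) // a failed assert throws an AssertError
        {
            ++failures;
            writeln("FAILED: ", t.msg);
        }
    }
    return failures;
}

shared static this()
{
    // Replace the default runner with one that does nothing,
    // so that main is reached and the external runner is in charge.
    Runtime.moduleUnitTester = function bool() { return true; };
}

void main()
{
    writeln(runModuleTests!hooks_sketch(), " failure(s)");
}
```

The built-in unittest blocks stay exactly as simple as before; everything else (reporting, selecting tests, continuing after a failure) lives in ordinary library code that can evolve without touching the compiler.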

If nothing is done, the situation will most likely evolve to outcome II listed above, because the built-in tests are simply not good enough. (The development of Tango to replace the not-good-enough Phobos1 is a clear example of this. If the built-in is not good enough for serious usage AND there's no good way to extend or improve its basic structure, then the community of D programmers is forced to reject it totally and build something better or different. This is what happened with Tango in D1, and it can naturally happen again with unit testing.)

Among those four solutions, the one I like most is IV, because it keeps the work of developing the library out of Walter's busy hands, yet produces something that can have nice enough syntax without making the compiler too complex, and it probably leaves room for future changes in how people do tests. It also allows writing both very simple unit tests for novices or single-module programs, as now, and professional/complex unit tests for harder situations or larger projects.

If you agree with me that the best solution is IV, then those hooks/reflection have to be designed first.

Bye,
bearophile