Postmortem: Template unittests are bad & you shouldn't catch Error

Mathias LANG geod24 at gmail.com
Thu Oct 22 04:04:26 UTC 2020


Hi everyone,
As some of you may know, my company is developing a blockchain 
where the node is written solely in D 
(https://github.com/bpfkorea/agora).
One of the reason to pick D was the ease of prototyping and the 
C++ compatibility (we need to interface with modern C++ code, 
e.g. C++14 and 17).

Since tests matters, we have developed a testing framework which 
use(d) `std.concurrency` to simulate a network in memory (1 
thread == 1 node), which works *very well*.

However, there was a little gotcha: When an error happens, the 
thread would crash, and we would get a timeout in the main 
thread, and sometimes would not be able to print the logs. We use 
Ocean's Logger, which is derived from Tango's, which allow to 
essentially set the output, and in test mode we set it to a large 
static array acting as a circular buffer.

One of our developer was hit particularly hard by this while 
implementing a new feature, so added a bunch of `writeln`. After 
discussion with the team, the solution seemed obvious: Let's use 
`assertHandler` !

For those of you not familiar with it, here is `assertHandler`'s 
documentation: 
https://dlang.org/library/core/exception/assert_handler.html
It essentially allows to define a custom behavior on `assert`. It 
is slightly redundant with other facilities to do so, such as 
`-checkaction`, which allow to customize the behavior at the 
compiler level.

That's when things went bad. Since we wanted a core dump, we 
added an `abort` in our handler, expecting that it will only 
trigger when we have bugs. And our tests started to fail. And 
that's where the rant begins.

We got hit by this (reduced) stack trace:
```
[/usr/local/Cellar/ldc/1.23.0/include/dlang/ldc/std/typecons.d:2979] Assertion thrown during test: Called `get' on null Nullable!int.
----------------
??:? nothrow void 
agora.test.Base.testAssertHandler(immutable(char)[], ulong, 
immutable(char)[]) [0x107807564]
??:? onAssertErrorMsg [0x10838ea21]
??:? _d_assert_msg [0x10838ee35]
??:? inout pure nothrow ref @property @nogc @safe inout(int) 
std.typecons.Nullable!(int).Nullable.get() [0x1079146d5]
??:? inout pure nothrow ref @property @nogc @safe inout(int) 
std.typecons.Nullable!(int).Nullable.get_() [0x107911958]
??:? pure nothrow @nogc @safe int 
std.typecons.Nullable!(immutable(char)[]).Nullable.__unittest_L2998_C13_1().__dgliteral1() [0x10791193c]
??:? pure void 
std.exception.assertThrown!(core.exception.AssertError, 
int).assertThrown(lazy int, immutable(char)[], immutable(char)[], 
ulong) [0x10791177b]
??:? pure void 
std.typecons.Nullable!(immutable(char)[]).Nullable.__unittest_L2998_C13_1() [0x1079116b8]
??:? agora.common.Serializer.__unittest [0x107bf97d8]
??:? void 
agora.test.Base.customModuleUnitTester().runTest(agora.test.Base.customModuleUnitTester().ModTest) [0x107807d46]
??:? core.runtime.UnitTestResult 
agora.test.Base.customModuleUnitTester() [0x10780712d]
```

The failing test is this: 
https://github.com/dlang/phobos/blob/8fe6a2762ae1b8b5ea906f15049a4373afbb45b5/std/typecons.d#L3000-L3015

There's so many things wrong here that I don't even know where to 
start.

First, why on god's green earth are we putting unittests with 
explicit template instantiation inside a *template* ? `unittest` 
inside templates are a terrible "feature", and I've never seen a 
correct usage of it, which would be a unittest that relies on the 
user-instantiated instance to test that its expectation still 
holds.
I'm pretty sure the rationale is "for documentation", and we 
should really re-think this, because those unittests are being 
run from *every single module that instantiate Nullable*. 
Actually, they are being run for *every instantiation of 
Nullable*.

Second, as the title of the post says, *do not catch Error*. 
Catching `Error` is a sin, a crime, and every catch of `Error` 
derivative should have you outraged. It is in direct violation of 
the specs. And obviously, it's also incompatible with scheme that 
do not throw an `Error`.

The code was introduced 5 years ago 
(https://github.com/dlang/phobos/pull/3114) and has been sitting 
there since then.

Another thing that contributed to our issue was that we didn't 
really knew where this instantiation came from. The module that 
we see in the stack trace, Serializer, has *no* reference to 
`Nullable`, and doesn't import `typecons`. It does import some 
other modules, so I'm conjecturing that the dependencies' 
unittests are executed first, but I'll take that up with the LDC 
team after I investigate.

In any case, there's two things I really wish we fixed:
- Either ban unittests in templates, deprecate them, or find a 
way to make this mistake hard to do;
- Deprecate the ability to catch `Error` in code. If it's not 
supposed to be caught, it shouldn't be so easy to do it;


More information about the Digitalmars-d mailing list