Postmortem: Template unittests are bad & you shouldn't catch Error
Mathias LANG
geod24 at gmail.com
Thu Oct 22 04:04:26 UTC 2020
Hi everyone,
As some of you may know, my company is developing a blockchain
where the node is written solely in D
(https://github.com/bpfkorea/agora).
One of the reason to pick D was the ease of prototyping and the
C++ compatibility (we need to interface with modern C++ code,
e.g. C++14 and 17).
Since tests matters, we have developed a testing framework which
use(d) `std.concurrency` to simulate a network in memory (1
thread == 1 node), which works *very well*.
However, there was a little gotcha: When an error happens, the
thread would crash, and we would get a timeout in the main
thread, and sometimes would not be able to print the logs. We use
Ocean's Logger, which is derived from Tango's, which allow to
essentially set the output, and in test mode we set it to a large
static array acting as a circular buffer.
One of our developer was hit particularly hard by this while
implementing a new feature, so added a bunch of `writeln`. After
discussion with the team, the solution seemed obvious: Let's use
`assertHandler` !
For those of you not familiar with it, here is `assertHandler`'s
documentation:
https://dlang.org/library/core/exception/assert_handler.html
It essentially allows to define a custom behavior on `assert`. It
is slightly redundant with other facilities to do so, such as
`-checkaction`, which allow to customize the behavior at the
compiler level.
That's when things went bad. Since we wanted a core dump, we
added an `abort` in our handler, expecting that it will only
trigger when we have bugs. And our tests started to fail. And
that's where the rant begins.
We got hit by this (reduced) stack trace:
```
[/usr/local/Cellar/ldc/1.23.0/include/dlang/ldc/std/typecons.d:2979] Assertion thrown during test: Called `get' on null Nullable!int.
----------------
??:? nothrow void
agora.test.Base.testAssertHandler(immutable(char)[], ulong,
immutable(char)[]) [0x107807564]
??:? onAssertErrorMsg [0x10838ea21]
??:? _d_assert_msg [0x10838ee35]
??:? inout pure nothrow ref @property @nogc @safe inout(int)
std.typecons.Nullable!(int).Nullable.get() [0x1079146d5]
??:? inout pure nothrow ref @property @nogc @safe inout(int)
std.typecons.Nullable!(int).Nullable.get_() [0x107911958]
??:? pure nothrow @nogc @safe int
std.typecons.Nullable!(immutable(char)[]).Nullable.__unittest_L2998_C13_1().__dgliteral1() [0x10791193c]
??:? pure void
std.exception.assertThrown!(core.exception.AssertError,
int).assertThrown(lazy int, immutable(char)[], immutable(char)[],
ulong) [0x10791177b]
??:? pure void
std.typecons.Nullable!(immutable(char)[]).Nullable.__unittest_L2998_C13_1() [0x1079116b8]
??:? agora.common.Serializer.__unittest [0x107bf97d8]
??:? void
agora.test.Base.customModuleUnitTester().runTest(agora.test.Base.customModuleUnitTester().ModTest) [0x107807d46]
??:? core.runtime.UnitTestResult
agora.test.Base.customModuleUnitTester() [0x10780712d]
```
The failing test is this:
https://github.com/dlang/phobos/blob/8fe6a2762ae1b8b5ea906f15049a4373afbb45b5/std/typecons.d#L3000-L3015
There's so many things wrong here that I don't even know where to
start.
First, why on god's green earth are we putting unittests with
explicit template instantiation inside a *template* ? `unittest`
inside templates are a terrible "feature", and I've never seen a
correct usage of it, which would be a unittest that relies on the
user-instantiated instance to test that its expectation still
holds.
I'm pretty sure the rationale is "for documentation", and we
should really re-think this, because those unittests are being
run from *every single module that instantiate Nullable*.
Actually, they are being run for *every instantiation of
Nullable*.
Second, as the title of the post says, *do not catch Error*.
Catching `Error` is a sin, a crime, and every catch of `Error`
derivative should have you outraged. It is in direct violation of
the specs. And obviously, it's also incompatible with scheme that
do not throw an `Error`.
The code was introduced 5 years ago
(https://github.com/dlang/phobos/pull/3114) and has been sitting
there since then.
Another thing that contributed to our issue was that we didn't
really knew where this instantiation came from. The module that
we see in the stack trace, Serializer, has *no* reference to
`Nullable`, and doesn't import `typecons`. It does import some
other modules, so I'm conjecturing that the dependencies'
unittests are executed first, but I'll take that up with the LDC
team after I investigate.
In any case, there's two things I really wish we fixed:
- Either ban unittests in templates, deprecate them, or find a
way to make this mistake hard to do;
- Deprecate the ability to catch `Error` in code. If it's not
supposed to be caught, it shouldn't be so easy to do it;
More information about the Digitalmars-d
mailing list