Testing D database calls code for regression

H. S. Teoh hsteoh at quickfur.ath.cx
Mon Mar 19 00:56:26 UTC 2018


On Sun, Mar 18, 2018 at 07:51:18PM +0000, aberba via Digitalmars-d-learn wrote:
> On Friday, 16 March 2018 at 21:15:33 UTC, H. S. Teoh wrote:
> > On Fri, Mar 16, 2018 at 08:17:49PM +0000, aberba via Digitalmars-d-learn
> > wrote:
> > > [...]
> > 
> > The usual way I do this is to decouple the code from the real
> > database backend by templatizing the database driver.  Then in my
> > unittest I can instantiate the template with a mock database driver
> > that only implements the bare minimum to run the test.
> > 
> > [...]
> 
> Mocking a fake database can also be a huge pain. Especially when
> things like transactions and prepared statements are involved.

It depends on what your test is looking for.  The idea is that the mock
database only implements a (small!) subset of a real database, basically
just enough for the test to run, and nothing more.  Of course, sometimes
it may not be easy to do this, if the code being tested is very complex.
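
To make that concrete, here's a rough sketch of what I mean by
templatizing the driver (the names fetchData and Db are made up for
illustration):

	// The code under test takes the database type as a template
	// parameter instead of hardcoding a concrete driver.
	string fetchData(Db)(ref Db db) {
		return db.query("SELECT * FROM data");
	}

In production you instantiate it with the real driver; in a unittest,
with a minimal mock (more on that below).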


> Imagine testing your mock for regressions introduced by future extensions.

If you find yourself needing to test your mock database, then you're
doing it wrong.  :-D  It's supposed to help you test your code, not
create more code that itself needs to be tested!

Basically, this kind of testing imposes certain requirements on the way
you write your code. Certain kinds of code are easier to test than
others.  For example, imagine trying to test a complex I/O pipeline
implemented as nested loops. It's basically impossible to test except
as a black box (certain input sets must produce certain output sets).
It's usually impractical for the test to target specific code paths
buried deep inside the loop nest. The only thing you can do is to
hope and pray that your blackbox tests cover enough of the code paths to
ensure things are correct. But you're likely to miss certain exceptional
cases.

But if said I/O pipeline is implemented as a series of range compositions,
for example, then it becomes very easy to test each component of the
range composition. Each component is decoupled from the others, so it's
easy for the unittest to check all code paths. Then it's much easier to
have the confidence that the composed pipeline itself is correct.
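
To illustrate with a contrived sketch (not database-specific), each
stage of a range pipeline can be unittested on its own:

	import std.algorithm : filter;

	// One self-contained pipeline stage, trivially testable in
	// isolation.
	auto nonEmptyLines(R)(R lines) {
		return lines.filter!(l => l.length > 0);
	}

	unittest {
		import std.algorithm : equal;
		assert(nonEmptyLines(["a", "", "b"]).equal(["a", "b"]));
	}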

I/O pipelines are an easy example, but understandably, in real-world
code things are rarely that clean.  So you'll have to find a way of
designing your database code such that it's more easily testable.
Otherwise, it's going to be a challenge: no matter what you do, testing
a function made of loops nested 5 levels deep is going to be very hard.
Similarly, if your database code has very complex interdependencies,
then it's going to be hard to test no matter how you try.

Anyway, on the more practical side of things, depending on what your
test is trying to do, a mock database could be as simple as:

	struct MockDb {
		string prebakedResponse;

		// Handle only the queries this particular test will issue.
		auto query(string sql) {
			if (sql == "SELECT * FROM data")
				return prebakedResponse;
			else if (sql == "UPDATE * SET ... ") {
				prebakedResponse = /* ... */ "";
				return prebakedResponse;
			} else
				assert(0, "Time to rewrite your unittest :-P");
		}
	}

I.e., you literally only need to implement what the test case will
actually invoke. Anything that isn't strictly required is fair game to
just outright ignore.
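
Tying it back to the earlier sketch: a unittest then just instantiates
the code under test with the mock (fetchData again being the
hypothetical function from above):

	unittest {
		auto db = MockDb("row1\nrow2");
		assert(fetchData(db) == "row1\nrow2");
	}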

Also, keep in mind that MockDb can be a completely different thing per
unittest. Trying to use the same mock DB for all unittests will just
end up with you writing your own database engine, which kinda defeats
the purpose. :-P  But the ability to do this depends on how decoupled the
code is.  Code with complex interdependencies will generally give you a
much harder time than more modular, decoupled code.
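
For example, another unittest that only cares about transaction
handling (one of the pain points you mentioned) could use a completely
different, equally tiny mock. A speculative sketch:

	unittest {
		// This mock implements no queries at all; it only records
		// whether a transaction was committed.
		struct TxnMockDb {
			bool committed;
			void begin() {}
			void commit() { committed = true; }
		}

		// Hypothetical code under test that must wrap its work
		// in a transaction:
		static void saveAll(Db)(ref Db db) {
			db.begin();
			// ... actual work elided ...
			db.commit();
		}

		TxnMockDb db;
		saveAll(db);
		assert(db.committed);
	}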


T

-- 
Knowledge is that area of ignorance that we arrange and classify. -- Ambrose Bierce

