Tactics for automated testing

Published by marco on

In the article Prefer test-doubles over mocking frameworks, Steve Dunn writes,

“This is testing implementation and not behaviour. Your SUT called something and there is likely an observable side-effect of that. Test the side-effect and not that a particular method was called. If the code is refactored (e.g. you change the implementation but not the behaviour), then your test that checked that a method was called will likely break, but your test that tested the behaviour should remain unchanged and should still pass.”

I think we have to be more careful here. Sometimes you want to test the implementation, no? If you look at the simplest test double that he’s written in the article, shown below, you can see that there is an implicit assumption that would have to be tested: that is, that the Get method in the test-double accurately represents the actual implementation.

This is the interface that the test double implements.

public interface IProductRepository
{
    void Store(Product product);
    Product Get(int id);
}

This is the test using the test double:

[Fact]
public void Using_test_doubles()
{
    var repo = new InMemoryProductRepository();

    var sut = new ProductService(repo);

    sut.OnboardNewProduct(123, "Product 123");

    repo.DidStore(123).Should().BeTrue();
}

Note that the test calls a test-double-only method called DidStore(), which is assumed to have been implemented as expected. A naive implementation would just return true. Since this is a test double, there are no tests verifying that it doesn’t always return true. Shouldn’t the test instead first verify that the product is not yet stored (i.e., that repo.Get(123) returns null), then call OnboardNewProduct(123, …), and then call repo.Get(123) again to verify that it now returns the product?

The following is the implementation of the test-double.

public class InMemoryProductRepository : IProductRepository
{
    private readonly List<Product> _products = new();

    public void Store(Product product) => _products.Add(product);
    public Product Get(int id) => _products.FirstOrDefault(p => p.Id == id);

    // This is not part of the interface, but is useful for testing
    public bool DidStore(int id) => Get(id) is not null;
}

If you leave the test as formulated, there is literally no guarantee that anything changed at all. The author is simply assuming that Store adds a product because he can see that it does.
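Here is a minimal sketch of the before-and-after test I have in mind. The article shows neither Product nor ProductService, so both are hypothetical here: I assume that Product is a simple record and that OnboardNewProduct() builds a product and passes it to Store(). The interface and the test double are repeated from the article (without DidStore()) so that the sketch compiles on its own, and I use plain xUnit assertions rather than FluentAssertions.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Linq;
using Xunit;

// Hypothetical shape: the article does not show the Product type.
public record Product(int Id, string Name);

public interface IProductRepository
{
    void Store(Product product);
    Product? Get(int id);
}

// The test double from the article, minus the test-only DidStore() helper.
public class InMemoryProductRepository : IProductRepository
{
    private readonly List<Product> _products = new();

    public void Store(Product product) => _products.Add(product);
    public Product? Get(int id) => _products.FirstOrDefault(p => p.Id == id);
}

// Hypothetical: assumes that onboarding simply builds a product and stores it.
public class ProductService
{
    private readonly IProductRepository _repo;

    public ProductService(IProductRepository repo) => _repo = repo;

    public void OnboardNewProduct(int id, string name) =>
        _repo.Store(new Product(id, name));
}

public class OnboardingTests
{
    [Fact]
    public void Onboarding_changes_what_Get_returns()
    {
        var repo = new InMemoryProductRepository();
        var sut = new ProductService(repo);

        // Precondition: the product is not there yet, so a repository
        // that always claims success would fail right here.
        Assert.Null(repo.Get(123));

        sut.OnboardNewProduct(123, "Product 123");

        // Postcondition: the same query now returns the onboarded product.
        var stored = repo.Get(123);
        Assert.NotNull(stored);
        Assert.Equal("Product 123", stored!.Name);
    }
}
```

The precondition is the important part: it shows that the observable state actually changed. The test still depends on Get() being implemented correctly in the test double, though.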

The author wasn’t quite clear why his mock-based implementation isn’t good, though. He proposed the code below.

[Fact]
public void Using_mocks()
{
    var repo = Substitute.For<IProductRepository>();
    var sut = new ProductService(repo);

    sut.OnboardNewProduct(123, "Product 123");

    repo.Received().Store(Arg.Is<Product>(p => p.Id == 123));
}

Do you see how he checked whether the Store() method had been called rather than testing whether Get(123) returns the product? He had to do that because the mock’s Get() would never return that product unless the author had also set it up to do so when called with 123. Why wouldn’t he do this? Because he’d then have just been testing the mock. However, if you look closely at the previous example, the author is also just testing his test-double.

I have another problem with the statement above: sometimes I very much want to verify that a specific method is being called. I’m not trying to verify the behavior of the test-double; I’m trying to verify the behavior of the actual implementation.

If, for whatever reason, I can’t use the actual implementation, then I want to verify that a certain method was called because, for example, I know that the method calls a system API directly. That is, I trust that the system API will do what it says on the tin. I’m able to verify manually that the parameters to the method are passed on to the API faithfully. I can’t call the API in the test suite (maybe it’s a call to the Windows Registry or maybe it’s accessing a USB stick that doesn’t exist in CI), but I can get as close as possible. If something still goes wrong, then I know that I just have to examine the one line of code in the actual implementation. In that way, I’ve verified a fact about the system that means something.

This comes up often enough in more complex component graphs, where you’ve had a bug in which, under certain circumstances, a certain notification is not sent. In that case, you might be unable to verify that the message arrives (as we do by testing Get(123) above) because the actual message would go through an online proxy like Apple’s push-notification service and would end up on a mobile device somewhere, and maybe you don’t want to build the testing infrastructure that mocks a receiving device that you can check. It wouldn’t help you because you’d just be testing the test-double implementation anyway.

Instead, you would trigger a high-level API that, eventually, bubbles through several layers until the notifier is triggered with a certain message. In that case, an efficient and effective test would be to test that the INotifier.Send() method was called with the expected parameters.
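A rough sketch of such a test, using NSubstitute as in the author’s mock example. Only the name INotifier.Send() comes from this post. Its parameters, the OrderApi and OrderWorkflow layers, and the message text are all hypothetical stand-ins for whatever the real component graph looks like.

```csharp
using NSubstitute;
using Xunit;

// Hypothetical signature: only the name INotifier.Send() appears in the post.
public interface INotifier
{
    void Send(string recipient, string message);
}

// Hypothetical inner layer that decides whether to notify.
public class OrderWorkflow
{
    private readonly INotifier _notifier;

    public OrderWorkflow(INotifier notifier) => _notifier = notifier;

    public void Complete(string customer, int orderId)
    {
        if (orderId <= 0)
        {
            return; // invalid orders produce no notification
        }

        _notifier.Send(customer, $"Order {orderId} is complete");
    }
}

// Hypothetical high-level API that the test triggers.
public class OrderApi
{
    private readonly OrderWorkflow _workflow;

    public OrderApi(OrderWorkflow workflow) => _workflow = workflow;

    public void CompleteOrder(string customer, int orderId) =>
        _workflow.Complete(customer, orderId);
}

public class NotificationTests
{
    [Fact]
    public void Completing_an_order_sends_exactly_one_notification()
    {
        var notifier = Substitute.For<INotifier>();
        var api = new OrderApi(new OrderWorkflow(notifier));

        api.CompleteOrder("alice", 42);

        // The call bubbled through both layers with the expected parameters.
        notifier.Received(1).Send("alice", "Order 42 is complete");
    }

    [Fact]
    public void An_invalid_order_sends_nothing()
    {
        var notifier = Substitute.For<INotifier>();
        var api = new OrderApi(new OrderWorkflow(notifier));

        api.CompleteOrder("alice", 0);

        notifier.DidNotReceive().Send(Arg.Any<string>(), Arg.Any<string>());
    }
}
```

Nothing here checks that a message arrives anywhere. The tests only pin down the last piece of my own code before the call leaves the system, which is the bug I was chasing in this scenario.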

Even in the author’s example, there is presumably an external data store of some sort that is being mocked. I’m not actually interested in testing whether that data store interprets my command to store correctly. I’m going to assume that it does because it’s not my code.[1] What I want to confirm is that I sent the command to the store. That is, I want to verify that a particular method was called with particular parameters. Perhaps I’ll use a snapshot test to verify that the generated SQL is correct. Then I don’t have to actually run the SQL against the database every time.

In the author’s case, he’s calling a method on one interface and verifying that a property of another interface has changed. He is testing the interplay of those two components. That he used a test double doesn’t help at all: the test means something only because the test double was written correctly. And there are no tests to verify that the test double actually does what he assumes it is doing.
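One way to close that gap would be to test the test double itself. The following is only a sketch of a shared "contract" test class, which is my own suggestion and not something the article proposes. Product is again a hypothetical record, the interface and test double are repeated from the article so that the sketch compiles on its own, and the three contract rules are my assumptions about what any IProductRepository should do.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Linq;
using Xunit;

// Hypothetical shape: the article does not show the Product type.
public record Product(int Id, string Name);

public interface IProductRepository
{
    void Store(Product product);
    Product? Get(int id);
}

// The test double from the article.
public class InMemoryProductRepository : IProductRepository
{
    private readonly List<Product> _products = new();

    public void Store(Product product) => _products.Add(product);
    public Product? Get(int id) => _products.FirstOrDefault(p => p.Id == id);

    // This is not part of the interface, but is useful for testing
    public bool DidStore(int id) => Get(id) is not null;
}

// Assumed contract for any IProductRepository, real or fake.
public abstract class ProductRepositoryContract
{
    protected abstract IProductRepository CreateRepository();

    [Fact]
    public void Get_returns_null_for_an_unknown_id()
    {
        Assert.Null(CreateRepository().Get(123));
    }

    [Fact]
    public void Get_returns_what_Store_was_given()
    {
        var repo = CreateRepository();
        repo.Store(new Product(123, "Product 123"));

        // Records compare by value, so this checks both Id and Name.
        Assert.Equal(new Product(123, "Product 123"), repo.Get(123));
    }

    [Fact]
    public void Store_does_not_make_other_ids_appear()
    {
        var repo = CreateRepository();
        repo.Store(new Product(123, "Product 123"));

        Assert.Null(repo.Get(456));
    }
}

// Runs the contract against the test double and checks its extra helper.
public class InMemoryProductRepositoryTests : ProductRepositoryContract
{
    protected override IProductRepository CreateRepository() =>
        new InMemoryProductRepository();

    [Fact]
    public void DidStore_is_false_until_the_product_is_stored()
    {
        var repo = new InMemoryProductRepository();
        Assert.False(repo.DidStore(123));

        repo.Store(new Product(123, "Product 123"));
        Assert.True(repo.DidStore(123));
    }
}
```

A second subclass could run the same contract against the actual repository wherever that implementation is available. That is the only thing that would show that the test double "accurately represents the actual implementation".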

While I agree that test-doubles have their place, I think that mocking frameworks can also be very helpful. That’s why I don’t like rules like “test behavior not implementation”. I prefer to consider it a guideline, so that I can remember to write high-level, well-abstracted tests where possible but I can also just test that a certain method on a certain component will be executed.

[1] If that promise is broken, then I will have to reevaluate. I could write a test to verify that the external component works as expected (just in case it breaks again), I could find a more reliable external component, I could fix the current external component, or I could do some combination of these.