Mock Objects

Hamcrest 1.0

An unexpectedly popular feature of jMock has been its library of Constraints. A constraint is a very simple object: it returns a boolean indicating if it matches another object and can describe itself. People have found this very useful. As well as extending JUnit's assert method, jMock constraints have been used for validating user input, filtering streams and collections and describing validity rules for incoming messages. However, people have felt uncomfortable using them in production code because they are part of a testing library.
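As a rough illustration of how small such an object is, here is a minimal sketch in Java. The `Constraint` interface and its `matches` and `describe` methods are made up for this example and are not the actual jMock or Hamcrest API. It only shows the two responsibilities described above, used to filter a collection outside of any test.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical interface for illustration only; not the jMock or Hamcrest API.
interface Constraint<T> {
    boolean matches(T candidate);   // does the candidate satisfy the constraint?
    String describe();              // human-readable description of the constraint
}

// Matches any non-null string that is at least minLength characters long.
final class MinLength implements Constraint<String> {
    private final int minLength;

    MinLength(int minLength) {
        this.minLength = minLength;
    }

    @Override
    public boolean matches(String candidate) {
        return candidate != null && candidate.length() >= minLength;
    }

    @Override
    public String describe() {
        return "a string of at least " + minLength + " characters";
    }
}

final class ConstraintDemo {
    // Keeps only the items that satisfy the constraint, preserving their order.
    static <T> List<T> filter(List<T> items, Constraint<T> constraint) {
        List<T> kept = new ArrayList<>();
        for (T item : items) {
            if (constraint.matches(item)) {
                kept.add(item);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Constraint<String> longEnough = new MinLength(4);
        List<String> words = Arrays.asList("ham", "crest", "matchers");
        System.out.println("Keeping " + longEnough.describe() + ": "
                + filter(words, longEnough));
    }
}
```

Because the constraint can describe itself, the same object can be used both to make the decision and to explain it, for example in a validation message.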

Well, they need worry no more. Joe Walnes has pulled the constraints out of jMock into their own library called Hamcrest. Along the way he has renamed them to Matchers — "Hamcrest" is an anagram of "matchers" — and made them work with Java 5's generic type madness.

Hamcrest is split into two layers: a small, stable core API that defines the Matcher interface and its abstract base class, and a large library of matcher implementations. The intention is that the core API will change slowly to provide a stable base for other projects, while the library will grow quickly as users contribute new matchers. Users will be able to drop new library releases into their codebase to extend the matchers available without breaking any libraries that themselves depend on Hamcrest's core.

The release of Hamcrest 1.0 should enable better integration between projects that provide or need matching. For example, jMock 2 will use Hamcrest matchers instead of its own constraints, and the Get In Line text pattern matcher is also a Hamcrest matcher. This means that jMock 2 users will be able to use the Get In Line embedded domain-specific language to define text patterns in their expectations.

Posted on December 20, 2006

Testing Multithreaded Code with Mock Objects

A question that is frequently asked on the jMock users' mailing list is "How do I use mock objects to test a class that spawns new threads?" The problem is that jMock appears to ignore unexpected calls if they are made on a different thread from the one that runs the test itself. The mock object actually does throw an AssertionFailedError when it receives an unexpected call, but the error terminates the concurrent thread instead of the test runner's thread.

Here's a far-fetched example. We have a guard and an alarm. When the guard gets bored, he shouldn't ring the alarm just for kicks.

public interface Alarm {
	void ring();
}

public void testGuardDoesNotRingTheAlarmWhenHeGetsBored() {
	// No expectations are set on the mock, so any call to ring() is unexpected
	Mock mockAlarm = mock(Alarm.class);
	Guard guard = new Guard( (Alarm)mockAlarm.proxy() );
	
	guard.getBored();
}

Here's an implementation of Guard that should fail the test:

public class Guard {
	private Alarm alarm;
	
	public Guard( Alarm alarm ) {
		this.alarm = alarm;
	}
	
	public void getBored() {
		startRingingTheAlarm();
	}
	
	private void startRingingTheAlarm() {
		Runnable ringAlarmTask = new Runnable() {
			public void run() {
				alarm.ring();
			}
		};
		
		Thread ringAlarmThread = new Thread(ringAlarmTask);
		ringAlarmThread.start();
	}
}

However, the test will pass because the mock Alarm will throw an AssertionFailedError on the ringAlarmThread, not the test runner's thread.
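The effect can be reproduced without jMock. The following sketch, whose class and method names are invented for this example, throws an AssertionError on a spawned thread in the same way a mock would, and checks whether the thread that started it ever sees the error.

```java
// Demonstrates that an error thrown on a spawned thread never reaches
// the thread that started it. All names here are illustrative.
final class LostFailureDemo {

    // Returns true only if the worker's AssertionError propagates to the caller.
    static boolean failureReachesCaller() {
        Thread worker = new Thread(() -> {
            // Stands in for a mock object rejecting an unexpected call.
            throw new AssertionError("unexpected call: ring()");
        });
        // Replace the default handler, which would print the stack trace to stderr.
        worker.setUncaughtExceptionHandler((thread, error) -> { });
        try {
            worker.start();
            worker.join();   // wait until the worker has been terminated by the error
            return false;    // nothing was thrown on the calling thread
        } catch (AssertionError e) {
            return true;     // not reached: the error stayed on the worker thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("failure reached caller: " + failureReachesCaller());
    }
}
```

The error kills the worker thread and is handed to that thread's uncaught exception handler. The calling thread, which plays the role of the test runner, carries on as if nothing had happened.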

The root of the problem is trying to use mock objects for integration testing. Mock objects are used for unit testing in the traditional sense: to test units in isolation from other parts of the system. Threads, however, by their very nature, require some kind of integration test. Concurrency and synchronisation are system-wide concerns and code that creates threads must make use of operating system facilities to do so.

A solution is to separate the object that needs to run tasks from the details of how tasks are run and define an interface between the two. We can test the object that needs to run tasks by mocking the task runner, and test the implementation of the task runner in integration tests.

In our running example, we can introduce a TaskRunner interface to which the Guard passes tasks that it wants to run instead of explicitly creating new threads.

public interface TaskRunner {
	void start( Runnable task );
}

Our test then looks like:

public void testGuardDoesNotRingTheAlarmWhenHeGetsBored() {
	Mock mockAlarm = mock(Alarm.class);
	TaskRunner taskRunner = ... // What goes here?
	Guard guard = new Guard( (Alarm)mockAlarm.proxy(), taskRunner );
	
	guard.getBored();
}

But how should we implement the TaskRunner that we use in our test? If we pass in a TaskRunner that creates a new thread we'll be back where we started and our tests will still, wrongly, appear to pass. We need to run the task in the same thread as the test runner. The easiest way to do that is to run the task immediately it is started, without spawning a thread at all. To do this we could mock the TaskRunner interface and use a custom stub to call back to the task's run method, but that's over-complicating things. It's much easier just to write a class that implements the interface:

public class ImmediateTaskRunner implements TaskRunner {
	public void start( Runnable task ) {
		task.run();
	}
}

Our test then looks like:

public void testGuardDoesNotRingTheAlarmWhenHeGetsBored() {
	Mock mockAlarm = mock(Alarm.class);
	// Tasks run on the test runner's thread, so failures are reported to the test
	TaskRunner taskRunner = new ImmediateTaskRunner();
	Guard guard = new Guard( (Alarm)mockAlarm.proxy(), taskRunner );
	
	guard.getBored();
}

And the implementation of the Guard and the task runner it uses look like this:

public class Guard {
	private Alarm alarm;
	private TaskRunner taskRunner;
	
	public Guard( Alarm alarm, TaskRunner taskRunner ) {
		this.alarm = alarm;
		this.taskRunner = taskRunner;
	}
	
	public void getBored() {
		startRingingTheAlarm();
	}
	
	private void startRingingTheAlarm() {
		Runnable ringAlarmTask = new Runnable() {
			public void run() {
				alarm.ring();
			}
		};
		
		taskRunner.start(ringAlarmTask);
	}
}

public class ConcurrentTaskRunner implements TaskRunner {
	public void start( Runnable task ) {
		(new Thread(task)).start();
	}
}
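To see the whole arrangement run without jMock on the classpath, here is a self-contained sketch. The `StrictAlarm` class is a hand-written stand-in for the mock, invented for this example: it throws an AssertionError on any call. The other types follow the listings above, written in more recent Java syntax and nested in one class to keep the example in a single file.

```java
final class ImmediateRunnerSketch {
    interface Alarm {
        void ring();
    }

    interface TaskRunner {
        void start(Runnable task);
    }

    // Runs each task on the calling thread, as described in the text.
    static final class ImmediateTaskRunner implements TaskRunner {
        @Override
        public void start(Runnable task) {
            task.run();
        }
    }

    // Hypothetical stand-in for the mock Alarm: every call is unexpected.
    static final class StrictAlarm implements Alarm {
        @Override
        public void ring() {
            throw new AssertionError("unexpected call: ring()");
        }
    }

    // The faulty Guard from the text: it rings the alarm when bored.
    static final class Guard {
        private final Alarm alarm;
        private final TaskRunner taskRunner;

        Guard(Alarm alarm, TaskRunner taskRunner) {
            this.alarm = alarm;
            this.taskRunner = taskRunner;
        }

        void getBored() {
            taskRunner.start(alarm::ring);
        }
    }

    // Returns true if the unexpected ring() is reported on the calling thread.
    static boolean failureIsDetected() {
        Guard guard = new Guard(new StrictAlarm(), new ImmediateTaskRunner());
        try {
            guard.getBored();
            return false;   // the faulty Guard slipped through
        } catch (AssertionError expected) {
            return true;    // the failure surfaced where a test runner would see it
        }
    }

    public static void main(String[] args) {
        System.out.println("failure detected: " + failureIsDetected());
    }
}
```

With the immediate runner the error travels up the ordinary call stack, so a test framework would report it as a failure of the test that called `getBored()`.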

Another solution for unit testing would be to run the task in the test runner's thread after the call to guard.getBored() has finished. This might be useful if the Guard class contains try...finally statements that mask test failures caused by the task. Again, we can create a TaskRunner implementation to do this:

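import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class DelayedTaskRunner implements TaskRunner {
	// Tasks that have been started but not yet run
	private List delayedTasks = new ArrayList();

	public void start( Runnable task ) {
		delayedTasks.add(task);
	}
	
	// Runs the queued tasks on the caller's thread, removing each once it has run
	public void runTasks() {
		for (Iterator i = delayedTasks.iterator(); i.hasNext(); ) {
			((Runnable)i.next()).run();
			i.remove();
		}
	}
}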

public void testGuardDoesNotRingTheAlarmWhenHeGetsBored() {
	Mock mockAlarm = mock(Alarm.class);
	DelayedTaskRunner taskRunner = new DelayedTaskRunner();
	Guard guard = new Guard( (Alarm)mockAlarm.proxy(), taskRunner );
	
	guard.getBored();
	taskRunner.runTasks();	
}

Pulling the mechanism for running tasks out of the object that needs tasks to be run can have other benefits beyond easier unit testing. One of the effects that Tim Mackinnon discovered on introducing mock objects into a project was that being forced to test classes in isolation creates "flex points" in the code that, spookily, are exactly where you need them as you evolve the codebase. For example, it would now be trivial to make our Guards use a shared thread pool instead of a ConcurrentTaskRunner.

Update: Doug Lea's concurrency library, which is now part of the Java 1.5 standard library, provides an Executor interface and various implementations.
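As a sketch of how that might look, the Guard below depends on `java.util.concurrent.Executor` instead of a home-grown TaskRunner. The `soundAlarm` method and the ring-counting helpers are assumptions made for this example. A test can pass an executor that runs tasks on the calling thread, while production code can pass a thread pool.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ExecutorGuardSketch {
    interface Alarm {
        void ring();
    }

    static final class Guard {
        private final Alarm alarm;
        private final Executor executor;

        Guard(Alarm alarm, Executor executor) {
            this.alarm = alarm;
            this.executor = executor;
        }

        // Hypothetical method: hands the ringing task to whatever executor was supplied.
        void soundAlarm() {
            executor.execute(alarm::ring);
        }
    }

    // For unit tests: runs every task immediately on the calling thread.
    static final Executor SAME_THREAD = Runnable::run;

    // Counts rings when the task runs on the calling thread.
    static int ringCountOnSameThread() {
        AtomicInteger rings = new AtomicInteger();
        new Guard(rings::incrementAndGet, SAME_THREAD).soundAlarm();
        return rings.get();
    }

    // Counts rings when the task runs on a shared thread pool.
    static int ringCountOnThreadPool() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicInteger rings = new AtomicInteger();
        new Guard(rings::incrementAndGet, pool).soundAlarm();
        pool.shutdown();   // accept no new tasks; the submitted task still runs
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rings.get();
    }

    public static void main(String[] args) {
        System.out.println("same thread rings: " + ringCountOnSameThread());
        System.out.println("thread pool rings: " + ringCountOnThreadPool());
    }
}
```

The Guard is unchanged between the two cases. Only the object passed to its constructor differs, which is the "flex point" at work.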

Posted on October 14, 2004

Refactor to Delegation in preference to Abstract Classes

Minimize, eliminate, delegate, and routinize. Decide what’s important and forget the rest. — Donna N. Douglass.

I've noticed a few times that factoring the difference between subclasses into explicit delegation results in cleaner code that is easier to test, especially when I use mock objects. Perhaps this is a useful Mock Object testing pattern: Replace Duplicated Behaviour with Delegation Through an Interface.

I recently refactored some of the jMock code to hook customisable formatting into the code that generated error messages on test failure. The end result is that InvocationMocker objects — objects that represent expectations or stubs — delegate to a Describer interface when asked to give their description and the concrete describer can be set when the object is constructed. In this way, the objects of the generic framework can be configured to create error messages that reflect the high level API used to compose them.

As part of the refactoring, I found a subclass of InvocationMocker that existed just to override the default describe implementation (to give no description, as it happens). I replaced instantiations of this class with instantiations of the base class with a custom Describer. This is a much cleaner design and much easier to test. The delegation of the descriptions to a Describer object is very easy to unit test with mock objects; the existing implementation and the overridden version were only tested in the acceptance tests and not actually unit tested at all.

If we had originally factored the differences between base class and subclass into a separate object, and defined the interaction with that object with an interface, the Describer design would have been already implemented by the time we needed it. We would also have been better able to test our classes. We should have listened to our tests and pulled out that interface when we originally wrote the subclass of InvocationMocker.
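The shape of the refactoring can be sketched as follows. This is not jMock's code: the `Expectation` class stands in for InvocationMocker, and the `Describer` signature is simplified and assumed for the example. What it shows is a subclass that existed only to override the description being replaced by a describer object passed to the constructor.

```java
final class DescriberSketch {
    // The point of variation, expressed as an interface (simplified, assumed signature).
    interface Describer {
        String describe(String methodName);
    }

    // Stands in for InvocationMocker: it delegates its description
    // to a Describer instead of being subclassed.
    static final class Expectation {
        private final String methodName;
        private final Describer describer;

        Expectation(String methodName, Describer describer) {
            this.methodName = methodName;
            this.describer = describer;
        }

        String description() {
            return describer.describe(methodName);
        }
    }

    // The default behaviour that used to live in the base class.
    static final Describer DEFAULT = methodName -> "expected call to " + methodName + "()";

    // Replaces the subclass that overrode the description to say nothing.
    static final Describer SILENT = methodName -> "";

    public static void main(String[] args) {
        System.out.println("[" + new Expectation("ring", DEFAULT).description() + "]");
        System.out.println("[" + new Expectation("ring", SILENT).description() + "]");
    }
}
```

A unit test for `Expectation` can now pass in a mock or hand-written Describer and check that it is asked for the description, without going through an acceptance test.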

Update: Ivan Moore has posted a good article about refactoring inheritance into composition and delegation.

Posted on February 23, 2004

Using Colourful Language

I realised recently that the dynamic mock object frameworks that I have worked on over the last few years are really an attempt to create embedded domain specific languages for specifying the expected outgoing calls of an object under test. The latest version of the dynamic mock object framework is jMock, written for the Java language. Java does not support embedded languages very well because of all the syntax it requires: round brackets here, curly braces there, square brackets somewhere else, type declarations and casts willy nilly. They all get in the way of what I want to write and read: the identifiers I have defined and literal values. It turns out that fiddling with the syntax colouring in Eclipse helps a great deal. Although the colouring rules are not fine grained enough to be perfect, slight changes in colour scheme have made my tests easier to read.

I would contend that the art of programming is the creation of languages in which you can express your solutions to problems in the application domain; that is, the creation of domain specific languages. Some programming languages, such as Forth, Haskell, LISP or Smalltalk, make it very easy to create domain specific languages because they provide the programmer with less instead of more. Each of those languages has very little syntax, provides very few core abstractions and control structures, but gives the programmer powerful ways to combine existing abstractions into new abstractions. Java takes the opposite approach: it provides quite a few core abstractions with lots of syntax and limits the way that the abstractions can be smoothly combined. But we all know that Java is not the best language in the world. We use it for practical reasons and because the excellent tool support makes us very productive. So I tried a quick experiment to see if I could make my IDE of choice, Eclipse, support my use of a domain specific language.

My experiment involved fiddling with the syntax colouring to see if different colour schemes could make my mock object tests easier to read. Here's a test case displayed with the usual syntax colouring. The identifiers and values get a bit lost among all the brackets:

mock-code-black.gif

If I make the brackets the same colour as the background, the lines that set up expectations are now very descriptive, but the colour scheme is totally impractical for anything else. Furthermore, Eclipse gives the same colours to brackets as to commas, all operators and even decimal points!

mock-code-black.gif

But using 50% grey works nicely. Java's syntactic noise is still visible but now the identifiers and values stand out:

mock-code-black.gif

Posted on February 16, 2004

A serial killer in the space-time continuum

I had to simulate how weather affected the race cars in the motor racing simulation I wrote about earlier. Weather is randomly chosen: there is a probability of rain for each race, each rain shower and dry period has a random duration, the rate of rainfall is randomised, there is a random temperature that affects tyre wear and the rate at which the track dries off, and so on.

I started with a Weather object that had a probability of rain and a random temperature, within some bounds. The Weather object randomised itself by reading random numbers between 0 and 1 from a Random object. This was easy to test: I used a mock random number generator that expected a sequence of two calls to nextDouble, the first of which was used to decide if it was raining and the second to calculate temperature.

I then extended the class to randomise the heaviness of rainfall. This broke my existing tests. Obviously, where the Weather object had asked for two random numbers it was now asking for three random numbers. Easy to fix, but my tests were still failing! Why?

It turned out that while adding random heaviness I had reordered the randomisation of the temperature and probability of rain to make the code easier to read. Lesson 1: I should not have been refactoring while adding functionality. But the goof made me realise something much more interesting. The true problem was not that I had mixed refactoring and implementation, but that the Weather object was making a sequence of calls to the same method of another object without there actually being a true temporal relationship between those calls. Each random number returned by the generator was, as far as the Weather object was concerned, completely independent of any other — that's the very definition of random, after all.

So, I changed the Weather object to hold multiple references to random number generators, one for each aspect of its state that it wanted to randomise. In tests it was passed multiple mock random number generators, so I could control each aspect of its state. In the production system it was passed multiple references to the same random number generator through a convenient constructor.

class Weather {
    public Weather( Random temperatureRandom, 
                    Random isRainingRandom, 
                    Random rainfallRandom )
    {
        ...
    }

    public Weather( Random randomness ) {
        this( randomness, randomness, randomness );
    }
}

But that gave me the uncomfortable feeling between my shoulder-blades that I always get when looking at bad code. All those Random parameters were an obvious code smell: too many parameters indicate that a new object should be factored out. And I could see that every time I added functionality to the Weather class I would need to add more random number generators, making the first constructor signature really awkward and, worse, I would break all existing code that uses that constructor.

So, I pulled the random number generators into a Weather-specific source of randomness with an abstract interface:

interface WeatherRandom {
    double nextTemperature();
    double nextChanceOfRain();
    double nextRainfall();
}

Now I could easily mock that interface in my tests and use a simple adapter in the running system that delegated to a Random object. This had the advantage that when I added functionality to the Weather class I only needed to add methods to the WeatherRandom interface, and didn't need to change any test code that tested existing random behaviour.
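A minimal sketch of that arrangement follows. The WeatherRandom interface is the one shown above. The adapter, the fixed-value test stub, and the cut-down Weather class (including its temperature range and the way it decides whether it is raining) are assumptions made for this example.

```java
import java.util.Random;

final class WeatherRandomSketch {
    interface WeatherRandom {
        double nextTemperature();
        double nextChanceOfRain();
        double nextRainfall();
    }

    // Production adapter: every aspect draws from the same underlying Random.
    static final class RandomAdapter implements WeatherRandom {
        private final Random random;

        RandomAdapter(Random random) {
            this.random = random;
        }

        @Override public double nextTemperature()  { return random.nextDouble(); }
        @Override public double nextChanceOfRain() { return random.nextDouble(); }
        @Override public double nextRainfall()     { return random.nextDouble(); }
    }

    // Test stub: one fixed value per aspect, independent of the order of calls.
    static final class FixedWeatherRandom implements WeatherRandom {
        private final double temperature, chanceOfRain, rainfall;

        FixedWeatherRandom(double temperature, double chanceOfRain, double rainfall) {
            this.temperature = temperature;
            this.chanceOfRain = chanceOfRain;
            this.rainfall = rainfall;
        }

        @Override public double nextTemperature()  { return temperature; }
        @Override public double nextChanceOfRain() { return chanceOfRain; }
        @Override public double nextRainfall()     { return rainfall; }
    }

    // Cut-down stand-in for the Weather class; the 10..30 range is an assumption.
    static final class Weather {
        final double temperature;
        final boolean raining;

        Weather(WeatherRandom random, double probabilityOfRain) {
            // These two lines can be reordered without affecting any test.
            this.temperature = 10.0 + 20.0 * random.nextTemperature();
            this.raining = random.nextChanceOfRain() < probabilityOfRain;
        }
    }

    public static void main(String[] args) {
        Weather fixed = new Weather(new FixedWeatherRandom(0.5, 0.2, 0.0), 0.3);
        System.out.println("fixed: " + fixed.temperature + ", raining=" + fixed.raining);

        Weather real = new Weather(new RandomAdapter(new Random()), 0.3);
        System.out.println("random: " + real.temperature + ", raining=" + real.raining);
    }
}
```

Because each aspect has its own method, a test controls temperature and rain independently, and swapping the two lines in the constructor changes nothing that a test can observe.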

It is often the case that, as in this example, one object will make multiple calls to a single method of another. Such a design can lead to brittle tests if there is no true temporal relationship between those calls. That is, if the order of outgoing calls is not important to the functionality of the object under test, the tests should not place constraints on that order. If they do, changes to the object's internals might break unrelated tests. The solution is to refactor a sequence of calls to the same method into calls to different methods that can occur in any order.

One way to view these refactorings is that I replaced a serial interface with a parallel interface between the Weather object and the random number generator. Another viewpoint is that I used a spatial construct to describe a temporal aspect of my program's behaviour. The relationship between time and space in programming is something that James Coplien has written much about [1,2,3].

Posted on November 25, 2003

The value of factories

I will not reason and compare: my business is to create. — William Blake

At Geek Night, Joe Walnes, Steve Freeman and I tried to come up with an example to use in live demos of using test-first programming with mock objects to drive iterative, top-down design. Our example was a fruit shopping application where the user could ask a directory to find the shop with a fruit at the lowest price, the nearest shop with a fruit at an affordable price, and so on, and then buy the fruit from the shop that was found. Our demo would be to pair-program the search functionality in front of a live audience.

Our design evolved well. The ShopFinder would search a list of shops by calling through an interface, ShopList. The ShopFinder would search the ShopList using an Internal Iterator as familiar to users of Ruby, Smalltalk or functional languages: the ShopFinder would create a "collector" object and pass it to the forEach method of the ShopList, which would call back to the collector and allow it to ask each shop for an offer on the fruit the user wanted.
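The post does not show the demo code, so the following is only a guess at the shape of that design. All names and signatures other than ShopFinder, ShopList and forEach are invented, and prices are plain integers for simplicity. It shows the internal iterator: the ShopFinder creates a collector, hands it to the ShopList, and the list calls back once per shop.

```java
import java.util.Arrays;
import java.util.List;

final class InternalIteratorSketch {
    interface Shop {
        String name();
        int priceOf(String fruit);   // assumes every shop stocks the fruit
    }

    // The callback that the ShopList invokes for each shop (hypothetical name).
    interface ShopCollector {
        void visit(Shop shop);
    }

    interface ShopList {
        void forEach(ShopCollector collector);
    }

    // Remembers the shop offering the lowest price seen so far.
    static final class CheapestShopCollector implements ShopCollector {
        private final String fruit;
        private Shop cheapest;   // null until the first shop has been visited

        CheapestShopCollector(String fruit) {
            this.fruit = fruit;
        }

        @Override
        public void visit(Shop shop) {
            if (cheapest == null || shop.priceOf(fruit) < cheapest.priceOf(fruit)) {
                cheapest = shop;
            }
        }

        Shop result() {
            return cheapest;
        }
    }

    static final class ShopFinder {
        private final ShopList shops;

        ShopFinder(ShopList shops) {
            this.shops = shops;
        }

        Shop findCheapest(String fruit) {
            CheapestShopCollector collector = new CheapestShopCollector(fruit);
            shops.forEach(collector);   // the list drives the iteration
            return collector.result();
        }
    }

    // Creates a shop that sells every fruit at one fixed price.
    static Shop shop(String name, int price) {
        return new Shop() {
            @Override public String name() { return name; }
            @Override public int priceOf(String fruit) { return price; }
        };
    }

    // Finds the name of the cheapest of the given shops for apples.
    static String cheapestAmong(Shop... shops) {
        List<Shop> all = Arrays.asList(shops);
        ShopList list = collector -> all.forEach(collector::visit);
        return new ShopFinder(list).findCheapest("apple").name();
    }

    public static void main(String[] args) {
        System.out.println(cheapestAmong(shop("A", 30), shop("B", 20), shop("C", 25)));
    }
}
```

Note that the ShopFinder never asks the list for its contents. It tells the list what to do with each shop, which is what makes the design a good fit for mock objects.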

I was pretty happy with this design. It follows the "Tell, Don't Ask" principle, which makes testing easy with mock objects. However, it required us to introduce an advanced feature of the jMock API too early in the demo — specifically, how to mock the side effect of the forEach method of the ShopList. So we abandoned that approach and instead tried others that were not very good, eventually giving up due to post-workday tiredness and general brain fade.

Our mistake was to hide the creation of a collector object within the ShopFinder that used it. This meant that we couldn't intercept the creation of the collector and replace it with a mock. By replacing a real collector with a mock we could have tested that the ShopFinder passed it to the ShopList by setting expectations on the ShopList and returned data to the ShopFinder by stubbing the collector instead of making the mock ShopList call back to the collector it received.

How could we have replaced a real collector with a mock? By giving the ShopFinder a factory with which it could create collectors and mocking the factory to return the mock collector(s) we needed in our tests. Even ignoring the needs of our demo, this change would have made our tests more readable. The factory would also act as a "flex point" which, experience has taught me, will make it easier to evolve the code in the future.
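Here is a sketch of what that could look like, using hand-written test doubles in place of jMock mocks so that the example stands alone. The `CollectorFactory` interface and every signature below are hypothetical. The check confirms that the ShopFinder passes the factory's collector to the ShopList and returns whatever the collector reports, without the list ever calling back.

```java
final class CollectorFactorySketch {
    interface Shop {
        String name();
    }

    interface ShopCollector {
        void visit(Shop shop);
        Shop result();
    }

    interface ShopList {
        void forEach(ShopCollector collector);
    }

    // The factory that lets a test substitute its own collector (hypothetical).
    interface CollectorFactory {
        ShopCollector newCollector(String fruit);
    }

    static final class ShopFinder {
        private final ShopList shops;
        private final CollectorFactory collectors;

        ShopFinder(ShopList shops, CollectorFactory collectors) {
            this.shops = shops;
            this.collectors = collectors;
        }

        Shop find(String fruit) {
            ShopCollector collector = collectors.newCollector(fruit);
            shops.forEach(collector);
            return collector.result();
        }
    }

    // Returns true if the finder hands the factory's collector to the list
    // and returns the collector's result.
    static boolean finderUsesCollectorFromFactory() {
        Shop cornerShop = () -> "Corner Shop";

        // Stub collector: returns canned data and expects no callbacks.
        ShopCollector stubCollector = new ShopCollector() {
            @Override
            public void visit(Shop shop) {
                throw new AssertionError("visit() is not expected in this test");
            }

            @Override
            public Shop result() {
                return cornerShop;
            }
        };

        // Recording list: remembers which collector it was given.
        ShopCollector[] received = new ShopCollector[1];
        ShopList recordingList = collector -> received[0] = collector;

        ShopFinder finder = new ShopFinder(recordingList, fruit -> stubCollector);
        Shop found = finder.find("apple");

        return received[0] == stubCollector && found == cornerShop;
    }

    public static void main(String[] args) {
        System.out.println("finder uses factory collector: " + finderUsesCollectorFromFactory());
    }
}
```

The test compares references, which is only possible because the factory lets it supply the collector instance itself.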

But it is very common for one object to create another. So common that one cannot practically replace every object instantiation with a call to a factory. Which object instantiations should be replaced by factories?

My initial conclusion is that value objects can be instantiated directly without causing problems because a test doesn't care about the value of a reference to a value object: the identity of a value object is defined by its state, not its reference. Neither does a test need to mock a value object because it doesn't care how it changes over time; after all, value objects should be immutable. Behavioural objects (a.k.a. reference objects) are different: a test case needs to compare references to test that the object is passed around correctly, and a test needs to mock the object in order to test that it is modified correctly while being passed around. So direct instantiations of behavioural objects should be replaced by the use of a factory.

I have no idea whether this is a useful, general design rule. And I don't really care! If my code needs a factory, the TDD process with mock objects soon drives that factory into existence, because ignoring the need for a factory makes testing too painful. If my code doesn't need a factory, the issue doesn't come up. TDD seems like magic sometimes; it's rather scary.

Posted on November 24, 2003