Design and Architecture

Refactoring Interfaces

Many types of DC power plug

In our 2004 OOPSLA paper, Mock Roles, not Objects, Steve, Joe, Tim and I described how we used Mock Objects and TDD to guide the design of object-oriented software. Briefly, we described the process as:

  • Start by writing a unit test for an object's behaviour
  • In the test, mock out interfaces for the services that you find the object requires
  • Define expectations on mock objects to specify how the object communicates with those services, and how those services must behave in response
  • When you have made the object pass the test, apply the same process to write the objects that provide those services it needs.

Unfortunately we neglected to describe a vital part of the process: refactoring. As the system grows we look out for interfaces that define similar ways in which objects collaborate — common patterns of communication between objects. We then collapse interfaces that are semantically equivalent but incompatible into a single type.

We took for granted that when programmers refactor a system they apply as much refactoring effort to the interfaces between the objects as they do to the classes of the objects themselves. However, I have found that this is not the case. For example, Martin Fowler's canonical book of Refactoring patterns does not contain any patterns about refactoring interfaces or the communication protocols between objects.

If you follow the interface discovery process without refactoring the interfaces you discover, you end up with a system containing lots of interfaces, many of which represent very similar concepts in incompatible ways. Objects that should be plug-compatible cannot communicate without lots of awkward little adapter classes. As a result, I have found that teams develop a negative reaction to interfaces or even object-oriented design and end up with a design that is difficult to change because its classes are statically coupled.

When you refactor interfaces into a set of common communication patterns, objects in the system become much more "pluggable". You can then change the behaviour of the system by changing the composition of its objects — adding and removing instances, and plugging different combinations together — rather than writing procedural code. The code that composes objects acts as a declarative definition of how the entire system will behave. You therefore end up working at a much greater level of abstraction and can focus on what you want the system to do, not how it is implemented.
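As a small sketch of what that pluggability buys (the Notifier names below are invented for illustration, not taken from any real system):

```java
// Hypothetical example: behaviour is changed by recomposing small,
// plug-compatible objects rather than by editing procedural code.
interface Notifier {
    void send(String message);
}

class ConsoleNotifier implements Notifier {
    public void send(String message) {
        System.out.println("notify: " + message);
    }
}

class PrefixingNotifier implements Notifier {
    private final String prefix;
    private final Notifier next;

    PrefixingNotifier(String prefix, Notifier next) {
        this.prefix = prefix;
        this.next = next;
    }

    public void send(String message) {
        next.send(prefix + message);
    }
}

class Example {
    public static void main(String... args) {
        // The composition code reads as a declaration of system behaviour:
        Notifier notifier = new PrefixingNotifier("[ALERT] ", new ConsoleNotifier());
        notifier.send("disk space low");
    }
}
```

Changing what the system does — adding logging, fanning out to several notifiers — is then a change to the composition, not to the classes being composed.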

Obviously you need to strike a balance. If you end up with interfaces like the following, you've gone too far!

public interface Thing {
    Object doSomething(Object arg) throws Exception;
}

Two refactoring steps I often apply are:

  • Collapse Interfaces: if two interface definitions are semantically equivalent but incompatible, replace both interfaces with one interface that represents their common semantics.
  • Distinguish Interfaces: define different interfaces so that code that tries to assemble invalid object graphs will not compile.
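To make Distinguish Interfaces concrete, here is a hedged sketch (all the names are hypothetical): two roles with identical method signatures are kept as distinct types, so a mis-wired composition fails to compile rather than misbehaving at runtime.

```java
// Two roles with the same shape, deliberately kept as distinct types.
interface AuditLog {
    void append(String record);
}

interface MessageLog {
    void append(String record);
}

class Auditor {
    private final AuditLog log;

    // Only an AuditLog fits here; passing a MessageLog is a
    // compile-time error, even though the interfaces look alike.
    Auditor(AuditLog log) {
        this.log = log;
    }

    void record(String event) {
        log.append(event);
    }
}
```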
Posted on October 3, 2008 [ Permalink | Comments ]

Exception Handling in Distributed Systems II: Coming Back from the Dead

Nosferatu

Here's yet more blather about exception handling. The last one, I promise! (For now...)

My last post describes how I like to coordinate exception handling inside the components of a distributed system. A component handles the failure of other remote services that it uses — database servers, for example — by sending back appropriate error responses or rolling back distributed transactions. That's all well and good when a component runs within an application server, because the server handles all the messy details of fail-over and reconnection for you. But what about "main apps": plain old Java programs that run from a main method?

It can take a lot of code to correctly handle connection failure and reconnection, exponentially back off connection attempts, clean up long-lived objects that hold onto connections, and hide all the messy technical details away from the business logic behind domain-term interfaces. It's also hard to get all the corner cases right.

It's much easier to just not bother with reconnection at all.

When a main app catches an EnvironmentException it should roll back transactions, send back response codes, or whatever it needs to do and then, instead of trying to reconnect, just die. Launch the app from a supervisor process that restarts it whenever it dies. The Java Service Wrapper does the job very nicely.

Now you don't need to write any reconnection logic at all. The application's start-up code is enough.
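A minimal sketch of such a main app (run() is a stand-in for the real start-up and request-processing code, and EnvironmentException is the checked base class described in the earlier exception-handling post, redefined here so the example stands alone):

```java
// The checked exception that reports a failed dependency in the environment.
class EnvironmentException extends Exception {
    EnvironmentException(String message, Throwable cause) {
        super(message, cause);
    }
}

class MainApp {
    public static void main(String... args) {
        try {
            run();
        }
        catch (EnvironmentException e) {
            // No reconnection logic: log the failure and die. The
            // supervisor process (e.g. the Java Service Wrapper)
            // restarts us, and start-up reconnects from scratch.
            e.printStackTrace();
            System.exit(1);
        }
    }

    static void run() throws EnvironmentException {
        // connect to databases and brokers, then process requests;
        // any connection failure propagates here as EnvironmentException
    }
}
```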

This greatly simplifies writing distributed Java apps. And a simple system is more reliable, easier to secure and easier to change.

Posted on July 15, 2008 [ Permalink | Comments ]

Exception Handling in Distributed Systems

Crashed Arrivals Screen in an Airport. The screen has crashed; it's not a screen of arrivals that have crashed!

I'm on a bit of a roll when it comes to exception handling tips, so here's another technique that's worked well in the last few systems I've had a hand in, this time for coordinating exception handling within a component of a distributed system.

A component in a distributed system receives requests from remote components and reacts by making requests of its own to remote components in its environment that it depends on, before sending back a response. However, because its clients and dependencies are out of its control, it cannot guarantee that the requests it receives are correct or that the services it depends on are available when it must service a request. By "request" I mean either a client-server style request/response interaction or an asynchronous event received from a message broker.

In Java terms, because both bad requests and failed dependencies are out of the component's control, they should be reported by throwing checked exceptions. However, the way they should be handled is significantly different. A bad request should never be retried, but could be logged for manual repair and replay if that makes sense for the system. The failure of a dependency is (hopefully) temporary, and so the request can be retried later.

Apart from RMI, Java frameworks for writing distributed systems don't make this distinction in the exceptions they throw or handle. Therefore, when building distributed systems, among the first things I write are two base exception classes: BadRequestException and EnvironmentException. Depending on the communication protocol, the application will handle these in different ways:

  • HTTP (e.g. Servlets): for a Bad Request Exception, return a 4xx response code; for an Environment Exception, return a 5xx response code.
  • JMS (e.g. Message Driven Beans): for a Bad Request Exception, move the message to a hospital queue and commit the transaction; for an Environment Exception, roll back the transaction, leaving the message on the input queue for later redelivery.

Because frameworks don't make the distinction between Bad Requests and Environment Exceptions, I keep the framework code — servlet or MDB class, for example — as thin as possible, doing little more than delegating the request to an interface that throws BadRequestException or EnvironmentException and handling each kind of error as appropriate.

The client-side code of a synchronous remote call needs to translate the status it receives appropriately. If the returned status indicates that the client made a bad request (e.g. an HTTP 4xx code), it should throw a RuntimeException to indicate that a programming error has been detected. If the status indicates an environment exception (e.g. an HTTP 5xx code), it should throw a checked exception so the compiler ensures that the exception is handled. I usually wrap that logic up in a convenient proxy object.
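Both translations can be sketched like this (the exception class names follow the conventions above; the specific status codes 400 and 503 are just illustrative choices from the 4xx and 5xx families):

```java
// The two base exception classes described above, repeated for completeness.
class BadRequestException extends Exception {
    BadRequestException(String message) { super(message); }
}

class EnvironmentException extends Exception {
    EnvironmentException(String message) { super(message); }
}

class StatusCodes {
    // Server side: map each exception category to an HTTP status family.
    static int statusFor(Exception e) {
        if (e instanceof BadRequestException) return 400;
        if (e instanceof EnvironmentException) return 503;
        return 500;
    }

    // Client side: translate a received status back into the right
    // kind of exception.
    static void checkStatus(int status) throws EnvironmentException {
        if (status >= 500) {
            // temporary failure of the environment: checked, so
            // callers are forced to handle it
            throw new EnvironmentException("server failed with status " + status);
        }
        if (status >= 400) {
            // the client sent a bad request: a programming error,
            // so unchecked
            throw new RuntimeException("bad request, status " + status);
        }
    }
}
```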

Photo by Ian Hughes used under the Creative Commons Attribution license.

Posted on July 8, 2008 [ Permalink | Comments ]

Throw Defect

Defects Sign

While I'm on the subject of exceptions, let me share another exception idiom I've used a lot on recent projects... the Defect exception.

The Java compiler is very strict. It will complain about missing code paths that you, the programmer, know should never be followed at runtime. This is especially common when using APIs that throw checked exceptions to process data that will always be correct unless the programmer has screwed up.

For example, loading and parsing a template from a resource that is compiled into the application should never fail, but throws a checked IOException:

Template template;
try {
    template = new Template(getClass().getResource("data-that-is-compiled-into-the-app.xml"));
}
catch (IOException e) {
    // should never happen
}
...

If an IOException is caught, something is seriously wrong with the application itself, because of something the programmer has done. The application has been built incorrectly, perhaps, or the template is syntactically incorrect. If the application continues running it will fail with confusing errors later on. The error will be difficult to diagnose.

So, the catch block must throw another exception. It can't throw a checked exception. Because it has caught a programming error, it should throw some kind of RuntimeException. RuntimeException itself is a bit vague, and there isn't a subclass of RuntimeException that really fits the bill for this situation. Therefore, I like to define a new exception type to report programmer errors. I first called it StupidProgrammerException, but now, as suggested by Romilly, I call it by the less confrontational name of Defect:

public class Defect extends RuntimeException {
    public Defect(String message) {
        super(message);
    }

    public Defect(String message, Throwable cause) {
        super(message, cause);
    }
}

When the compiler asks me to write code paths that should never happen, I throw a Defect. For example:

Template template;
try {
    template = new Template(getClass().getResource("data-that-is-compiled-into-the-app.xml"));
}
catch (IOException e) {
    throw new Defect("could not load template", e);
}
...

Image by Jonas B, distributed under the Creative Commons Attribution license.

Posted on June 26, 2008 [ Permalink | Comments ]

Generic Throws: Another Little Java Idiom


A seemingly little known fact about Java generics is that you can write generic throws declarations by declaring a type parameter that extends Exception. For example, the following interface defines a generic finder that looks up a value of type T, returns null if it is not found, or may report that the lookup failed completely by throwing an exception of type X:

public interface Finder<T, X extends Exception> {
    @Nullable
    T find(String criteria) throws X;
}

Different implementations can fail in different ways. A finder that performs an HTTP query can fail with an IOException. A finder that queries a database can fail with a SQLException. And so on.
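For instance, here is a sketch of an implementation that reads "key=value" lines from a file, so its checked failure mode is IOException (the Finder interface is repeated so the example stands alone, with the @Nullable annotation omitted; FileFinder itself is invented for illustration):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

interface Finder<T, X extends Exception> {
    T find(String criteria) throws X;
}

// Illustrative implementation: looks up "key=value" lines in a file,
// so find is declared to throw IOException.
class FileFinder implements Finder<String, IOException> {
    private final Path file;

    FileFinder(Path file) {
        this.file = file;
    }

    public String find(String criteria) throws IOException {
        for (String line : Files.readAllLines(file)) {
            if (line.startsWith(criteria + "=")) {
                return line.substring(criteria.length() + 1);
            }
        }
        return null; // not found
    }
}
```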

A Cunning Idiom for Classes That Cannot Fail

But what about queries that cannot fail? You might want to implement the query in-memory, with a HashMap or something. You don't want to declare that find throws a checked exception. Therefore bind X to RuntimeException. The compiler will ignore the throws clause and you don't even need to include it in implementing classes:

public class InMemoryFinder<T> implements Finder<T, RuntimeException> {
    private final Map<String,T> entries;

    public InMemoryFinder(Map<String,T> entries) {
        this.entries = entries;
    }

    @Nullable
    public T find(String criteria) { // No need for a throws clause
        return entries.get(criteria);
    }
}

Generic throws can remove a lot of boilerplate code to pass checked exceptions through interfaces by wrapping them in more abstract checked exceptions. Hopefully one day Java will provide anchored exceptions so we can avoid all this generics jiggery-pokery.

Generic Throws and Polymorphism

Wrapped exceptions are still necessary when an interface must be used polymorphically. A type that declares generic throws is generic but not polymorphic: an interface T<IOException> cannot be used wherever a T<SQLException> is acceptable.

If necessary, an interface with generic throws can be converted by an Adaptor into a version that throws wrapped exceptions and can be used polymorphically.

public class PolymorphicFinder<T, X extends Exception> implements Finder<T, FinderException> {
    private final Finder<T,X> implementation;

    public PolymorphicFinder(Finder<T,X> implementation) {
        this.implementation = implementation;
    }

    @Nullable
    public T find(String criteria) throws FinderException {
        try {
            return implementation.find(criteria);
        }
        catch (RuntimeException e) {
            throw e; // let programming errors propagate unwrapped
        }
        catch (Exception e) {
            // Java does not allow catching the type parameter X directly,
            // so we catch Exception; after the clause above, only X can
            // actually reach this handler.
            throw new FinderException("query failed", e);
        }
    }
}

Wishful Thinking

Imagine if the Iterable and Iterator interfaces were parameterised by exception type:

public interface Iterator<T, X extends Exception> {
    boolean hasNext() throws X;
    T next() throws X;
}

public interface Iterable<T, X extends Exception> {
    Iterator<T,X> iterator();
}

Collections could implement Iterable<T,RuntimeException> and appear no different from the way they are now. However, I/O streams, SQL result sets and other streams of data read from the program's environment could be represented as Iterable objects and the for-each loop could be used to iterate over their contents.

For example, BufferedReader could implement Iterable<String,IOException>, which would let you write:

try {
    BufferedReader reader = new BufferedReader(...);
    for (String line : reader) {
        ... do something with the line
    }
}
catch (IOException e) {
    ... iteration failed
}

Unfortunately, it's probably too late to make this change because it would break backward compatibility.

Photo by Wildcat Dunny, distributed under a Creative Commons license.

Posted on June 25, 2008 [ Permalink | Comments ]

Evolving an API: Programmers are People Too

By popular (e.g. one) demand, here are the notes to the lightning talk I gave at the last Google Open-Source Jam on my experiences in API design.

A User Interface for Programmers

An API is a user interface for programmers. That means you have to develop it like a user-interface: iteratively, adapting to how the users actually use and misuse the interface in real life. In an open-source project we can't afford to run usability experiments, hiding behind one-way mirrors observing how users work with our software. Instead we have to react to explicit user feedback, bug reports and feature requests on our project's mailing lists, forums and issue trackers. Because programmers are passive aggressive and love to moan about things, they'll post negative feedback on their blogs instead of discussing it on the mailing lists. Google Blog Search is an excellent tool for finding out what programmers really think about your API.

But... programmers focus on solutions, not goals, so their feature requests will be worded in terms of how they think they would change the API to support whatever it is they want to do. You have to engage in a lot of arduous back-and-forth to find out what they are trying to achieve, rather than what they think they would change in your code. Often you (and they) discover that their goal can be achieved with the current API but they haven't noticed because of their solution myopia. If it turns out your API does not do what they want, you will be better placed to work out the correct way of extending the API or push back on the request because it would violate the API's conceptual integrity.

Conceptual Guide

An API encapsulates a conceptual model for representing and solving problems in the API's application domain. It guides programmers towards what the API designer considers to be the "right" way of doing things and away from the "wrong" way.

But... although programmers know there are no silver bullets, they love their golden hammers! They've learned your API and now they want to apply it to problems that are unrelated or only vaguely related. They'll submit bug reports, feature requests and patches aplenty as they encounter difficulties. To steal a colleague's joke: "if all you have is a hammer, everything looks like a thumb". You have to be willing to push back on requests that would violate the conceptual integrity of your API, otherwise it turns into an inconsistent mess that cannot communicate any useful understanding about the problem domain. It seems a strange idea to some programmers, but just because they use your API doesn't mean they cannot use another API at the same time.

System of Names

A clear and consistent system of names helps people understand the conceptual model. For an API, the names should be chosen to make the calling code as readable as possible, even if it makes the implementation of the API less readable. It's really useful to discuss naming ideas on the mailing lists before adding new features to find out if what you think is easy to understand actually confuses people.

But... people often don't take notice of names or understand their meaning, especially when they speak a different language from the API implementors. I wrote an example class in Java that was called UnsafeHackConcreteClassImposteriser. I'd have thought that the phrase "unsafe hack" in the name was enough to warn programmers away from using it as anything more than an example, but apparently not. Someone still used it in their project and complained that it didn't work exactly as they wanted.

Extensibility

Users will always want to adapt your (object-oriented) API to their needs.  If you don't define clear plug-in points they will adapt your API by inheriting your classes and overriding internal methods.  That is brittle: it will increase your support overhead or reduce your ability to change the code. Define plug-in interfaces to be focused on one thing, so that they remain stable. For example, the Hamcrest Matcher interface has only two methods and has not significantly changed since it was first defined over 8 years ago. To ensure that code that uses the API is as readable as possible, plug-in points should be seamless: code that plugs user defined objects into the API should not look different than code that uses the built-in objects.

But... if you provide extension points, programmers will want you to maintain their extensions for them. Your users will contribute loads of obscure extensions that make sense in their projects but don't have wide applicability. If you adopt every contributed extension you will end up with an enormous maintenance overhead. Therefore, find out which are the most popular plug-in points and keenest users. Spin off libraries of extensions as external projects and get your keen users to maintain them. E.g. the Hamcrest matchers used to be part of the jMock project but now they're maintained by Neil and Joe. Thanks chaps!

Diagnostics

Error messages are part of the API. Take notice of the errors that confuse users the most (using the mailing lists, issue tracker and blog searches) and improve error messages to reduce confusion. This will reduce the support load in the long term.

But... there is a tension between error diagnostics and generalisation. The more generic and pattern-tastic your implementation is, the harder it is to determine what the user was trying to do that caused an error, and so the harder it is to generate error messages that make sense to the user. Sometimes you have to sacrifice generality for better error messages.

Examples

To save time, make documentation extensible so that you can grow it piecemeal without it obviously being under construction. It's easiest to generate reference documentation with Javadoc.

But... programmers don't always want reference documentation. Javadocs are not useful for people trying to learn an API. Programmers want canned solutions and concrete examples that they can copy into their project and adapt to their needs. The best form of documentation for both you the implementor and your users is HOWTO documents, cookbook recipes and FAQs. The more flexible and extensible the API, the more useful recipes and HOWTOs are because applying the API may require some lateral thinking to understand how the features of the API are used to solve problems.

Posted on February 29, 2008 [ Permalink | Comments ]

Designing an API to work with the IDE

In a recent blog post, Lou Franco describes how Microsoft's LINQ syntax is designed so that the Visual Studio IDE can easily autocomplete identifiers as the programmer writes queries. He asks:

The idea of language features being designed for the IDE leads to the question of whether API's should be designed for the IDE as well.

I think they should. Steve and I designed the jMock API to work well with IDEs. The chained method "Embedded DSL" API of jMock 1 was designed so that the IDE's autocompletion would act as a "wizard", guiding the programmer through the process of creating complex expectations. However, using strings for method names did not work so well with refactoring IDEs.

The new jMock 2 API works well with refactoring IDEs, but does not have as many chained methods and so is not quite as good at using autocompletion as an expectation wizard. It can also confuse some automatic code formatting tools. However, you can autocomplete on the names of mocked methods, which was not possible in jMock 1.

Like any design, an API must trade different constraints off against one another. Overall, I think, the jMock 2 API is an improvement and works better with features that are common in Java IDEs, if not yet in Microsoft's tools.

Posted on November 9, 2007 [ Permalink | Comments ]

Easy Java Bean Event Notification

Unlike C#, Java has no language support for event notification. Instead, Java classes usually follow a set of programming conventions, defined by the Java Beans framework, for defining notifications in terms of listener interfaces and methods for connecting and disconnecting event listeners and event sources.

In the Java Bean conventions, notifications are typed and grouped by "listener" interfaces. Each listener interface extends EventListener and defines a method for each notification that is delivered through that interface. For example, the ContainerListener interface defines two notifications: componentAdded, announced when a component is added to the container, and componentRemoved, announced when a component is removed. Listener methods do not return results.

An object is declared as source of events of some type by implementing two methods, one to add a new listener to the event source and one to remove a listener from the event source. For example, a source of Sheep events would define the two methods:

  • void addSheepListener(SheepListener l) { ... }
  • void removeSheepListener(SheepListener l) { ... }

An object that wants to receive notifications must implement the listener interface that defines the notifications and then be added as a listener to the source of events. The source of events must maintain a collection of registered listeners. To announce an event it must call the appropriate notification method of all of the listeners that have been registered with it.

As you can see, writing a class that announces Java Bean events is not an insignificant effort. What is a one-line declaration in C# requires a lot of boilerplate code in Java. The Java Bean event conventions and Java's static type system do not make it easy to write notification code in a way that can be reused in many classes for different listener types. As a result, notification is not used as often as it could be in Java code. It is usually easier to pass a callback interface to an object's constructor than it is to define Java Bean events on the class.
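To make the cost concrete, here is roughly what the hand-written boilerplate looks like for a single listener type (the SheepListener interface is sketched with Object as the source type, which is an assumption for illustration):

```java
import java.util.EventListener;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

interface SheepListener extends EventListener {
    void sheepDipped(Object sheep);
    void sheepSheared(Object sheep);
}

class SheepByHand {
    private final List<SheepListener> listeners = new CopyOnWriteArrayList<>();

    public void addSheepListener(SheepListener l) { listeners.add(l); }
    public void removeSheepListener(SheepListener l) { listeners.remove(l); }

    public void dip() {
        // ... do the dipping, then notify every registered listener by hand
        for (SheepListener l : listeners) l.sheepDipped(this);
    }

    public void shear() {
        for (SheepListener l : listeners) l.sheepSheared(this);
    }
}
```

And this collection-and-loop code has to be repeated for every event source and every listener type.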

The result is that classes are more tightly coupled than necessary. You must pass a listener to the constructor of the source of events even when there is no need to consume the notifications.

Worse, it becomes impossible to create the object and the listeners at different times and then connect them at a later time. This is a form of what I call "Temporal Coupling". The objects appear to be decoupled because they communicate through interfaces but there is an implicit, hidden coupling between the time at which the two objects can do things. In this case, the listener must exist before you create the source of events.

Luckily I have a trick up my sleeve. I have written a cunning class, Announcer, that uses the reflection API and Java 5 generics to implement the Java Bean event model for any listener interface. Although it still requires more than one line, it makes it much easier to add Bean events to any class. Here's how it's used to make a class a source of Sheep events:

public class Sheep {
    private Announcer<SheepListener> sheepListeners = Announcer.to(SheepListener.class);
    
    ...

    public void addSheepListener(SheepListener  listener) {
        sheepListeners.addListener(listener);
    }

    public void removeSheepListener(SheepListener listener) {
        sheepListeners.removeListener(listener);
    }

    protected void announceSheepDipped() {
        sheepListeners.announce().sheepDipped(this);
    }

    protected void announceSheepSheared() {
        sheepListeners.announce().sheepSheared(this);
    }

    ...
}

Currently the Announcer class is sitting as an example in the jMock project. It's only one class (and a unit test), so I don't think it's worth giving it its own project and JAR file. I usually just copy it into whatever project I'm writing and move it into one of the project's packages. Feel free to do the same.
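For the curious, the core of such a class can be sketched with java.lang.reflect.Proxy — this is an illustrative reimplementation of the idea, not the jMock class itself, and it reuses the SheepListener shape from the example above (here assumed to take an Object source):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.EventListener;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

interface SheepListener extends EventListener {
    void sheepDipped(Object sheep);
    void sheepSheared(Object sheep);
}

class Announcer<T extends EventListener> {
    private final List<T> listeners = new CopyOnWriteArrayList<T>();
    private final T proxy;

    public Announcer(Class<? extends T> listenerType) {
        this.proxy = listenerType.cast(Proxy.newProxyInstance(
            listenerType.getClassLoader(),
            new Class<?>[]{listenerType},
            new InvocationHandler() {
                public Object invoke(Object ignored, Method method, Object[] args)
                        throws Exception {
                    // forward the call to every registered listener
                    for (T listener : listeners) {
                        method.invoke(listener, args);
                    }
                    return null;
                }
            }));
    }

    public static <T extends EventListener> Announcer<T> to(Class<? extends T> listenerType) {
        return new Announcer<T>(listenerType);
    }

    public void addListener(T listener) { listeners.add(listener); }
    public void removeListener(T listener) { listeners.remove(listener); }

    // Calling any listener method on the returned proxy announces
    // that notification to all registered listeners.
    public T announce() { return proxy; }
}
```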

Posted on August 13, 2007 [ Permalink | Comments ]

A Little Java 5 Idiom

In Java 5, an application's main function can be declared with varargs parameters instead of an array of Strings. For example:

public class Main {
    public static void main(String... args) {
        for (String arg : args) System.out.println(arg);
    }
}

Not a particularly astounding revelation: under the hood, varargs parameters are passed to the function as an array so the JVM sees no difference between a varargs main function and one declared to take a String array. However, it's easier to call a varargs main function from other Java code – in end-to-end tests for instance – and the resulting code is easier to read. For example:

private void runProgram() {
    Main.main("hello", "world");
}

instead of

private void runProgram() {
    Main.main(new String[]{"hello", "world"});
}

Because of this I've started declaring all my main functions with varargs parameters.

Posted on August 3, 2007 [ Permalink | Comments ]

Identifiers CAN Say Why

question-mark key

A while ago I wrote about the rule of thumb I use to decide when to write a comment: "Identifiers say what. Comments say why." Since then I've had the pleasure to work on a couple of projects with Ivan Moore and he's shown me that it is possible for identifiers to say "why" as well. His style is to use a few long and descriptive identifiers to explain why particularly odd or complex pieces of code exist in the system. I've started using the style as well and found it to be very useful. For example, here's some code from an imposteriser in jMock 2:

    private <T> Class<?> createProxyClass(Class<T> mockedType, Class<?>... ancilliaryTypes) {
        Enhancer enhancer = new Enhancer();
        enhancer.setClassLoader(mockedType.getClassLoader());
        if (mockedType.isInterface()) {
            enhancer.setSuperclass(Object.class);
            enhancer.setInterfaces(prepend(mockedType, ancilliaryTypes));
        }
        else {
            enhancer.setSuperclass(mockedType);
            enhancer.setInterfaces(ancilliaryTypes);
        }
        enhancer.setCallbackType(InvocationHandler.class);
        enhancer.setNamingPolicy(
            NAMING_POLICY_THAT_ALLOWS_IMPOSTERISATION_OF_CLASSES_IN_SIGNED_JARS);
        enhancer.setUseFactory(true);
        
        Class<?> proxyClass = enhancer.createClass();
        return proxyClass;
    }

It's pretty clear, I think, why that naming policy is being used.

This style works well for enterprise software. Computer software, being precise and logical, is never a good fit for the messy, illogical ways that groups of people work together¹. Elegant code will need some clunky workaround for strange organisational decisions, and it's important to clearly explain in the code why those workarounds exist. Hopefully it will allow programmers to remove the kludge after the next round of reorgs reshuffles the management again.

  1. I'll rant about workflow systems another time!
Posted on April 24, 2007 [ Permalink | Comments ]

DOA (Drawbacks of Acronyms)

More acronyms than you can shake a STIC at.

Recently I've been writing a system that performs reverse geocoding. It reads input from a GPS device over a serial cable and translates that to a human readable identifier of the user's current location. That sounds simple but my pair and I had a great deal of difficulty coming up with good terminology in our domain model. What should we call the type of values that the GPSReceiver object parses from the device's data stream?

Location? That wasn't right because we were converting GPS data into location information. Locations are named regions, such as rooms, buildings, post codes, wards, boroughs, or towns, not points.

For a while we called them GPSCoordinates. But passing around single values of type GPSCoordinates read poorly. The mix of singular and plural didn't work. We considered LatLon, but not for very long: too ugly. We thought about Point, but that was too easy to confuse with 2D points. GeographicalPoint didn't seem right either.

Eventually we came up with the ideal name: Position.

And then we realised: GPS stands for Global Positioning System. If we hadn't been using an acronym in the name of the GPSReceiver class the correct name would have been obvious.

I usually try to avoid using acronyms and abbreviations as identifiers, except for really common ones such as URL, HTTP and GPS. But this has shown me that even the most common acronyms can befuddle my thinking about the application domain and software design.

Posted on March 5, 2007 [ Permalink | Comments ]

A brief thought about exceptions in Java

My computer has a built-in coffee cup holder.

Sun should rename RuntimeException to StupidProgrammerException. That would clearly describe the kind of error that RuntimeExceptions are designed to report.

It would also give valuable feedback to authors of uber-frameworks whenever they start writing code like:

public class CheckedExceptionWrappedByUberFramework 
    extends StupidProgrammerException 
{
    ...
}
Posted on August 25, 2006 [ Permalink | Comments ]

Metaphor, Complexity and Postmodern Programming

St. Paul's Cathedral beyond the Millennium Bridge

A recent interview with architect Victoria Livschitz of Sun entitled Envisioning a New Language provoked some comment on the Postmodern Programming discussion group recently. The article describes "Metaphors", a new conceptual vision for a programming language and environment that will, in one stroke, eliminate all the tricky aspects of programming, letting developers concentrate purely on conceptual modelling and algorithms without worrying about hardware, networks, memory, existing systems and other pointless, technical stuff. From the interview:

I believe it's time to step back and consider the first 50 years of computing as one gigantic prototype. If we had the courage to start over, we could build a new software creation and execution model to address the software engineering challenges of the 21st century, such as distributed processing, autonomic computing, and software evolution -- natively, within one conceptual universe [my emphasis].

There's one fatal flaw with this plan: it's only a matter of time before some poor soul would have to integrate systems in Sun's "one conceptual universe" with those in Microsoft's "one conceptual universe" or the multiple, competing "one conceptual universes" that the open source and free software communities would create.

To quote Keith Braithwaite, the great thing about single conceptual universes, as with standards, is that there are so many to choose from.

For some time I bought in to Fred Brooks' notion that programming problems can be divided into "essential" — the inherent complexities of the application domain — and "accidental" — complexities produced as a by-product of using the technologies themselves. I now think that's wrong; accidental complexity is the essential complexity of programming itself, of writing programs that must fit into an existing context of people, organisations, their needs and wants and the legacy of their history.

I think trying to get rid of accidental complexity is like harking back to a classical golden age that never really existed. It can be a source of inspiration at best (think of St Paul's neo-classical architecture) but dangerous at worst (think of Mussolini's attempt to recreate the Roman empire in Abyssinia). The attempt to create a single conceptual universe to rid the world of accidental complexity is doomed to failure. Wren could not convince Londoners to live within his grand vision of what London should be. Today St Paul's is situated within the old medieval street plan rebuilt after the Fire of London and is within sight of many examples of later architectural movements, equally grand in conceptual scope yet totally incompatible with Wren's vision. That mix of styles is one reason why London is such an interesting place to be. The challenge of building to meet people's conflicting needs and wants while preserving the legacy of the past is what makes architecture so fascinating. I think it's the same for software.


Thanks to Keith Braithwaite, Steve Freeman and Martin Fowler for inspiration, bon mots and encouraging me to get this off my chest.

Posted on December 8, 2005 [ Permalink | Comments ]

Concealed Single-Letter Variable Names

The Trojan Horse
Timeo Danaos et data ferentes

While working on some old code of mine I was disappointed to find the following:

theta += omega * duration;
position = new Point(cos(theta), sin(theta));

Theta? Omega? What on earth was I thinking? Those are single-letter variable names. Single greek letter variable names. Spelling them out longhand doesn't change the fact.

I've seen the same pattern in other programs as well. Do we programmers suffer from terminal math envy? Do we think that only by writing unreadable tangles of greek symbols can we demonstrate our intellectual prowess? Even though we use the roman alphabet?

I wouldn't write code like this:

p += u*t + 0.5*a*t*t;

And writing it like this doesn't make it any better:

pea += ewe*tea + 0.5*eh*tea*tea;

This describes what is really being calculated:

position += velocity*duration + 0.5*acceleration*duration*duration;

And the first example is better as:

rotation += angularVelocity * duration;
position = new Point(cos(rotation), sin(rotation));
Posted on October 28, 2005 [ Permalink | Comments ]

Refactoring Higher Order Messaging

Delivery of mail by motorbike

In my experiments with higher order messaging I implemented convenient methods for selecting elements of a collection by queries sent to those elements: "where", "unless" and "having". These are equivalent to various uses of the Enumerable#select method but, I think, much more readable.

While working on the Ruby code that used higher order messages I found I needed an equivalent to the Enumerable#any? and Enumerable#all? methods, which query if a collection contains any or all elements that match a predicate. Using higher order messages, I wanted to write statements like those below.

if claimants.all_where.retired? then
   ...
end

if claimants.any_having.benefits > 0 then
    ...
end

It was possible to write those higher order messages, of course, but only with a lot of duplicated logic. I got around this by splitting the higher order message objects into two: one object collects results from individual elements; the other collates those results into a single total. The collector objects are higher order messages that define predicates over elements. They are common between all collators. The collators define whether those predicates are used to select a subset of the collection, or are combined by logical conjunction (all) or logical disjunction (any). Collectors are created by collators, like so:

retired = claimants.select.where.retired?

if claimants.any.having.benefits > 0 then
    ...
end

The only problem now was that the select method is already defined by the Enumerable mix-in. I needed a new name. My solution was to change the naming convention of the higher order messages and, as a by-product, allow a more natural, declarative expression of what was being calculated. In the new style, the example above looks like:

retired = claimants.that.are.retired?

if claimants.any.have.benefits > 0 then
    ...
end

Very expressive, but perhaps a little too whimsical. Only time will tell.

With this refactoring, it's very easy to add new collators. The higher order messages are all defined in a base class and new collators can be created in only a few lines of code:

class Collator
  def initialize(receiver)
    @receiver = receiver
  end
  
  def are
    return HOM::Are.new(self)
  end
    
  def are_not
    return HOM::AreNot.new(self)
  end
  
  ...    
end
  
class That < Collator
  def apply(&block)
    return @receiver.select(&block)
  end
end
  
class All < Collator
  def apply(&block)
    return @receiver.all?(&block)
  end
end
  
class Any < Collator
  def apply(&block)
    return @receiver.any?(&block)
  end
end

class Are < HigherOrderMessage
  def method_missing(id, *args)
    return @handler.apply {|e| e.__send__(id,*args)}
  end
end

class AreNot < HigherOrderMessage
  def method_missing(id, *args)
    return @handler.apply {|e| not e.__send__(id,*args)}
  end
end

Well, that's enough higher order messaging. The code is available on RubyForge in the Homer project for anybody who wants to play with it.

Posted on October 14, 2005 [ Permalink | Comments ]

Implementing Higher Order Messages in Ruby

An Envelope Manufacturing Machine

A higher order message is a message that takes another message as an "argument". It defines how that message is forwarded on to one or more objects and how the responses are collated and returned to the sender. How is this actually implemented?

Under the hood, a method for the higher order message creates and returns a new object that represents the higher order message. This object can capture any message, forward that message on to all the elements of the collection and collate the results that they return into a result that is passed back to the sender of the higher-order message.

In Ruby, messages are captured by defining a method named method_missing, which is invoked with the name and parameters of a message received by an object when the object has no explicit method for that message. That means, however, that a little more magic is required to implement higher order messages. Classes inherit a lot of methods from the Object class. These must be undefined so that they can be captured by method_missing. This is easy to do by calling undef_method. There are some methods that shouldn't be undefined: method_missing, of course, and fundamental methods that begin and end with double underscores. Here then is the base class for higher order messages:

class HigherOrderMessage
  def HigherOrderMessage.is_vital(method)
    return method =~ /__(.+)__|method_missing/
  end
    
  for method in instance_methods
    undef_method(method) unless is_vital(method)
  end
    
  def initialize(handler)
    @handler = handler
  end
end

I can then easily implement higher order message types by extending the HigherOrderMessage class and defining method_missing.

"Do" sends the captured message to all elements of a collection and returns nil:

class Do < HigherOrderMessage
  def method_missing(id, *args)
    @handler.each {|e| e.__send__(id,*args)}
    return nil
  end
end

"Where" selects elements of a collection for which the captured message returns true:

class Where < HigherOrderMessage
  def method_missing(id, *args)
    return @handler.select {|e| e.__send__(id,*args)}
  end
end

I can then add these higher order messages to all enumerable objects by adding them to the Enumerable mix-in:

class Enumerable
  def do
    return Do.new(self)
  end
  
  def where
    return Where.new(self)
  end
end

And that's it for the basics. I added more classes for the other higher order messages, and had to chain two higher order message objects to support the having predicate, but nothing more complex than that.
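
The chained pair of objects for the having predicate isn't shown above, so here is a rough, self-contained sketch of how it might work. The class names (Having, HavingComparison) and the details are my reconstruction for illustration, not necessarily what the Homer code does; the base class is adapted from the one above, with a to_s call added so the method-name check also works on newer Rubys, where instance_methods returns symbols.

```ruby
class HigherOrderMessage
  def HigherOrderMessage.is_vital(method)
    method.to_s =~ /__(.+)__|method_missing|object_id/
  end

  for method in instance_methods
    undef_method(method) unless is_vital(method)
  end

  def initialize(handler)
    @handler = handler
  end
end

# Captures the query message (e.g. "age") and returns a second higher
# order message object that captures the comparison (e.g. "> 100").
class Having < HigherOrderMessage
  def method_missing(id, *args)
    return HavingComparison.new(@handler, id, args)
  end
end

class HavingComparison < HigherOrderMessage
  def initialize(handler, query_id, query_args)
    @handler = handler
    @query_id = query_id
    @query_args = query_args
  end

  def method_missing(id, *args)
    return @handler.select {|e|
      e.__send__(@query_id, *@query_args).__send__(id, *args)
    }
  end
end

module Enumerable
  def having
    return Having.new(self)
  end
end
```

With this in place, (claimants.having.age > 100) captures the query in two steps: having.age records the query message, and > 100 records the comparison applied to each element's answer.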

Anybody got any ideas about an equivalent in Java?

Update: The code is available on RubyForge in the Homer project for anybody who wants to play with it.

Posted on October 12, 2005 [ Permalink | Comments ]

Higher Order Messaging in Ruby

Sorting Mail

I've recently been experimenting with Higher Order Messaging, a design pattern that can be used for querying and manipulating collections of objects in a purely object-oriented manner.

On the way back from JAOO last week I whipped up an implementation in Ruby and applied it to some of my Ruby code. I was very pleased with the result. Higher Order Messaging transformed laborious code that queried and manipulated objects in collections into succinct statements that clearly expressed the application rules being implemented.

What is a Higher Order Message?

A higher order message is a message that takes another message as an "argument". It defines how that message is forwarded on to one or more objects and how the responses are collated and returned to the sender. They fit well with collections; a single higher order message can perform a query or update of all the objects in a collection.

It's probably easiest to look at an example.

The following code allocates government benefits to claimants. To start, here's the definition of a benefit claimant.

class Claimant
    attr_reader :name
    attr_reader :age
    attr_reader :gender
    attr_reader :benefits

    def initialize(name, age, gender)
        @name = name
        @age = age
        @gender = gender
        @benefits = 0
    end

    def retired?
        @gender == :male && age >= 65 ||
        @gender == :female && age >= 60
    end
    
    def receive_benefit(amount)
        @benefits = @benefits + amount
    end
end

In the following code snippets, the variable claimants contains an array of Claimant objects.

Let's compare different ways of implementing the following business rule: retired claimants receive benefits of 50 a week.

Here's how to implement this rule by explicitly iterating over the list of claimants using Ruby's for statement:

for claimant in claimants
    if claimant.retired? then
        claimant.receive_benefit 50
    end
end

Here's the same rule implemented with higher-order functions — methods that accept a block:

claimants.select {|e| e.retired?}.each {|e| e.receive_benefit 50}

And finally here's the same rule implemented with higher order messages:

claimants.where.retired?.do.receive_benefit 50

I think that the code using higher order messages most succinctly expresses the business rule being executed. It expresses what is being performed and hides the details of how. In comparison, the code with explicit iteration mingles business logic within procedural control flow and the code using blocks has a lot of syntactic noise and additional block parameters.

The higher order messaging version does have messy dots between messages, but unfortunately that's an aspect of Ruby we can't change. At the risk of sounding like a frothing evangelist, I have to admit that the code would be neater in Smalltalk:

claimants where retired do receiveBenefit: 50.

I'd also prefer to use the name "each" instead of "do", but the name "each" has already been grabbed by the Ruby Enumerable mixin.

Anyway...

In the example above there are two higher order messages, where and do. The where message returns an array containing those elements of a collection for which the following message returns true. The do message sends the following message to all elements of a list and returns nil. The clause "claimants.where.retired?" returns a list of claimants that are retired, and that list is then sent "do.receive_benefit 50". Thus, the single line means the same as:

retired_claimants = claimants.where.retired?
retired_claimants.do.receive_benefit 50

Higher order messages provide a convenient framework for querying and manipulating collections of objects that has the additional benefit of clearly expressing domain logic. Thanks to Ruby's open classes, I added the higher order messages to the Enumerable module, automagically defining them for Ruby's built-in arrays and all other enumerable classes.

More Useful Higher Order Messages

I've found the following higher order messages to be useful when working with collections.

Having filters a collection by a predicate applied to the result of a query sent to each element. This actually combines two higher order messages. For example, to give an additional benefit to claimants older than 100:

(claimants.having.age > 100).do.receive_benefit 25

Unless is the opposite of "where"; it returns the list of elements for which a predicate returns false:

working_claimants = claimants.unless.retired?

In_order_of and in_reverse_order_of sort on the value of an attribute:

sorted = claimants.in_reverse_order_of.benefits

Sum calculates the total of an attribute:

total_benefits = claimants.sum.benefits

Extract returns a collection containing the values of an attribute of the elements:

names = claimants.extract.name
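
For the curious, here are self-contained sketches of how sum, extract and in_reverse_order_of might be built on the HigherOrderMessage machinery described in the implementation post. These are my reconstructions for illustration, not the Homer code. Note two concessions to newer Rubys: a to_s in the method-name check, and prepending a module to Array instead of reopening Enumerable, because modern Ruby's built-in Array#sum would otherwise shadow the higher order message.

```ruby
class HigherOrderMessage
  def HigherOrderMessage.is_vital(method)
    method.to_s =~ /__(.+)__|method_missing|object_id/
  end

  for method in instance_methods
    undef_method(method) unless is_vital(method)
  end

  def initialize(handler)
    @handler = handler
  end
end

# Totals the values returned by the captured message.
class Sum < HigherOrderMessage
  def method_missing(id, *args)
    return @handler.inject(0) {|total, e| total + e.__send__(id, *args)}
  end
end

# Collects the values returned by the captured message.
class Extract < HigherOrderMessage
  def method_missing(id, *args)
    return @handler.map {|e| e.__send__(id, *args)}
  end
end

# Sorts by the values returned by the captured message, descending.
class InReverseOrderOf < HigherOrderMessage
  def method_missing(id, *args)
    return @handler.sort_by {|e| e.__send__(id, *args)}.reverse
  end
end

module HOM
  def sum
    return Sum.new(self)
  end

  def extract
    return Extract.new(self)
  end

  def in_reverse_order_of
    return InReverseOrderOf.new(self)
  end
end

# Prepend so these take precedence over Array's own sum method.
Array.prepend(HOM)
```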

Limitations of Higher Order Messages

Of course, there are limitations to what you can do with higher order messages. Most obviously, you can only pass a single message to a higher order message. This means, for example, that unlike the Array#select method, which accepts a block, the higher-order where message cannot accept a general logical expression as a predicate to filter the collection. Instead, you must define the predicate as a method of the objects in the collection.

On the other hand, it's impractical to move all the code that manipulates an object into methods of its class. Take, for example, a report of benefits per claimant that is implemented as follows using blocks.

claimants.sort {|c1,c2| c2.benefits <=> c1.benefits}.each do |c|
    puts "#{c.name}\t#{c.benefits}"
end

Implementing that report with higher order messages would require adding a "write_benefit_report_line" method to the Claimant class. This would mix presentation logic with application logic and reduce the cohesion of the Claimant class: not a good design decision.

To avoid this problem, I added a method to Enumerable that returns a collection of adapter objects that wrap the elements of the original collection. To write the benefits report above, I would now write a BenefitsReport class that can report the benefits for one claimant and create a collection of BenefitsReport objects from a collection of Claimants, as follows:

class BenefitsReport
    def initialize(claimant)
        @claimant = claimant
    end
	
    def display
        puts "#{@claimant.name}\t#{@claimant.benefits}\n"
    end
end

claimants.in_reverse_order_of.benefits.as(BenefitsReport).do.display

But are these really limitations? After using higher order messages for a while I've come to think that they are not. The first limitation encourages you to move logic that belongs to an object into that object's implementation instead of into the methods of other objects. The second limitation encourages you to represent application concepts as objects rather than procedural code. Both limitations have the surprising effect of guiding the code away from a procedural style towards better object-oriented design.

Higher Order Messaging in Other Languages

It's easy to implement Higher Order Messaging in any dynamically typed OO language that can capture unknown messages. The pattern was originally implemented in Objective-C and Smalltalk, and I wrote the Ruby implementation in only a couple of hours on the plane. I've described the implementation details in another post. I think this pattern would be most useful in languages that do not have a succinct syntax for higher order functions, such as Python.

I also tried, and failed, to implement higher-order messaging in Java 5 using the java.lang.reflect.Proxy class. If anybody can implement higher-order messaging in Java I'll be very impressed. Please let me know if you do.

Update: I've written a short post about how to implement this in Ruby.

Update: The code is available on RubyForge in the Homer project for anybody who wants to play with it.

Posted on October 6, 2005 [ Permalink | Comments ]

Don't Judge a Book by its Title

old-books.jpg

While talking about our favourite software books over lunch today, we realised that the books people thought were particularly good at communicating general principles and concepts often had titles that made them appear limited to extremely specific technologies. None were of the "J2EE in 24 Hours for Complete Morons" variety, but they all had titles that obscured the good aspects of the book and probably put people off buying it.

Here's a few we came up with:

  • Unix Network Programming by W. Richard Stevens et al., is a good introduction to networking and network programming whether you use Unix or not.
  • Practical Cryptography by Niels Ferguson and Bruce Schneier does not really focus on cryptography but instead contains some excellent advice about the design and implementation of software systems.

Do you know of any other books that could easily be missed by taking the title at face value? Recommend them in the comments, please.

Posted on September 5, 2005 [ Permalink | Comments ]

When Metaprograms Attack!

Shark Attack!

One of the most cunning, or, to be honest, most stupid, bits of code I ever wrote was a Tcl program that wrote itself. It was a browser for a distributed object directory that would let the user examine and invoke objects across the network.

The browser queried the directory service, used CORBA's reflection mechanism to interrogate objects and wrote new classes to represent the remote objects that it found and the GUI components to display objects, get and set their attributes, invoke methods and display returned values. The entire program was two or three screenfuls of code, but wrote itself to be much larger as it ran.

Unfortunately it contained a bug that occurred infrequently and seemingly at random. I couldn't work out where the bug was in the code that the program wrote because I couldn't put breakpoints into code that didn't exist until runtime. That meant that I couldn't find the real bug, the bug in the code that wrote the code that contained the bug that actually occurred in the running program.

Eventually I rewrote the program in a more traditional style. What a shame. It was so very nearly the coolest code I ever wrote.

Posted on March 23, 2005 [ Permalink | Comments ]

Encapsulation is not Information Hiding

A Gemini space capsule in orbit

I have recently had problems integrating code written on other projects into my own application. In each case, the problem was caused by a reference to an external resource — a file name, for example — being hard-coded in the class of some object in an internal package within the code I was trying to use. The explanation for the design of these packages, and why I shouldn't change things, was that the dependency was "encapsulated" within the object in question.

Many articles and books use the word "encapsulation" as a synonym for "information hiding". However, encapsulation and information hiding are two separate, orthogonal concepts:

  • Information hiding conceals how an object implements its functionality behind the abstraction of the object's API.
  • Encapsulation ensures that the behaviour of an object can only be affected through the object's API.

Information hiding lets one build higher level abstractions on top of lower level details. Good information hiding lets one ignore details of the system that are unrelated to the task at hand.

Encapsulation ensures that there are no unexpected dependencies between conceptually unrelated parts of the system. Good encapsulation lets one easily predict how a change to one object will, or will not, impact on other parts of the system. Achieving encapsulation requires the use of common coding techniques: defining immutable value types, avoiding global variables and singletons, copying collections or mutable value objects when storing them in instance variables or returning them from methods, and so forth.
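
As a tiny illustration of the defensive-copying point (my own example, with hypothetical class names), compare an object that stores a caller's mutable array directly with one that copies it on the way in:

```ruby
# Poorly encapsulated: the caller can change this object's behaviour
# through the array it passed in, without going through the API.
class LeakyTeam
  def initialize(members)
    @members = members
  end

  def size
    @members.size
  end
end

# Encapsulated: the array is copied on the way in, so later changes to
# the caller's array cannot affect this object's behaviour.
class Team
  def initialize(members)
    @members = members.dup
  end

  def size
    @members.size
  end
end

names = ["alice", "bob"]
leaky = LeakyTeam.new(names)
solid = Team.new(names)
names << "carol"

leaky.size  # the external change leaks in: now 3
solid.size  # unaffected: still 2
```

Nothing outside Team can affect its behaviour except through its API, which is exactly what makes its interactions with the rest of the system predictable.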

When hiding information it is important that the right information is hidden in the right place. The problems I encountered were caused by information about the environment of the application that should have been specified at the application scope being hidden in a lower-level class that should have been passed that information rather than knowing it a priori.

The problem I have with using the word "encapsulation" to mean "information hiding" is that encapsulation is always a good thing to do but hiding information in the wrong place is not. Suppose I say:

"Let's encapsulate the exact data structure used by the cache in the CachingStockLoader class."
"Let's encapsulate the name of the application's log file in the CalculationProgressListener class."

That sounds good to me, but is it really?

I find it much easier to make good decisions when I am clear about when I am doing encapsulation and when I am doing information hiding. For example, I would restate the statements above to be explicit that they really refer to information hiding:

"Let's hide the exact data structure used by the cache in the CachingStockLoader class."
"Let's hide the name of the application's log file in the CalculationProgressListener class."

That will make me realise that the first decision is correct but the second is suspect. Code that loads stocks should not have to care whether the loader caches previous requests in a hash table or red-black tree or whatever. The name of the application's log file, on the other hand, should probably not be hidden away in that CalculationProgressListener class; it should be specified by the application and passed to instances when they are constructed.

I find it essential to keep the distinction between "encapsulation" and "information hiding" in mind when thinking about design decisions or discussing them when pair programming. When I use the wrong word to think or talk about a design decision I find it harder to realise when the decision is incorrect. When I admit that I am doing information hiding, not encapsulation, I can better decide whether the information I am hiding should be hidden at all and if so, where I should hide it.

Posted on March 17, 2005 [ Permalink | Comments ]

Tell, Don't Log

Dictation

Logging is necessary for programs that run autonomously but I have often found that its implementation has had a bad effect on the maintainability of my code. Pulling the code that does logging out of the objects that generate log-worthy events helps a lot. In C#, events and delegates provide good support for such a separation.

Why do I find that logging makes my code hard to maintain? Firstly, logging code ends up scattered throughout classes, adding clutter. Secondly, logging reduces cohesion: objects must build log messages in addition to performing their functional behaviour. Finally, it is difficult to reuse classes in different situations if they log via static functions or logger objects, which is a common way of using log4j and related frameworks.

In my current project we are using C# events and delegates to separate logging from application functionality. Objects fire events to report that significant things have happened. Those events are routed (via delegates) to objects that know how to format the events into a log, or do something else with them.

For example, I might have an object that performs some complex and lengthy analysis upon an index of stocks. I want to log errors that happen during analysis, and also log progress so that I can see how the analysis is progressing when I run the tool at the command line.

public delegate void ProgressHandler( Stock stock, int current, int total );
public delegate void ErrorHandler( Stock stock, Exception error );

public class StockAnalyser {
    public event ProgressHandler Progress;
    public event ErrorHandler Error;
    
    public StockAnalyser( IDataAccess dataAccess, IClock clock ) ...
    
    public void AnalyseStocks( string indexCode, IAnalysisStorage store ) {
        StockIndex index = dataAccess.LoadIndex(indexCode);

        int i = 0;
        foreach ( Stock stock in index ) {
            try {
                double analysis = Analyse(stock);
                store.Add( stock, analysis );
            }
            catch( AnalysisException error ) {
                AnnounceError( stock, error );
            }
            
            AnnounceProgress( stock, i, index.Size );
            
            i = i+1;
        }
        }
        
        store.Save();
    }

    protected virtual void AnnounceProgress( Stock stock, int current, int total ) {
        if (Progress != null) Progress(stock, current, total);
    }

    protected virtual void AnnounceError( Stock stock, Exception error ) {
        if (Error != null) Error(stock, error);
    }
    
    private double Analyse( Stock stock ) ...

    ...
}

In my command-line application, I can connect the events to logging methods like this:

public class App {
    public static int Main( string[] args ) {
        IDataAccess dataAccess = new DataEnvironment(... parsed from args ...);
        IClock clock = new SystemClock();
        string indexCode = ... parsed from args ...
        IAnalysisStorage store = ... created from args ...
        
        StockAnalyser analyser = new StockAnalyser(dataAccess, clock);
        
        // Hook the logging functions to the events
        analyser.Progress += new ProgressHandler(LogProgress);
        analyser.Error += new ErrorHandler(LogError);

        try {
            analyser.AnalyseStocks(indexCode, store);
            return 0;
        }
       catch( Exception e ) {
           Console.Error.WriteLine( "failed to start analysis: {0}", e );
           return 1;
        }
    }
    
    public static void LogProgress( Stock stock, int index, int count ) {
        Console.Out.WriteLine( "{1} / {2}: {0}", stock.Code, index, count );
    }
    
    public static void LogError( Stock stock, Exception error ) {
        Console.Error.WriteLine( "{0}: {1}", stock.Code, error );
    }
}

Using events makes it easy to replace logging with something more appropriate when the object is run in a different situation. For example, in a GUI environment I can connect the progress event to a progress bar and the error event to a method that collects errors to be shown to the user after the analysis has finished.

Of course, logging frameworks like log4j or log4net provide a lot of useful functionality: formatting, log rolling, different logging back-ends, etc. etc. I don't want to throw the baby out with the bathwater when removing logging code from my domain objects. Luckily, there is no reason that events and a big logging framework cannot work together: application objects can send events to objects that log those events through the framework. That way I get the best of both worlds.

Posted on January 18, 2005 [ Permalink | Comments ]

OOPSLA'04: Static Classes or Dynamic Objects?

More reflections from OOPSLA'04.

In the same session as the Mirrors paper were two presentations on automated tools to help programmers work with OO software.

One presented a tool for automatically refactoring class and interface hierarchies to improve cohesion based on usage patterns of class features. The result? For simple type hierarchies the tool neatly factored out distinct concepts. For example, given some code that used the Java collection framework it detected that some client code only read from collections and other code also modified them, and so extracted an "immutable collection" interface that was the supertype of a "mutable collection" interface. Cool! However, when given a larger type hierarchy the result was less helpful — a straightforward tree was transformed into lots of classes connected by a tangle of multiple-inheritance relationships.

Another presented a tool that automatically generated UML class diagrams from code to help maintenance programmers understand the system they are working on. The novel contribution of this work was an algorithm to determine the binary relationships between classes from the unidirectional references in the code. However, my experience of maintenance — most recently, I've spent the last six months performing maintenance on a million-line plus Java system — is that static class diagrams are not very helpful. Most of them would show that classes A, B and C use interface I that is implemented by classes X, Y and Z. That's of little use when you're trying to diagnose running code.

Both of these tools have the same limitation, in my opinion: they concentrate on the static relationships between classes instead of the dynamic relationships between objects. When it comes to refactoring, I find it is usually better to refactor to composition rather than inheritance. When doing maintenance I need help grasping the dynamic interaction between objects in the running system; static relationships are visible in the code and easy to understand. I would find a lot of use for a tool that would detect where I could replace inheritance with composition and a tool that would create instance diagrams to visualise a running system.

However, I'm not a huge fan of UML. When sketching out instance diagrams I find an "instances with interfaces" notation, borrowed from the RM-ODP, very useful and easy for people to understand.

instance-sketch.png

In these diagrams, the class of an instance is indicated by the name inside the instance bubble and the interface through which an instance is used is indicated by the name next to the "T" of the interface poking out of the instance.

I like this notation because it's quick to sketch on a whiteboard during a design discussion or on paper while debugging. I can easily extend the notation to express attributes of interest depending on what I want to portray. I either name the relationships between instances, as above, or use instance variable names. In the example I've also used dotted lines to represent that one object is passed to another as a message parameter, and only used for the duration of the method. I sometimes distinguish public vs. private relationships by starting the arrows of public relationships from the edge of an instance bubble and those of private relationships from inside the instance bubble.

Posted on November 5, 2004 [ Permalink | Comments ]

OOPSLA'04: SeaSide BOF

seaside.jpg

More reflections on OOPSLA'04.

Before heading to the Vancouver Aquarium for the main social event of the conference I stuck my head into a BOF session about the SeaSide web application framework.

Wow!

SeaSide makes writing web apps startlingly simple and straightforward. In comparison, web app frameworks in the Java world, such as JSP or Struts, and even Ruby on Rails look like dinosaurs.

It does so by turning the conceptual model of most web-app frameworks on its head. Instead of dispatching requests from the client to objects that send back responses, a SeaSide application is written as a single thread of control that sends requests to the user and waits for their response. The application processes response data using the normal control flow statements that programmers are used to: no need for complex configuration files mapping forms to actions to JSP pages, etc etc. I commented, only partly tongue-in-cheek, that the configuration file of a Struts app would be larger than the entire domain logic of a typical SeaSide application doing the same thing.

SeaSide's model of a single thread of control per user is only a conceptual model. Under the hood, SeaSide takes care of what happens when the user navigates backwards and forwards or creates multiple browser windows to interact with the same application. The important thing is that the programmer doesn't have to care: SeaSide maintains the simple programming model as an abstraction above its complex internal implementation. SeaSide also makes it easy to run transactions, compose HTML pages from components and bind HTML links to program actions, all in a natural Smalltalk style.

SeaSide is able to present such a simple API because it takes advantage of "esoteric" features of its implementation language/platform, Smalltalk, such as continuations and closures. Java, in comparison to Smalltalk, is designed on the assumption that programmers using the Java language are not skilled enough to use such language features without making a mess of things, and so the language should only contain simple features to avoid confusing our poor little heads. Paradoxically, the result is that Java APIs are overly complex, and our poor little heads get confused anyway. SeaSide is a good demonstration that powerful, if complex, language features make the job of everyday programming easier, not harder, by letting API designers create elegant abstractions that hide the complexity of the problem domain and technical solution.

Posted on November 2, 2004 [ Permalink | Comments ]

OOPSLA'04: Mirrors


More reflections on OOPSLA'04.

Gilad Bracha presented a paper cowritten with David Ungar on Mirrors, an architectural style for reflective APIs that cleanly separates meta-level and domain-level concerns — the authors term this "stratification".

Mirrors would address problems I have encountered recently: business logic had called meta-level getters and setters intended for use by the persistence layer and evolved over time into a confusing tangle of "train-wreck" statements. I imagine an OR mapper using a "persistence mirror" to view the persistent state of an object, rather than calling the object directly, and thereby dissuading programmers from using meta-level calls in domain code.

Posted on November 1, 2004 [ Permalink | Comments ]

Cryptic and Coffee Time


Reflection. When it was introduced in Java 1.1 I loved it. It made Java almost as flexible as real dynamic languages like Smalltalk, Python or Ruby. I used reflection to reduce boilerplate code and write functions to manipulate objects in generic ways. However, over time I backed off from my love of reflection. I found that it went against the "grain" of the language and mostly made life more difficult.

While travelling to my current client by train, I've started doing the cryptic crossword in the Metro, a newspaper that's given away free to London commuters. This got me thinking again about reflection and meta-level abstractions and how cryptic crosswords, oddly enough, can tell us something about programming.

Let's take a look at a couple of recent clues:

  1. Come in with article in container to amuse (9).
  2. Sad soldier in overturned vehicle (6).

Cryptic crossword clues are difficult because they mix concrete and meta levels without clearly distinguishing the two. Part of the clue is a synonym of the answer, describing the concrete concept; the rest of the clue describes the word that is the answer, the meta level. Once you have identified which part of the clue describes the concrete concept and which describes the meta level, the clue is easy to solve; until then, it is a puzzle.

The same applies to software. Mixing concrete and meta levels in the same code makes it hard to understand and therefore hard to maintain.

Reflection is a meta-level construct that is easy to abuse. Code navigation tools cannot statically analyse reflective code, which makes it difficult to determine the system's behaviour without running the system in a debugger and following each path of execution, a tedious process. However, transitions between concrete and meta levels do not require reflection. In an object oriented design, some objects are meta-level representations of other objects and meta-level transitions can be performed simply by calling a method or instantiating a class.

For example, persistence is a meta-level concept. Some object-relational mapping tools (no names, no pack drill) require that persistent objects expose state to the persistence mechanism through public getter and setter methods. These getter and setter methods are meta level concepts: they don't express domain concepts but expose how the domain concepts are implemented as Java objects. The danger is that programmers then think of these getters and setters as part of the object's domain-level API and invoke them in domain logic in other objects instead of implementing that logic in the class that actually owns the data. Domain logic then becomes separated from the data that it acts upon, increasing coupling between different parts of the code and making it harder to understand the intent of the logic. The benefits of an OO language are lost and the design regresses to a convoluted, procedural style.
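To make this concrete, here is a small sketch of the two styles side by side. The Account class and method names are my invention for illustration, not taken from any particular OR mapper:

```java
// Illustrative only: a hypothetical persistent class with
// persistence-style accessors alongside a domain-level method.
class Account {
    private long balanceInPence;

    // Meta-level accessors, intended only for the persistence layer:
    // they expose how the domain concept is implemented as a Java object.
    public long getBalanceInPence() { return balanceInPence; }
    public void setBalanceInPence(long b) { balanceInPence = b; }

    // Domain-level behaviour: the logic lives with the data it acts upon.
    public void deposit(long amountInPence) {
        balanceInPence += amountInPence;
    }
}

class DepositService {
    // Procedural style: domain logic leaks out through the meta-level API
    // and becomes separated from the data it acts upon.
    void depositProcedurally(Account account, long amount) {
        account.setBalanceInPence(account.getBalanceInPence() + amount);
    }

    // Object-oriented style: tell the object what to do.
    void depositViaDomainApi(Account account, long amount) {
        account.deposit(amount);
    }
}
```

Both methods have the same effect, but only the second keeps the domain logic in the class that owns the data.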

A clear boundary between concrete and meta levels is key to guide understanding. In crossword puzzles, finding that boundary is what makes the answer obvious. In code, clear boundaries make it easier to understand what the code does, and therefore maintain the system.

In code, meta-level transitions fit naturally at tier or layer boundaries, for example, between the business logic and persistence mechanism, between the business workflow and user-interface or between an application and the transport protocol used for communication.

Let's look at those clues again, with the meta-level highlighted. Below, one side of the equals sign is a synonym of the answer, while the other side, in brackets, describes the words that make up the answer.

  1. [Come in with article in container] = to amuse
  2. Sad = [soldier in overturned vehicle]

Now it's much easier to work out the answers:

  1. "Come in" is a synonym for "enter", article is used in its grammatical meaning and in this case refers to "a", and a "tin" is a kind of container. So the clue tells us that the answer is a synonym for "to amuse", spelled out by "enter" followed by "a" inserted into the word "tin": Entertain.
  2. A "GI" is a soldier. "Overturned" means in reverse and indicates that a word for a vehicle will be spelled backwards. In this case the vehicle is a cart giving the letters "trac". So the answer is a synonym of "sad" spelled out by inserting "gi" into "trac": tragic.
Posted on September 22, 2004 [ Permalink | Comments ]

Wayne's World Methods

Excellent! ... NOT!

I've just been bitten by a common coding style: methods that take an argument which negates the meaning of the method's name. This makes the implementor's job easier but makes code that calls the method harder to read, and therefore makes maintenance more difficult than it should be.

I came across this style when using a test framework to write functional tests that drive the GUI of a large application. The framework provides methods to find components by name, make assertions upon their state and fake user activity. The framework provided several methods like:

void assertButtonIsEnabled( String buttonName, boolean isEnabled );

Which resulted in test code that looked like:

...
assertButtonIsEnabled( "editButton", false );
...

I was pairing with Dan Abel at the time, and we were both immediately reminded of an annoying catchphrase from the Wayne's World movie. We couldn't help reading the assertion as "assert that the edit button is enabled... not!" Not only was this aggravating; the assertions written in this style made the tests harder to read. A better approach would have been to define two methods:

void assertButtonIsEnabled( String buttonName );
void assertButtonIsDisabled( String buttonName );

And then the test would have read:

...
assertButtonIsDisabled( "editButton" );
...
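The two intention-revealing methods cost the implementor almost nothing, because they can share a single private implementation; the boolean parameter simply never appears at the call site. A sketch, with the framework internals replaced by an invented in-memory stand-in:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in for the GUI test framework's internals.
class ButtonAssertions {
    private final Map<String, Boolean> buttonStates = new HashMap<String, Boolean>();

    public void registerButton(String name, boolean enabled) {
        buttonStates.put(name, enabled);
    }

    // Two intention-revealing public methods...
    public void assertButtonIsEnabled(String buttonName) {
        assertButtonState(buttonName, true);
    }

    public void assertButtonIsDisabled(String buttonName) {
        assertButtonState(buttonName, false);
    }

    // ...sharing one private implementation, so the implementor still
    // writes the logic only once.
    private void assertButtonState(String buttonName, boolean expected) {
        Boolean actual = buttonStates.get(buttonName);
        if (actual == null || actual != expected) {
            throw new AssertionError(buttonName + " expected to be "
                + (expected ? "enabled" : "disabled"));
        }
    }
}
```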

I've written Wayne's World methods in the past, and I'll use them all the time now I've seen the problems they cause... Shaaah! And monkeys might fly out of my butt!

Update: here are two good articles about the problems caused by boolean parameters and, more importantly, what to do instead:

Posted on August 9, 2004 [ Permalink | Comments ]

Sometimes it's right to be wrong


I have recently been part of a team bug-fixing and extending a huge distributed system that has been in constant development for several years. The hardest problem we had in understanding this code was a lack of consistency — different classes that had similar responsibilities all performed those responsibilities in significantly different ways. Some of the code was pretty awful and needed refactoring, but everybody who had added to the system had had their own ideas of the "right" way in which things should be done. The end result was that there no longer was a right way to do things, just lots of wrong ways.

When writing code that must be maintained by others it is better to be consistently wrong than to be inconsistently right. For example, if you have to modify a large codebase in a limited amount of time and your changes affect code that is poorly designed you have three options:

  1. Refactor the rest of the codebase to work the way you think it should, and then make your changes.
  2. Write your code to work in the same way as the rest of the codebase, even though it's not the best way it could be done.
  3. Write your changes in the way you think is best, despite what the rest of the code does, and leave the rest of the code untouched.

When pressed for time the first option is not practical, but the second option just propagates bad design decisions. No matter what the situation, however, the third option is always the wrong choice.

If you cannot refactor, be consistent. Consistency makes code easier to understand and maintain, even if the code is consistently awful.

Posted on July 5, 2004 [ Permalink | Comments ]

Partially Constructed Objects

The Titanic under construction

Painful experience has taught me that it's a bad idea for an object to pass a reference to itself out from its own constructor. The danger is that until the constructor has finished running the object is only partially constructed. A self reference that leaks out of the constructor can end up being used to call back into the object while it is in an inconsistent state, thereby breaking the contract of its interface. My favourite solution is to use third-party composition: if a relationship exists between two objects, some other object should establish the relationship.

A common case in which I exposed references to partially constructed objects was when I defined composite objects in which a parent created its children in its constructor and children held back-references to their parents.

For example:

class Parent {
    List children = new ArrayList();
    String name;
    
    public Parent(String name) {
        this.children.add( new Child(this,"child1") );
        this.children.add( new Child(this,"child2") );
        this.name = name;
    }

    public String getName() {
        return name;
    }

    ...
}

public class Child {
    Parent parent;
    String name;

    public Child( Parent parent, String name ) {
        this.parent = parent;
        this.name = name;
    }
    
    public String getName() {
        return name;
    }

    ....
}

Code like this contains a subtle trap that can trick an unwary programmer into introducing a bug during later maintenance. The parent reference is passed to the child before the parent is fully constructed. If the child constructor is changed to call back to the parent, it will call into a partially constructed object before that object can fulfil the preconditions of the called method.

Here's a contrived example. Because the parent adds its children before assigning its own name, parent.getName() returns null inside the child's constructor, and the attempt to build fullName throws a NullPointerException.

public class Child {
    Parent parent;
    String fullName;

    public Child( Parent parent, String name ) {
        this.parent = parent;
        this.fullName = parent.getName().concat( "." ).concat( name );
    }
    
    public String getName() {
        return fullName;
    }

    ...
}

The problem is easy to spot in this tiny example, and easy to avoid. But in a large system it can be difficult to determine when the object referred to by a method parameter is constructed, and therefore which parameters of a method refer to partially constructed objects that cannot safely be invoked. The programmer will assume that all references passed to a method refer to valid objects. After all, it should be impossible to obtain a reference to an unconstructed object, right?

The best solution I have found is to use third-party composition. If there is a relationship between two objects, a third object — often the object that constructs the two related objects — should establish that relationship. Because the related objects are constructed before the relationship is established, they can safely call each other's methods.

For example, if the Parent/Child example was rewritten in this style, the parent/child relationship would be established externally to both the parent and the child:

Parent parent = new Parent("parent");
Child child1 = new Child("child1");
Child child2 = new Child("child2");

parent.addChild(child1);
parent.addChild(child2);

This style of composition makes the architecture of the system easier to understand. The relationships between objects at the same architectural level are explicitly defined in the same part of the code, instead of being scattered around the constructors of related objects.

This style also makes it easier to cleanly implement application configuration. The system can interpret a configuration file as directives to instantiate objects and plug them together.
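For completeness, here is one way the reworked classes might look. This is a sketch of the style, not the original code; addChild establishes the back-reference only after both objects are fully constructed:

```java
import java.util.ArrayList;
import java.util.List;

class Parent {
    private final List<Child> children = new ArrayList<Child>();
    private final String name;

    public Parent(String name) { this.name = name; }

    public String getName() { return name; }

    // Called by a third party after both objects are constructed,
    // so the child can safely call back into the parent.
    public void addChild(Child child) {
        children.add(child);
        child.setParent(this);
    }
}

class Child {
    private Parent parent;
    private final String name;

    public Child(String name) { this.name = name; }

    void setParent(Parent parent) { this.parent = parent; }

    // Safe: by the time this can be called, the relationship has been
    // established between two fully constructed objects.
    public String getFullName() {
        return parent.getName() + "." + name;
    }
}
```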

Posted on April 21, 2004 [ Permalink | Comments ]

Joking Aside

A very scary clown

Anybody who knows me will tell you that I find it hard to pass up an opportunity to make a bad joke, but even I have learned that there's a time and a place for humour and writing source code is not it. Programmers spend most of their time working with and thinking about language, so it's only natural that they find plenty of opportunity to sneak puns and other word play into source code. But jokes in source code are a minefield for programmers who have to maintain the code. Every joke, by its very nature, confuses the reader and slows down maintenance work.

A long time ago I wrote some marshalling/demarshalling code for a middleware platform. I made the mistake of leaving a groanworthy pun in the source code which caused much lack of merriment when rediscovered.

A bit of background on marshalling: data sent between machines of different endianness must be byte swapped during communication. In some protocols, data is always sent in "network byte order", traditionally big endian, which is inefficient when the sender and receiver both have the opposite byte order. A better scheme is for the sender to include a field in the header that indicates the endianness of the following data. The sender does not have to byte swap outgoing data at all and the receiver must only byte swap incoming data if the endianness of the data is different than its native endianness.

Some systems, CORBA for example, compile the middleware libraries to know their endianness and send a boolean flag indicating big-endian or little-endian. Because we were planning to port the system to different architectures, I didn't want to maintain the endian flag for each architecture -- that sounded like too much manual work. Instead, my protocol used a 2-byte field in the header of each message. The sender set the field to a well-known numeric value, each byte of which was different. The receiver checked the received value of the field against the well-known value. If they were different, the data needed to be byte swapped.

I called the header field 'elvis' and used 57005 for the well-known numeric value. Why? Because the code that detected the byte order then looked like this:

void receive( Message message ) {
    short elvis = message.readShort();

    if( elvis != (short)0xDEAD ) {
        message.setByteSwapped(true);
    }
     ...
}

HAHAHAHA hahaha ... ha ha ... ha ... ha? ... err... maybe you had to be there.

Amusing or not, the joke makes the code far too hard to understand. I should have written it like this:

static final short LOCAL_BYTE_ORDER = 0x1234;
 
...
 
void receive( Message message ) { 
    short messageByteOrder = message.readShort();
     
    if( messageByteOrder != LOCAL_BYTE_ORDER ) {
        message.setByteSwapped(true);
    }
    ...
}

The difference is obvious. It may be less amusing to the programmer writing the code but, more importantly, it is less infuriating for the programmer maintaining the code.

Posted on April 3, 2004 [ Permalink | Comments ]

Call With Result


My new version of the Scene Beans animation framework has a design inspired by functional programming: it represents animations as immutable data structures, which facilitates the coordination of concurrent animation and rendering and provides other benefits. Very elegant in principle, but I occasionally need to construct circular structures of objects to represent repeating behaviours. The semantics of Java make it impossible to compose circular structures of immutable objects. What I really need is a Java implementation of call-with-result, a primitive of HScheme that "allows you to call a function with an argument that contains the result of the function". That's right, in HScheme you can pass the result of a function call as an argument to the very call that calculates that result! How frustrating: what I would normally consider an intriguing but esoteric language mechanism is exactly what I need but don't have.

Update: O'Caml provides exactly what I need as well: the let rec statement lets one create circular immutable data structures. Perhaps Java will provide something similar in some future version but, judging by how limited Java's generics are compared to those of O'Caml and other languages, I don't hold much hope.

Posted on March 1, 2004 [ Permalink | Comments ]

Saying nothing is not the same as saying "nothing"

Three Wise Monkeys

The DynaMock mock objects API allowed programmers to specify that a method had no arguments by not specifying anything about expected arguments at all. This confused users of the library. The new version of the API, now being developed under the jMock banner, forces users to specify expected arguments in detail. Users prefer the clarity, despite the extra typing, because it helps them avoid subtle errors.

I recently answered a question on the Mock Objects mailing list. A user of the library had thought that passing no explicit argument constraints to an expectation set a mock object up to expect any arguments. In fact, it sets the mock object up to expect no arguments.

mock.expectAndReturn( "methodName", result );

Is the same as:

mock.expectAndReturn( "methodName", NO_ARGS, result );

And different from:

mock.expectAndReturn( "methodName", ANY_ARGS, result );

In a spooky example of synchronicity, I had been hacking on the implementation of that syntactic sugar the previous afternoon and had concluded that I should remove it because (a) it was an extra set of overloaded methods that needed to be maintained and I'm lazy and (b) it was confusing because it hid the intent of the programmer instead of expressing it clearly. The user request just confirmed my opinion that this particular piece of syntactic sugar is a mistake.

The new "hot mock" API will require explicit specification of argument constraints when setting up expectations. I showed an example to people at a recent XtC session.

mock.method("methodName").with(eq(arg1),same(arg2)).willReturn(result)
    .expectOnce();

I expected complaints from people who didn't like the extra typing required, but to my surprise, they actually preferred code with explicit constraints over the existing syntactic sugar. So, users get more readable tests and the jMock team gets fewer overloaded methods to maintain. It's a win-win situation.

I've just realised that the definition of 'nothing' has tripped me up before. The concept of nothingness seems to be more complex than is immediately obvious. Something to keep in mind in future designs.

Posted on January 7, 2004 [ Permalink | Comments ]

Blocks and Glue

Blocks and Glue

When writing Ruby libraries it is tempting to use code blocks as a way for the user to hook into the library or specify custom behaviour. This leads to a lot of code duplication, since blocks cannot easily be factored into a class hierarchy or mixins. Code blocks are no substitute for a good object model. Blocks are good for creating control structures and "glue" between objects, but should be refactored into methods or classes when they become longer than one or two lines.

When writing and using the Ruby dynamic mock library I used code blocks as the way that the programmer defined expected method signatures and checked expectations. For example:

mock.expect :set_property do |name,value|
    assert_equals( "Content-Type", name )
    assert_equals( "text/html", value )
    nil
end

However, using blocks in this way has a number of disadvantages:

  • They cannot be turned into a useful representation in error messages
  • There are many duplicated assert statements among expectations
  • Expectations cannot easily be reused
  • Expectations aren't represented as a named concept; they just exist as undifferentiated lines of code

Constraints solved all these problems: they name what they are used for (e.g. IsEqual vs. IsSame), can easily be reused and combined (and, or, not), can describe themselves (toString). With Java dynamic mocks, the expectation above would be written as follows, where eq is a factory method that creates an IsEqual constraint.

mock.method("setProperty")
    .with(eq("Content-Type"),eq("text/html")).willReturn(null)
    .expectOnce();

Factoring out the concept of a constraint into the Constraint interface and various implementations gave us additional benefits for free: we could use Constraints to set up different expectations based on argument values, for example.
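The idea can be sketched in a few lines. This is a simplified illustration of the concept; the real jMock interfaces differ in their details:

```java
// Simplified sketch of the Constraint idea; not jMock's actual API.
interface Constraint {
    boolean eval(Object o);
    String describe();   // self-description for error messages
}

class IsEqual implements Constraint {
    private final Object expected;
    public IsEqual(Object expected) { this.expected = expected; }
    public boolean eval(Object o) { return expected.equals(o); }
    public String describe() { return "equal to " + expected; }
}

// Constraints compose: And, Or and Not are just more Constraints.
class And implements Constraint {
    private final Constraint left, right;
    public And(Constraint left, Constraint right) {
        this.left = left;
        this.right = right;
    }
    public boolean eval(Object o) { return left.eval(o) && right.eval(o); }
    public String describe() {
        return left.describe() + " and " + right.describe();
    }
}
```

Because each constraint is a named, self-describing object, it can be reused across expectations and reported meaningfully when a test fails.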

I've come to the conclusion that blocks are good for creating control structures and "glue" between objects, but should be refactored into methods or classes when they become longer than one or two lines.

Posted on December 31, 2003 [ Permalink | Comments ]

Arose buy any other name wood smell as suite.


Programming is sometimes like writing poetry (although with perhaps slightly less social stigma): the choice of words is of utmost importance. Using the right word in the right place can make your code immediately comprehensible, even elegant. Conversely, choosing the wrong word can make code unbelievably complex. You have to be especially careful of words with more than one meaning, even more so when those meanings are similar but subtly different.

When writing the dynamic mock library I used the word "call" to refer to an incoming method call that was to be checked and given a mock behaviour by a mock object. A dynamic mock object intercepted each method call, created a Call object to represent the call, and passed it to a Callable object that checked expectations and stubbed their behaviour. A Callable had a method called "call" that took one argument, a Call called "call". Err... got that? Or are you getting as confused as I did? Perhaps some code will help...

interface Callable {
    public Object call( Call call ) throws Throwable;
}

Here's how it is called:

Call call = new Call( ... );
Callable callable = findMatchingCallable(call);
return callable.call( call );

That naming scheme is incredibly confusing. The word "call" is being used as a noun and a verb, to refer to a class, a variable holding a reference to an instance of that class, and to a method. And that's ignoring that "called" also means "named"!

As part of the refactoring of the jMock codebase we renamed Call to Invocation and the call method to invoke:

interface Invokable {
    public Object invoke( Invocation invocation ) throws Throwable;
}

Now the framework uses distinct nouns and verbs, and code is much easier to read. More importantly, members of a pair can talk about the code while programming without getting each other completely confused.

Posted on December 30, 2003 [ Permalink | Comments ]

You can't make something out of nothing


Some time ago, in my incarnation as an academic researcher, I wrote a Java framework for animating 2D graphics called Scene Beans. While adding features to the framework I reused a Null Object class in a way that exposed the class to the user of the framework. This caused unexpected problems during later development. This taught me that Null Objects should not be used to model domain concepts but should be treated as internal implementation details of a framework and hidden from client code.

The SceneBeans framework provides a scene graph data structure that defines a graphical scene as a directed graph of polymorphic scene-graph "nodes" that are processed by Visitors. I used the Null Object pattern to mark the edges of the Scene Graph, and combined the Null Object and Visitor patterns so that the Null Object nodes did not double dispatch to the visitor — as far as the visitors knew, the Null Object nodes did not even exist.
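The combination can be sketched like this. The class and method names below are my invention for illustration, not the actual SceneBeans code:

```java
interface SceneVisitor {
    void visitShape(Shape shape);
}

interface SceneNode {
    void accept(SceneVisitor visitor);
}

class Shape implements SceneNode {
    public void accept(SceneVisitor visitor) {
        visitor.visitShape(this);   // ordinary double dispatch
    }
}

// Null Object marking the edge of the graph: it deliberately does NOT
// dispatch to the visitor, so visitors never know it exists.
class NullNode implements SceneNode {
    public void accept(SceneVisitor visitor) {
        // do nothing
    }
}
```

As long as NullNode is purely an internal edge marker, visitors can be written without any null checks; the trouble described below started when that same class was given a user-visible meaning.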

One of the scene graph classes was a "switch" node that would contain multiple subgraphs but only draw one at a time. I later needed to selectively show or hide parts of the scene. I realised that I could do this by putting a subgraph and a Null Object node into a switch node; switching between the subgraph and the Null Object would have the effect of showing or hiding the subgraph. This only required changing the SceneBeans file format so that files could explicitly specify Null Objects in the graph. Cunning!

Or so I thought...

I noticed a problem when we wrote a visitor to write a scene graph into a file in our format. The visitor never wrote the Null Objects because they did not double dispatch to it. This was fine for Null Objects at the edge of the graph, but was the wrong behaviour for Null Objects that had been explicitly created to be included in "switch" nodes.

The problem was that my modifications had changed the way that the framework used Null Objects, from using them as an internal implementation detail to mark the edges of the scene graph to using them to represent the concept of "draw nothing" in a user-visible way.

Solutions in reverse order of elegance:

  1. Explicitly check for Null Objects in the save-to-file visitor. (A quick-and-dirty fix that I ruled out for obvious reasons)
  2. Make the Null Objects call back to a visitNull method on the visitor interface, and provide a do-nothing default implementation of visitNull in the abstract base class from which all visitors are derived.
  3. Use different classes for Null Objects that mark the edge of the graph from those used to represent "draw nothing".

This experience taught me that when changing code that uses the Null Object pattern, one must beware of modifying the system from using Null Objects as mere implementation details to using Null Object classes to represent domain concepts. This goes against the intent of the Null Object pattern and will cause headaches at a later date.

Posted on December 12, 2003 [ Permalink | Comments ]

Confounded by configuration

Sweet love, I see, changing his property, Turns to the sourest and most deadly hate. — William Shakespeare.

I recently wrote a simulation game as a Java applet for the website of a well known motor racing team. The configuration of the game — track, car performance, local weather and even the GUI layout and branding — was read from property files when the game started up.

The way I designed it, any object that needs to be configured implements the Configurable interface:

public interface Configurable {
    public void configure( PropertySet properties )
        throws ConfigureException;
}

Configurable objects construct themselves into some neutral, initial state which is then overridden by the values read from the property set (a set of name-value pairs) passed to the configure method.

The init method of the applet loads the properties, creates the objects it needs and configures them (or catches and records errors). Then the start method checks that everything is properly configured and ready to go and kicks off the animation thread.

A nice, simple mechanism? No. Another sodding mistake!

This architecture makes it much too hard to write unit tests. Any configurable object that needs to be used in a test must be configured, but that can only be done indirectly by creating a property set for it. There are several ways of creating property sets, but all of them have drawbacks. I can build property sets explicitly in the set-up phase of each test, but that makes the tests very verbose. I could load property sets from files, but that splits test suites into separate code and data files, which makes editing tests a real pain. Loading a property set from a string constant has the same drawbacks as building one explicitly, except with worse syntax and more throws clauses. The problem is only compounded by composite configurable objects that must create and configure their sub-objects as directed by their own configuration.

As a workaround, I could expose the configurable properties of the objects as getters and setters, but methods that are only used for testing make me feel uncomfortable; they are a definite sign of a bad design.

The problem is with the architecture itself. The mistake was to hide configuration within the objects being configured. I should have pulled it out into one or more objects that interpret configuration data by creating objects with the appropriate state. I would no longer need the Configurable interface, because configurable properties would be passed to the objects' constructors. Tests could use the same constructors to create objects. The configuration interpreter would be easy to test by using a factory to create objects and using a mock factory object in tests. The mock factory would also make the configuration of composite objects trivial to test: creating a composite object would involve asking the factory to create a sub-object multiple times.
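The shape of that design looks something like the following sketch. The Car and GameFactory names are illustrative, not from the actual game:

```java
import java.util.Map;

// Objects take their configuration through constructors, so tests can
// construct them directly with no property sets involved.
class Car {
    final double topSpeed;
    Car(double topSpeed) { this.topSpeed = topSpeed; }
}

// A factory interface that the configuration interpreter calls;
// tests can substitute a mock implementation.
interface GameFactory {
    Car createCar(double topSpeed);
}

// A separate object interprets raw configuration data by asking the
// factory to create objects with the appropriate state.
class ConfigurationInterpreter {
    private final GameFactory factory;

    ConfigurationInterpreter(GameFactory factory) {
        this.factory = factory;
    }

    Car interpretCar(Map<String, String> properties) {
        return factory.createCar(
            Double.parseDouble(properties.get("car.topSpeed")));
    }
}
```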

This is basically the GoF Builder pattern.

Oh well, I learn from my mistakes. Thank goodness.

Posted on November 17, 2003 [ Permalink | Comments ]

Buried under a sugar coating


Tim Mackinnon and I recently did a pair programming session to redesign the guts of the Java Dynamic Mock Objects implementation. One outcome of the session was a lot of methods that provided convenient "syntactic sugar" for setting up common expectations. Those sugar methods just delegated to lower level methods that let you define any kind of expectation in a long winded but extensible manner.

Unfortunately we forgot to make the lower level methods public. The flexible, extensible core was inaccessible to users of the API, who soon started asking awkward questions on the mailing list.

Memo to self: syntactic sugar is layered above an API as a sweetener. Implement the powerful, low level API first. Any kind of sugary syntax can be implemented afterwards.
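In code terms, the layering looks something like this sketch, with method names invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class Mock {
    private final List<String> expectations = new ArrayList<String>();

    // The flexible, long-winded core: must be public, so that users can
    // go beyond whatever sugar the library happens to provide.
    public void addExpectation(String methodName, Object result) {
        expectations.add(methodName + " returns " + result);
    }

    // Sugar layered on top: pure convenience for a common case,
    // delegating to the core API.
    public void expectAndReturn(String methodName, Object result) {
        addExpectation(methodName, result);
    }

    public int expectationCount() { return expectations.size(); }
}
```

Our mistake was, in effect, making addExpectation package-private: the sugar still worked, but users hit a wall the moment they needed anything the sugar didn't cover.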

Posted on November 15, 2003 [ Permalink | Comments ]