Net Objectives


Wednesday, August 8, 2012

Testing the Chain of Responsibility, Part 2


Chain Composition Behaviors

We always design services for multiple clients.  Even if a service (like the Processor service in our example) has only a single client today, we want to allow for multiple clients in the future.  In fact, we want to promote this; any effort expended to create a service will return increasing value when multiple clients end up using it.

So, one thing we definitely want to do is to limit/reduce the coupling from the clients’ point of view. The run-time view of the CoR from the client’s point of view should be extremely limited:

Note that the reality, on the right, is hidden from the client, on the left.  This means we can add more processors, remove existing ones, change the order of them, change the rules of the termination of the chain, change how any/all of the rules are implemented... and when we do, this requires no maintenance on the clients.  This is especially important if there are (or will be, or may be) clients that we don’t even control.  Maybe they live in code belonging to someone else.
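To make the hidden structure concrete, here is a minimal sketch of such a chain. The class names follow this series, but the base-class shape and the specific value rules (double small values, halve large ones, treat everything else as invalid) are our assumptions for illustration, rendered in Java rather than the C# used in this post:

```java
// Hedged sketch of a Chain of Responsibility: each link either elects to
// handle the value or delegates to the next link. Rules are illustrative.
abstract class Processor {
    private final Processor next;

    protected Processor(Processor next) {
        this.next = next;
    }

    // Template method: clients call only this entry point.
    public int process(int value) {
        if (shouldProcess(value)) {
            return processThis(value);
        }
        return next.process(value); // delegate to the next link
    }

    protected abstract boolean shouldProcess(int value);
    protected abstract int processThis(int value);
}

class LargeValueProcessor extends Processor {
    LargeValueProcessor(Processor next) { super(next); }
    protected boolean shouldProcess(int value) { return value > 10000 && value <= 20000; }
    protected int processThis(int value) { return value / 2; }
}

class SmallValueProcessor extends Processor {
    SmallValueProcessor(Processor next) { super(next); }
    protected boolean shouldProcess(int value) { return value >= 1 && value <= 10000; }
    protected int processThis(int value) { return value * 2; }
}

class TerminalProcessor extends Processor {
    TerminalProcessor() { super(null); }
    protected boolean shouldProcess(int value) { return true; } // always elects: end of chain
    protected int processThis(int value) { return 0; }          // placeholder for invalid input
}

public class ChainDemo {
    public static void main(String[] args) {
        Processor chain = new LargeValueProcessor(
            new SmallValueProcessor(
                new TerminalProcessor()));
        System.out.println(chain.process(20000)); // 10000 (halved)
        System.out.println(chain.process(5));     // 10 (doubled)
        System.out.println(chain.process(0));     // 0 (invalid input)
    }
}
```

A client holds only the head `Processor` reference; everything to the right of that reference can change freely.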

The one place where reality cannot be concealed is wherever the chain objects are instantiated.  The concrete types, the fact that this is a linked list, and the current order of the list will be revealed to the entity that creates the service.   If this is done in the client objects, then they all will have this information (it will be redundant).  Also, there is no guarantee that any given client will build the service correctly; there is no enforcement of the rules of its construction.  

This obviously leads us to prefer another option.  We may, for example, decide to move all creation issues into a separate factory object.

It may initially seem that by doing so we’re just moving the problem elsewhere, essentially sweeping it under the rug. The advantage comes from the fact that factory objects, unlike clients,  do not tend to increase in number.  So, at least we’ve limited our maintenance to one place.  Also, if factories are only factories then we are not intermixing client behavior and construction behavior.  This results in simpler code in the factories, which tends to be easier to maintain.  Finally, if all clients use the factory to create the service, then we know (if the factory works properly) that the service is always built correctly.

We call this the separation of use from creation, and it turns out to be a pretty important thing to focus on.  Here, this would lead us to create a ProcessorFactory that all clients can use to obtain the service, and then use it blindly.  Initially, this might seem like a very simple thing to do:

public class ProcessorFactory {
    public Processor GetProcessor() {
        return new LargeValueProcessor(
            new SmallValueProcessor(
                new TerminalProcessor()));
    }
}

Pretty darned simple.  From the clients’ perspective, the issue to specify in a test is also very straightforward: I get the right type from the factory:

public class ProcessorFactoryTest {
    public void TestFactoryReturnsProperType() {
        Processor processor =
            new ProcessorFactory().GetProcessor();
        Assert.IsTrue(processor is Processor);
    }
}

This test represents the requirement from the point of view of any client object.  Conceptually it tells the tale, though in a strongly-typed language we might not want to actually write it: the compiler already enforces this, so the test could never fail if it compiles.  Your mileage may vary.

However, there is another perspective, with different requirements that must also be specified.  In TDD, we need to specify in tests:

  1. Which processors are included in the chain (how many and their types)
  2. The order that they are placed into the chain (sometimes)  [4]

Now that the rules of construction are in one place (which is good) this also means that we must specify that it works as it should, given that all clients will now depend on this correctness.

However, when we try to specify the chain composition in this way we run into a challenge:  since we have strongly encapsulated all the details, we have also hidden them from the test.  We often encounter this in TDD; encapsulation, which is good, gets in the way of specification through tests.

Here is another use for mocks.  However, in this case we are going to use them not simply to break dependencies but rather to “spy” on the internal aspects of an otherwise well-encapsulated design. Knowing how to do this yields a huge advantage: it allows us to enjoy the benefits of strong encapsulation without giving up the equally important benefits of a completely automated specification and test suite.

This can seem a little tricky at first so we’ll go slow here, step by step.  Once you get the idea, however, it’s actually quite straightforward and a great thing to know how to do.

Step 1: Create internal separation in the factory

Let’s refactor the factory just a little bit.  We’re going to pull each object creation statement (new X()) into its own helper method.  This is very simple, and in fact most modern IDEs will do it for you: highlight the code, then right-click > Refactor > Extract Method.

public class ProcessorFactory {
    public Processor GetProcessor() {
        return MakeFirstProcessor(
            MakeSecondProcessor(
                MakeLastProcessor()));
    }

    protected virtual Processor MakeFirstProcessor(
            Processor aProcessor) {
        return new LargeValueProcessor(aProcessor);
    }

    protected virtual Processor MakeSecondProcessor(
            Processor aProcessor) {
        return new SmallValueProcessor(aProcessor);
    }

    protected virtual Processor MakeLastProcessor() {
        return new TerminalProcessor();
    }
}
Note that these helper methods would almost certainly be made private by an automated refactoring tool.  We’ll have to change them to protected virtual (or just protected in a language like Java, where methods are virtual by default) for our purposes.  You’ll see why.

Step 2: Subclass the factory to return mocks from the helper methods

This is another example of the endo testing technique we examined in our section on dependency injection:

private class TestableProcessorFactory : ProcessorFactory {
    protected override Processor MakeFirstProcessor(
            Processor aProcessor) {
        return new LoggingMockProcessor(
            typeof(LargeValueProcessor), aProcessor);
    }

    protected override Processor MakeSecondProcessor(
            Processor aProcessor) {
        return new LoggingMockProcessor(
            typeof(SmallValueProcessor), aProcessor);
    }

    protected override Processor MakeLastProcessor() {
        LoggingMockProcessor mock = new LoggingMockProcessor(
            typeof(TerminalProcessor), null);
        mock.iElect = true;
        return mock;
    }
}

This would almost certainly be a private inner class of the test.  If you look closely you’ll see three important details.  

  • Each helper method is returning an instance of the same type (which we’ll implement next),  LoggingMockProcessor, but in each case the mock is given a different type to specify in its constructor [5]
  • The presence of the aProcessor parameter  in each method specifies the chaining behavior of the factory (which is what we will observe behaviorally through the mocks)  
  • The MakeLastProcessor() conditions the mock to elect.  As you’ll see, these mocks do not elect by default (causing the entire chain to be traversed) but the last one must, to specify the end of delegation

Step 3: Create a logging mock object and a log object to track the chain from within

Here is the code for the mock:

private class LoggingMockProcessor : Processor {
    private readonly Type mytype;
    public static readonly Log log = new Log();
    public bool iElect = false;

    public LoggingMockProcessor(Type processorType,
            Processor nextProcessor) : base(nextProcessor) {
        mytype = processorType;
    }

    protected override bool ShouldProcess(int value) {
        log.Add(mytype);
        return iElect;
    }

    protected override int ProcessThis(int value) {
        return 0;
    }
}

The key behavior here is the implementation of ShouldProcess() to add a reference of the actual type this mock represents to a logging object.  This is the critical part -- when the chain of mocks is asked to process, each mock will record that it was reached, the type it represents, and we can also capture the order in which they are reached if we care about that.

The implementation of  ProcessThis() is trivial because we are only interested in the chain’s composition, not its behavior.  We’ve already fully specified the behaviors in previous tests, and each test should be as unique as possible.  

Also note that this mock, as it is only needed here, should be a private inner class of the test.  Because the two issues, inclusion and sequence, are part of the same behavior (creation), everything will be specified in a single test.

The Log, also a private inner class of the test, looks something like this:

private class Log {
    private List<Type> myList = new List<Type>();

    public void Reset() {
        myList = new List<Type>();
    }

    public void Add(Type t) {
        myList.Add(t);
    }

    public void AssertSize(int expectedSize) {
        Assert.AreEqual(expectedSize, myList.Count);
    }

    public void AssertAtPosition(Type expected, int position) {
        Assert.AreEqual(expected, myList[position]);
    }
}

It’s just a simple encapsulated list, but note that it contains two custom assertions.  This is preferred because it allows us to keep our test focused on the issues it is specifying, and not on the details of “how we know”.  It makes the specification more readable, and easier to change.  

(A detail: The log is “resettable” because it is held statically by the mock.  This is done to make it easy for all the mock instances to write to the same log that the test will subsequently read.  There are other ways to do this, of course, but this way involves the least infrastructure.  Since the log and the mock are private inner classes of the test, this static member represents very little danger of unintended coupling.)
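The static-log arrangement can be sketched in a few lines (Java here, with hypothetical names): every spy instance writes to the one shared log, and the test resets it in its setup and reads it in its verification.

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of a statically-held, resettable log shared by all spies.
class SpyLog {
    private static List<String> entries = new ArrayList<>();

    static void reset() { entries = new ArrayList<>(); }
    static void add(String entry) { entries.add(entry); }
    static List<String> entries() { return entries; }
}

class SpyA { void touch() { SpyLog.add("A"); } }
class SpyB { void touch() { SpyLog.add("B"); } }

public class SpyLogDemo {
    public static void main(String[] args) {
        SpyLog.reset();          // a clean log for each test
        new SpyA().touch();      // separate instances...
        new SpyB().touch();      // ...all write to the same log
        System.out.println(SpyLog.entries()); // [A, B]
    }
}
```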

Step 4: Use the “spying” capability of the mock in a specification of the chain composition

Let’s look at the test itself:

public void TestFactoryReturnsProperChainOfProcessors() {
    // Setup
    ProcessorFactory factory = new TestableProcessorFactory();
    const int correctChainLength = 3;
    List<Type> correctCollection =
        new List<Type> {
            typeof (LargeValueProcessor),
            typeof (SmallValueProcessor),
            typeof (TerminalProcessor)
        };
    Processor processorChain = factory.GetProcessor();
    Log myLog = LoggingMockProcessor.log;
    myLog.Reset();

    // Trigger
    processorChain.Process(0);

    // Verification
    myLog.AssertSize(correctChainLength);
    for (int i = 0; i < correctCollection.Count; i++) {
        myLog.AssertAtPosition(correctCollection[i], i);
    }
}

If the order of the processors was not important, we would simply change the way the log reports their inclusion:

// In Log
public void AssertContains(Type expected) {
    Assert.IsTrue(myList.Contains(expected));
}

...and call this from the test instead.

// In TestFactoryReturnsProperChainOfProcessors()
for (int i = 0; i < correctCollection.Count; i++) {
    myLog.AssertContains(correctCollection[i]);
}

Some testing frameworks actually provide special Asserts for collections like this.
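As a sketch of what such collection assertions check, here are the two flavors in plain Java (the helper names are ours; frameworks package the same idea, e.g. NUnit's CollectionAssert, JUnit's assertEquals on lists, or Hamcrest's containsInAnyOrder):

```java
import java.util.Arrays;
import java.util.List;

// Hedged sketch of order-sensitive vs order-insensitive collection checks.
public class CollectionAssertDemo {

    static void assertOrderedEquals(List<String> expected, List<String> actual) {
        // Order-sensitive: element-by-element equality.
        if (!expected.equals(actual))
            throw new AssertionError("expected " + expected + " but was " + actual);
    }

    static void assertSameMembers(List<String> expected, List<String> actual) {
        // Order-insensitive: same size, same membership.
        if (expected.size() != actual.size() || !actual.containsAll(expected))
            throw new AssertionError("expected members " + expected + " but was " + actual);
    }

    public static void main(String[] args) {
        List<String> logged = Arrays.asList("Large", "Small", "Terminal");
        assertOrderedEquals(Arrays.asList("Large", "Small", "Terminal"), logged);
        assertSameMembers(Arrays.asList("Terminal", "Large", "Small"), logged);
        System.out.println("both checks passed");
    }
}
```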


OK, we know what some of you are thinking.  “Guys, this is the code you’re testing:”

public Processor GetProcessor() {
    return MakeFirstProcessor(
        MakeSecondProcessor(
            MakeLastProcessor()));
}

“...and look at all the *stuff* you’ve created to do so!  Your test is several times the size of the thing you’re testing!   Arrrrrrrrrgh!”

This is a completely understandable objection, and one we’ve felt in the past.  But to begin with remember that in our view this is not a test, it is a specification.  It’s not that unusual for specifications to be longer than the code they specify.  Sometimes it’s the other way around.  It just depends on the nature of the specification and the implementation involved.

The specification of the way the space shuttle opened the cargo bay doors was probably a book. The computer code that opened it was likely much shorter.

Also, this is a reflection of the relative value of each thing.  Recently, a friend who runs a large development team got a call in the middle of the night, warning him of a major failure in their server farm involving both development and test servers.  He knew all was well since they have offsite backups, but as he was driving into work in the wee hours he had time to ask himself “if I lost something here... would I rather lose our product code, or our tests?”
He realized he would rather lose the product code.  Re-creating the source from the tests seemed like a lot less work than the opposite (that would certainly be true here).  But what that really means is that the test/specifications actually have more irreplaceable value than the product code does.

In TDD, the tests are part of the project.  We create and maintain them just like we do the product code.  Everything we do must produce value... and that’s the point, not whether one part of the system is larger than another.  And while TDD style tests do certainly take time and effort to write, remember that they have persistent value because they can be automatically verified later.

Finally, ask yourself what you would do here if the system needed to be changed, say, to support small, medium, and large values?  We would test-drive the new MediumValueProcessor, and then change TestFactoryReturnsProperChainOfProcessors() and watch it fail.  We’d then update the factory, and watch the failing test go green. We’d also have automatic confirmation that all other tests remained green throughout.
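To make that red-bar step concrete, here is a tiny Java sketch (MediumValueProcessor is hypothetical, and the chain composition is modeled as a plain list of type names rather than real objects):

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the change workflow: the expectation is updated first, so the
// unchanged factory fails it until the new link is actually added.
public class UpdatedExpectationDemo {
    public static void main(String[] args) {
        // The test changes first: the new link is now expected.
        List<String> correctCollection = Arrays.asList(
            "LargeValueProcessor",
            "MediumValueProcessor",   // the new, not-yet-implemented link
            "SmallValueProcessor",
            "TerminalProcessor");

        // What the unchanged factory still builds:
        List<String> builtByOldFactory = Arrays.asList(
            "LargeValueProcessor",
            "SmallValueProcessor",
            "TerminalProcessor");

        // Red bar: the specification fails until the factory is updated.
        System.out.println(correctCollection.equals(builtByOldFactory)); // false
    }
}
```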

That’s an awfully nice way to change a system.  We know exactly what to do, and we have concrete confirmation that we did exactly and only that.  Such confidence is hard to get in our business!



[4] Some CoRs require their chain elements to be in a specific order.  Some do not.  For example, we would not want the TerminalProcessor to be anywhere but at the end of the chain.  So, while we may not always care about/need to specify this issue, it’s important to know how to do it.  So we’ll assume here that, for whatever domain reason, LargeValueProcessor must be first, SmallValueProcessor must be second, and TerminalProcessor must be third.

[5] We’re using the class objects of the actual types.  You could use anything unique: strings with the classnames, an enumeration, even just constant values.  We like the class objects because we already have them.  Less work!


  1. Testing for a particular implementation ties the test to the implementation. Can you show the behavior-based tests? Those tests would demonstrate how the tests suggested that the implementation be designed using a chain of responsibility.

  2. We're preparing a future blog where we will show the CoR being driven by behavioral tests. However, please note that "behavior" is relative, as "client" is relative. To the part of the system that calls the factory to get the chain, the behavior is getting the correct chain. So, we are specifying the behavior at that level.

    1. I look forward to seeing how the CoR was chosen, since this factory only exists in that context.

  3. Hi, guys, just wanted to share a quick feedback. I tried two alternative ways that came to my mind and would like to share them along with some thoughts. I'd be very happy to hear your comments on pros, cons etc.

    First of all, seeing some custom code in the solution you provide, I thought it would be an interesting experiment to go the other way and actually use a mocking framework, looking at where it would lead me (which could be a valuable feedback in the manual vs dynamic mocks topic that was also discussed on the blog).

    -- Solution 1: use assembly through methods rather than constructors and delegating construction of parts to a separate object:

    public void ApproachWithFactoryOfParts()
    var partsFactory = Substitute.For<PartsFactory>();
    var chainAssembly = new ValueProcessingChainAssembly(partsFactory);

    var largeValueProcessor = Substitute.For<Processor>();
    var smallValueProcessor = Substitute.For<Processor>();
    var terminalProcessor = Substitute.For<Processor>();


    var createdChain = chainAssembly.Perform();

    Assert.AreEqual(largeValueProcessor, createdChain);

    And my thoughts:
    - this spec includes both assert and mock verifications. Isn't it a sign that it's a mix of functional spec and workflow spec (and such mixes are not considered healthy)? Comparing it with your solution, I think it's the price I paid for removing the need to actually invoke Process() on the chain to check the order.
    - Thinking about composition in terms of plain methods instead of constructors is more mock framework friendly; however, it leaves us with an additional "scenario" - a processor without its next step set. This could be solved in a few ways, each having its pros and cons, like using this unset element instead of the terminal step or making the factories supply each step with the terminal step as its next step.
    - It looks like this suits only the ordered chain case.

    -- Solution 2: use the factory only to specify the sequence of steps, leaving assembly to another object (a builder):

    public void ApproachWithBuilder()
    var builder = Substitute.For<ValueProcessorChainBuilder>();
    var chainAssembly = new ValueProcessorChainAssembly(builder);
    var anyProcessor = Any.InstanceOf<Processor>();


    var createdChain = chainAssembly.Perform();

    Assert.AreEqual(anyProcessor, createdChain);

    And my thoughts:
    - although it looks pretty nice, it accomplishes only part of what we aim for - the order/existence of steps in chain. The actual assembly responsibility is moved onto the builder and has to be specified with additional tests, which is more work. I think specifying the builder can be done in a smart way, since all the builder methods would perform similar logic, with the major difference being the type added to chain (and potentially constructor arguments that would make the case a little bit more difficult)
    - this approach can be a bit difficult using manual mocks, since it uses a method chain (remember I was complaining about mocking frameworks having the ability to mock method chains? This is one of the situations where I think this feature can be of some use)
    - this approach could be potentially changed to allow unordered chains to be specified - we'd just have to make the builder non-chainable and change the spec to verify each call separately.

    That's it, I hope it can be of any value. I think I'll try and dig into the fluent builder solution later to see how builder can be specified and check whether this is a dead alley or something ever worth considering.

    Thanks for a great post!

  4. I'm going to defer to Amir on this one, since he is preparing the blog on auto-vs-manual mocks.

  5. Thanks for writing this blog about TDD. I have found it entertaining and useful. It has opened my eyes to a new way of thinking about TDD. I have a new perspective and feel it will have a positive impact on my work. I had not thought of TDD tests as specification tests in the past.

    I do need to register some objections though. In particular, I think this blog entry has major problems. To start with the simpler objections, I'll begin by describing what I infer to be the specification for this project. When working with my clients, they would have written a specification similar to the following: "The system shall double values in the range of 1 to 10000 and halve values in the range 10001 to 20000. Other input is not valid." Let's call this the client specification. In general, the client specification may make no mention of the invalid inputs. It is often implied and up to the developer to specify that and confirm with the client.

    You'll notice that nowhere did the client specify that a chain of responsibility pattern should be used. Choosing that pattern was the developer's decision. Sure, it is a fine decision, but it should be considered part of the developer's specification. Also, nowhere did the client specify the order of the rules. In fact, the way in which the system is implemented there is no reason for the "large-value" rule to be ordered either before or after the "small-value" rule.

    Up until now, all of the specification tests have been unit tests which have validated either some aspect of the client specification or they have validated some behavioral aspect of the CoR pattern which the developer plans to rely on. So far, so good. Now however, the specification tests are testing aspects of the system which have not and need not be specified. I smell over-specification. There is no reason to specify that the "large-value" processor is first. It could be second without impacting the client specification. The fact that it is first is an unimportant implementation detail which the developer happened to choose. There is no reason to specify it further and codify it with a specification or unit test. Over time, over-specification and extra tests will result in making the system harder to maintain and modify. Such over-specifications are especially pernicious because the effort that went into writing them implies that they are important. If changes can be made to the implementation with no observable impacts to clients of the system, tests and specifications which cement that implementation into the system are bad. Maintenance programmers now have a much harder job because they need to discover the long lost intent behind these tests and decide if removing or changing them really does make the system better without changing important behavior detectable to their clients.

    My comment is too long for blogger, so I need to break it up.

  6. The second smell is harder to describe. I'm hoping you can give it a good name, I'll name it "inappropriate or indirect coupling." The invalid input processor must come last. The reason it must come last is because it fails to validate its input. It is unlike the first two processors in that it processes all input. Its implementation of ShouldProcess is not like the others. It is inconsistent and different. This inconsistency is the reason why it must come last. So its position in the chain is coupled to how it implements the ShouldProcess method. It's named the TerminalProcessor but its responsibility in the system is to identify invalid input. The client specification does not provide an order to the rules of the system. So there is a coupling here to the order of the rules and the implementation of ShouldProcess for this processor. Often, the position in the chain is important and a coupling of position to ShouldProcess is fine. In this example, I find it odd and I feel that this processor is implemented and named incorrectly. It behaves correctly and so it just needs a refactoring.

    The third smell results from the second smell. Perhaps there is a good reason for that implementation choice. However, there is no explanation for it. If that choice was not arbitrary but instead purposeful, it should be explained via a comment or some other mechanism.

    Many of my colleagues have philosophical objections to unit tests. They feel that integration tests add real value whereas unit tests add mostly costs to developers without adding much value. Up until this blog post, I feel I could have a reasonable discussion with them and defend the tests written so far. I'm confident I would not be able to convince them to change their position. I know that's unfortunate, but their feelings are strong and their worries about extra work are large. With the prior tests, they would respect my arguments and the tests as I think I can defend the "extra" work created here. However, these tests would get me laughed out of the room. I think the smell is one of "circular reasoning" although I find it hard to put my finger on naming the objection exactly. If the argument for these tests is that they prove that the ordering is as-specified, I'm stuck. For one, as you point out, the code itself is much clearer on that point. For another, the rule orders weren't exactly specified anywhere. Finally, if I got the rule orders incorrect, these tests wouldn't help me discover that I had failed to implement the client specification. Suppose I had accidentally put the TerminalProcessor first in both the test and the product code? No test would fail and yet the system would not work.

  7. It is at this point where I feel TDD calls for an integration test and not a unit test. Up until now, the system was not constructed well enough to implement a specification test validating the client specification. Up until now, we were building the pieces. Now is the point where we can validate that the client specification is correct. At this point, I would implement a test which treats the factory as a black box. Now is the time to test that the system as constructed by the factory behaves as expected. Such a set of tests would not validate the order of the "large-value" rule relative to the "small-value" rule. Nor would it validate the order of the rules at all. I feel that no explicit ordering tests should exist with this example. Second, since there would be an integration test for each of the cases, tests would fail if the TerminalProcessor had been placed in the wrong order. There would not be a reliance on an artificial ordering test to find that. Integration tests would eliminate the over-specification smell and the circular reasoning smell. The second and third smells from above are "developer" smells that tests cannot catch and refactorings could fix. Without over-specification, the refactorings would be easier both because there wouldn't be obsolete tests and because we wouldn't have to confirm that those tests weren't really needed in the first place.

    The final set of smells apply to the blog series itself. I'm hoping you intend to discuss the importance of TDD and integration tests. I'm hoping you'll address the importance of working with your clients when developing TDD tests. This entry makes me worry a bit. Some of the best value I have gotten from TDD is when writing integration TDD tests in English first. I then review those tests with my clients. Sometimes my clients have corrected me right then and there and saved me all kinds of trouble. Let's take two examples. In both examples, I was working with a system which already existed and had a good set of specification tests written as integration tests. In one case, my client specified a rule change. I went back to my desk and worked out the changes to the integration tests. I didn't implement the test changes. Instead, I noticed that the rule change resulted in many side effects and a large number of test updates and new tests. Based on my understanding of the client's intention, I thought that none of the side effects were bad but they sure were creating a lot of extra work. After reviewing these with my client, we agreed on a different set of changes to the specification which minimized the side effects and made my client happier. He hadn't foreseen the side-effects either. If I had simply gone back to my desk and written some tests that validated what I had been given, I would have missed out on the value of TDD.

  8. In the second example, my client had specified out the rules pretty clearly and the client specification even included a flow chart of the rules. The ordering of the rules certainly was important. However, the primary concern was the impact on the user scenarios. We had jointly developed the rules in close collaboration to achieve what we hoped was the ideal impact for the user. The rules were heuristic by nature and had subtle effects on the overall result when integrated into the system as a whole. Later when users discovered some problems with specific cases, we knew we needed modifications to the rules. In this case, we modified the rules and the product code first. It was too difficult to work out the impact to the tests due to the complexity of how the rules interacted as a whole. We then jointly examined the results to determine what cases were broken which shouldn't have been. We kept tweaking the rules until we got the result we wanted. Sure we wrote new tests for the new behavior but it was even more important to be able to understand which of the old behavior needed to change to accommodate the new behavior. In this case, the tests were great at helping us identify which rule changes which looked good in isolation didn't look so good in the context of the system as a whole. In the simple case of your example here, it would be similar if the client later specified that a subset of the small-value range should be multiplied by four but what he really meant was four times the current result. Only after showing him what I thought the test cases would be would we discover the error. Naturally, your example is much simpler by design.

    In my view, the tests written in this example are not well justified. The focus on unit testing, failure to test the intention of the client specification and the over-specification of the system all result in the circular smell. The circular smell is indicative of a number of problems with this example. I think you identified the circular smell yourself when you acknowledged that there was an awful lot of test for such a small amount of code. If you used this example to switch from TDD unit testing to TDD integration testing, you would still achieve 100% code coverage, you would avoid the circular smell, and you would be able to transition to a discussion of how TDD positively impacts the developer and client relationship. While I am a big fan of TDD, unit tests and the other articles in this series, this article damages the case for unit tests and provides fodder for my unit-testing skeptical colleagues to continue to argue against unit testing. In practice, I would write integration tests here and not unit tests. I would still use TDD.

    PS. Finding this blog from the net objectives site was difficult for me. I knew it was there because I happened to see the announcement for the blog last year. I apologize for providing this feedback so long after you wrote your blog entry. I didn't read it until yesterday. Also, I'd rather not be so critical as I do really like the prior blog entries and I'm generally a big fan of net objectives training materials. I hope you find the criticism useful and pragmatic.

  9. I guess I would have gone a slightly different way. I would have had a factory that was solely responsible for building the objects (you did that). Presumably, that factory would need to produce at least one terminal processor with nothing that it decorates. I would condition my mock factory to produce a mock processor for the terminal. For the next thing that should be built, I would condition my mock factory to take the terminal in and return a new mock processor. I would have repeated that process until I had specified the entire chain. When I was done, I would invoke a builder against the factory and assert that I got back the item at the head of the chain.

    // setup
    var factory = new Mock();
    var mockTerminal = new Mock();
    var mock1 = new Mock();
    var mock2 = new Mock();
    var mockHead = new Mock();
    factory.Setup(f => f.GetFailProcessor()).Returns(mockTerminal.Object);
    factory.Setup(f => f.GetSpecialProcessor(mockTerminal.Object)).Returns(mock1.Object);
    factory.Setup(f => f.GetSpecialProcessor(mock1.Object)).Returns(mock2.Object);
    factory.Setup(f => f.GetSpecialProcessor(mock2.Object)).Returns(mockHead.Object);
    var builder = Builder.GetInstance(factory.Object);

    // trigger
    var actual = builder.GetProcessor();

    // assertion
    Assert.That(actual, Is.SameAs(mockHead.Object));

    Note that this has the property of coupling the test more tightly to what is specified (your chain is built with this particular order) and less tightly to the design of what it is not testing (the interface of Processor).

    1. It swallowed all my angle-bracket-enclosed type names but I'm sure you get the gist.