Net Objectives


Wednesday, November 4, 2015

Structure of Tests-As-Specifications

A big part of our thesis is that TDD is not really a testing activity, but rather a specifying activity that generates tests as a very useful side effect.  For TDD to be a sustainable process, it is important to understand the various implications of this distinction. [1]

Here, we will discuss the way our tests are structured when we seek to use them as the functional specification of the system.

A question we hear frequently is "how does TDD relate to BDD?"  BDD is "Behavior-Driven Development", a term coined by Dan North and Chris Matts in their 2006 article "Introducing BDD" [2].  Many have drawn various distinctions between TDD, ATDD, and BDD, but we consider these distinctions largely unimportant.  To us, TDD is BDD, except that we conduct the activity at a level very close to the code, and automation is much more critical. Also, we contend that “development” includes analysis and design, and thus what TDD enables is more accurately described as “behavior-based analysis and design”, or BBAD.

In BBAD, the general idea is that the "unit" of software that is being specified is a behavior.  Software is behavior, after all.  Software is not a noun, it is a verb.  Software’s value lies entirely in what it does: the value the user accrues as a result of its behavior.  In essence, software only exists in any meaningful sense of the word when it is up and running.  The job of a software development team is to take a general-purpose computer and cause it to act in specific, valuable ways.  We call these behaviors.

The nomenclature that North and Matts proposed for specifying each behavior of a system is this: Given-When-Then.  Here's a simple example:

Given:
     User U has a valid account on our system with Username UN and password PW
     The login username is set to UN and the login password is set to PW
When:
    Login is requested
Then:
    U is logged in

Everything that software does, every behavior, can be expressed in this fashion.  Each Given-When-Then expression is a specific scenario that is deemed to have business value, and that the team has taken upon itself to implement.

In TDD, when the scenario is interpreted as a test, we strive to make this scenario actionable.  So we think of these three parts of the scenario a little differently: we "verbify" them to convert these conditions into activities.

Imagine that you were a manual tester who was seeking to make sure the system was behaving correctly in terms of the scenario above.  You would not wait around until a user with a valid account happened to browse to the login page, enter his info, and click the "Login" button. You would create or identify an existing valid user and, as that person, browse to the page, enter the correct username and password, and then click the button yourself. Then you'd check to see if your login was successful.  You would do all of these things.

So the Given wasn't given, it was done by the tester (you, in this case); the When was not "when" but "now do"; and the Then was not a condition but rather an action: go and see if things are correct.

"Given" becomes "Setup".
"When" becomes "Trigger".
"Then" becomes "Verify".

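To make the verbs concrete, here is a rough sketch of the login scenario written as Setup, Trigger, and Verify. The AccountSystem class and its CreateAccount, RequestLogin, and IsLoggedIn methods are hypothetical, invented only so the scenario has something to run against; a real system would certainly look different.

```csharp
using System.Collections.Generic;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical system under test, invented purely for illustration.
public class AccountSystem
{
    private readonly Dictionary<string, string> passwordsByUsername =
        new Dictionary<string, string>();
    private readonly HashSet<string> loggedInUsernames =
        new HashSet<string>();

    public void CreateAccount(string username, string password)
    {
        passwordsByUsername[username] = password;
    }

    public void RequestLogin(string username, string password)
    {
        string storedPassword;
        if (passwordsByUsername.TryGetValue(username, out storedPassword)
            && storedPassword == password)
        {
            loggedInUsernames.Add(username);
        }
    }

    public bool IsLoggedIn(string username)
    {
        return loggedInUsernames.Contains(username);
    }
}

[TestClass]
public class LoginTests
{
    [TestMethod]
    public void TestLoginWithValidCredentials()
    {
        // Setup (the Given, done by the test rather than waited for)
        var accountSystem = new AccountSystem();
        var username = "UN";
        var password = "PW";
        accountSystem.CreateAccount(username, password);

        // Trigger (the When, requested now)
        accountSystem.RequestLogin(username, password);

        // Verify (the Then, actively checked)
        Assert.IsTrue(accountSystem.IsLoggedIn(username));
    }
}
```
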
We want to structure our tests in such a way that these three elements of the specification are clear and, as much as possible, separate from each other.  Typical programming languages can make this a bit challenging at times, but we can overcome these problems fairly easily.

For example, let's say we have a behavior that calculates the arithmetic mean of two real numbers, accurate to within 0.1. Most likely this will be a method call on some object that takes two values as parameters and returns the arithmetic mean of those values, accurate to within 0.1.

Let’s start with the Given-When-Then:

Given:
     Two real values R1 and R2
     Required accuracy A is 0.1
When:
     The arithmetic mean of R1 and R2 is requested
Then:
     The return is (R1+R2)/2, accurate to A

Let's look at a typical unit test for such a behavior:

(Code samples are in C# with MSTest as the testing framework)

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        Assert.AreEqual(5.5d,
                        MathUtils.GetInstance().
                        ArithmeticMean(7.0d, 4.0d), .1);
    }
}



This test is simple because the behavior is simple.  But this is really not great as a specification.

The Setup (creation of the MathUtils object, the creation of the example doubles 7.0d and 4.0d), the Trigger (the calling of the ArithmeticMean method with our two example doubles), and the Verify (comparing the method's return to the expectation, 5.5d, and establishing the precision as .1) are all expressed together in the assertion.  If we can separate them, we can make the specification easier to read and also make it clear that some of these particular values are not special, that they were just picked as convenient examples.

This is fairly straightforward, but easy to miss:

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {         
        // Setup
        var mathUtils = MathUtils.GetInstance();
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue)/2;

        // Trigger
        var actualMean = mathUtils.ArithmeticMean(anyFirstValue,
                                                  anySecondValue);

        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }
}


Here we have included comments to make it clear that the three different aspects of this behavioral specification are now separate and distinct from each other.   The "need" for comments always seems like a smell, doesn't it?  It means we can still make this better.

But we've also used variable names like "anyFirstValue" to indicate that the number we chose was not a significant value, creating more clarity about what is important here.  Note that tolerance and expectedMean were not named in this way, because their values are specific to the required behavior.

This, now, is using TDD to form a readable specification, which also happens to be executable as a test [2].  Obviously the value of this as a test is very high; we do not intend to trivialize this.  But we write them with a different mindset when we think of them as specifications and, as we'll see, this leads to many good things.

Looking at both code examples above, however, some of you may be thinking, "What is this GetInstance() stuff?  I would do this:"

        // Setup
        var mathUtils = new MathUtils();

Perhaps.  We have reasons for preferring our version, which we'll set aside for its own discussion.
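
For readers who want to compile the samples: MathUtils itself is never shown here, so the following is only one possible shape for it, assuming a private constructor and a static GetInstance() creation method. The parameter names and the method bodies are assumptions, not the actual production code.

```csharp
// Hypothetical production class, sketched only so the tests have
// something to call. Everything here is an assumption.
public class MathUtils
{
    // Private constructor: callers cannot write "new MathUtils()".
    private MathUtils() { }

    // Static creation method. Its body could later change (for example
    // to return a cached instance or a subclass) without touching callers.
    public static MathUtils GetInstance()
    {
        return new MathUtils();
    }

    public double ArithmeticMean(double first, double second)
    {
        return (first + second) / 2;
    }
}
```
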

But the interesting question is: what if you started creating the object one way (using “new”), and then later changed your mind and used a static GetInstance() method, or maybe even some factory pattern?  If, when that change was made, you had many test methods on this class doing it the "old" way, this would require the same change in all of them.

We can do it this way instead:

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var arithmeticMeanCalculator =
                           GetArithmeticMeanCalculator();
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger
        var actualMean = arithmeticMeanCalculator.
                         ArithmeticMean(anyFirstValue,
                                        anySecondValue);
        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }

    private MathUtils GetArithmeticMeanCalculator()
    {
        return MathUtils.GetInstance();
    }
}



Now, no matter how many test methods on this test class need to access this arithmetic mean behavior (for different scenarios), a change in how you access the behavior would only involve modifying the single "helper" method that provides the object for all of them.

Many testing frameworks have their own mechanisms for eliminating redundant object creation, usually in the form of a Setup() or Initialize() method, and these can be used. But we prefer the helper method because we then gain the ability to decouple the specification from the fact that the behavior we’re specifying happens to be implemented in a class called MathUtils.  We could also change this design detail and the impact would only be on the helper method. (The fact that C# supports implicit typing with var is a real plus here; you might be limited a bit in other languages.)
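
For comparison, the framework-provided mechanism might look roughly like this in MSTest, using its [TestInitialize] attribute. This is a sketch, not a recommendation; the MathUtils stand-in and the test class name MathTestsUsingInitialize are assumptions added so the example is complete.

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical stand-in for the production class.
public class MathUtils
{
    private MathUtils() { }
    public static MathUtils GetInstance() { return new MathUtils(); }
    public double ArithmeticMean(double first, double second)
    {
        return (first + second) / 2;
    }
}

[TestClass]
public class MathTestsUsingInitialize
{
    // A field cannot be declared with "var", so the concrete class
    // name has to be spelled out here.
    private MathUtils mathUtils;

    [TestInitialize]
    public void Initialize()
    {
        // MSTest runs this before each test method.
        mathUtils = MathUtils.GetInstance();
    }

    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger
        var actualMean = mathUtils.ArithmeticMean(anyFirstValue,
                                                  anySecondValue);

        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }
}
```

Object creation is still centralized, but every test method now reads a field whose declared type is the production class, whereas the helper-method version lets each test hold the object in a var local.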

But the spec is also not about the particular method you call to get the mean, just how the calculation works, behaviorally.  Certainly an ArithmeticMean() method is logical, but what if we decided to make it more flexible, allowing any number of parameters rather than just two?  The meaning of "arithmetic mean" would not change, but our spec would have to.  Which seems wrong.  So, we could take the idea a little bit farther:

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var arithmeticMeanCalculator = GetArithmeticMeanCalculator();
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger
        var actualMean = TriggerArithmeticMeanCalculator(
                         arithmeticMeanCalculator, 
                         anyFirstValue, anySecondValue);
        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }

    private double TriggerArithmeticMeanCalculator(MathUtils mathUtils, 
                                                  double anyFirstValue, 
                                                  double anySecondValue)
    {
        return mathUtils.ArithmeticMean(anyFirstValue,
            anySecondValue);
    }

    private MathUtils GetArithmeticMeanCalculator()
    {
        return MathUtils.GetInstance();
    }
}


Now if we change the ArithmeticMean() method to take a container rather than discrete parameters, or make some similar change, we only have to change this private helper method, not all the various specification-tests that show the behavior with more parameters and so on.
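
As a sketch of that kind of change, suppose ArithmeticMean() were redesigned to accept a sequence of values. The IEnumerable<double> signature and the MathUtils body below are assumptions made for illustration; the point is that the test method is untouched and only the trigger helper is edited.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical redesign: the method now takes a sequence of values.
public class MathUtils
{
    private MathUtils() { }
    public static MathUtils GetInstance() { return new MathUtils(); }
    public double ArithmeticMean(IEnumerable<double> values)
    {
        return values.Average();
    }
}

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var arithmeticMeanCalculator = GetArithmeticMeanCalculator();
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger
        var actualMean = TriggerArithmeticMeanCalculator(
                         arithmeticMeanCalculator,
                         anyFirstValue, anySecondValue);
        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }

    // The only method that changed: it adapts the two discrete
    // values to the new sequence-based signature.
    private double TriggerArithmeticMeanCalculator(MathUtils mathUtils,
                                                   double anyFirstValue,
                                                   double anySecondValue)
    {
        return mathUtils.ArithmeticMean(
            new[] { anyFirstValue, anySecondValue });
    }

    private MathUtils GetArithmeticMeanCalculator()
    {
        return MathUtils.GetInstance();
    }
}
```
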

The idea here is to separate the meaning of the specification from the way the production code is designed.  We talk about the specification being one thing, and the "binding" being another.  The specification should change only if the behavior changes.  The binding (these private helpers) should only change if the design of the system changes.

Another benefit here is clarity and readability.  Let's improve it a bit more:

[TestClass]
public class MathTests
{
    [TestMethod] 
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;

        // Trigger
        var actualMean = TriggerArithmeticMeanCalculation(
                                             anyFirstValue,
                                             anySecondValue);
           
        // Verify
        var expectedMean = (anyFirstValue + anySecondValue) / 2;
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }

    private double TriggerArithmeticMeanCalculation(
                                double anyFirstValue, 
                                double anySecondValue)
    {
        var arithmeticMeanCalculator = GetArithmeticMeanCalculator();
        return arithmeticMeanCalculator.
                                ArithmeticMean(anyFirstValue, 
                                anySecondValue);
    }

    private MathUtils GetArithmeticMeanCalculator()
    {
        return MathUtils.GetInstance();
    }
}

We have moved the call to GetArithmeticMeanCalculator() into the Trigger, and expectedMean into the Verify step [3].  We also changed the notion of "trigger the calculator" to "trigger the calculation". Now, remember the original specification?

Given:
     Two real values R1 and R2
     Required accuracy A is 0.1
When:
     The Arithmetic Mean of R1 and R2 is requested
Then:
     The return is (R1+R2)/2, accurate to A

The unit test, which is our specification, very closely mirrors this Given-When-Then expression of the behavior. Do we really need the comments to make that clear?  Probably not.  We’ve created a unit test that is a true specification of the behavior without coupling it to the specifics of how the behavior is expressed by the system.

Can we take this even further?  Of course... but that's for another entry. :)

[1] It should be acknowledged that Max prefers to say "it is a test which also serves as a specification."  We'll probably beat him into submission :), but for the time being that's how he likes to think of it.  We welcome discussion, as always.

[2] Better Software Magazine, March 2006.

[3] It should also be acknowledged that we're currently discussing the relative merits of using Setup/Trigger/Verify in TDD rather than just sticking with Given/When/Then throughout. See Grzegorz Gałęzowski's very interesting comment below on this (and other things).