
Friday, January 29, 2016

TDD and the "6 Do's and 8 Skills" of Software Development: Pt. 1

This post is not about TDD per se, but rather a context in which TDD can demonstrate its place in and contribution to the value stream.  This context has to do with the 6 things that we must accomplish (do) and the 8 skills that the team must have in order to accomplish them.  We'll describe each "do", noting where and if TDD has an impact, and then do the same thing with the skills.

6 Dos:
  • Do the right thing
  • Do the thing right
  • Do it efficiently
  • Do it safely
  • Do it predictably
  • Do it sustainably

8 Skills:

  • Programming
  • Designing
  • Analysis
  • Refactoring
  • Testing
  • DevOps
  • Estimation
  • Process Improvement

 

Do the right thing


Everything the team does must be traceable back to business value.  This means “the right thing” is the thing that has been chosen by the business to be the next most important thing, in terms of business value, that we should work on.  TDD has no contribution to make to this.  Our assumption is that this decision has been made, and made correctly before we begin our work.  How the business makes this decision is out of scope for us, and if they make the wrong one we will certainly build the wrong thing.  This is an issue of product portfolio management and business prioritization, and we do not mean to minimize its importance; it is crucial.  But it’s not a TDD activity.  It is the responsibility of project/product management.

An analogy:

As a restaurant owner, the boss has determined that the next thing that should be added to the menu is strawberry cheesecake.  He made this decision based on customer surveys, or the success of his competitors at selling this particular dessert, or some other form of market research that tells him this added item will sell well and increase customer satisfaction ratings.  It will have great business value and, in his determination, is the most valuable thing to have the culinary staff work on.

Do the thing right


One major source of mistakes is misunderstanding.  Communication is an extremely tricky thing, and there can be extremely subtle differences in meaning with even the simplest of words.  “Clip” means to attach (clip one thing to another) and to remove (clipping coupons). 

A joke we like: My wife sent me to the store and said “please get a gallon of milk -- if they have eggs get six.”  So I came back with 6 gallons of milk.  When she asked why I did that, I replied “they had eggs.” 

The best way we know to ferret out the hidden assumptions, different uses of terms, different understanding, missing information, and the all-important “why” of a requirement (which is so often simply missing) is by engaging in a richly communicative collaboration involving developers, testers, and businesspeople.  The process of writing acceptance tests provides an excellent framework for this collaboration, and is the responsibility of everyone in the organization.

The analogy, continued:

You work as a chef in the restaurant, and the owner has told you to add strawberry cheesecake to the menu.  You prepare a graham-cracker crust, and a standard cheesecake base to which you add strawberry syrup as a flavoring.  You finish the dish and invite your boss to try it.  He says “I did not ask for strawberry flavored cheesecake, I asked for a strawberry cheesecake.  Cheesecake with strawberry.”

So you try again, this time making a plain cheesecake base and adding chopped up strawberries, stirring them in.  The boss stops by to sample the product and says “no, no, not strawberries in the cake, I meant on the cake.”

So you try another version where the plain cheesecake is topped by sliced strawberries.  Again the boss is unhappy with the result.  “Not strawberries, strawberry.  As in a strawberry topping.”

What he wanted was a cheesecake topped with strawberry preserves, which he has always thought of as “strawberry cheesecake.”  All this waste and delay could have been avoided if the requirements had been communicated with more detail and accuracy.
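
Had the chef and the owner collaborated on even one acceptance test before any baking began, the ambiguity would have surfaced immediately.  A sketch of what that conversation might have produced:

Given: A plain cheesecake on a graham-cracker crust
When: The cheesecake is prepared for serving
Then: It is topped with strawberry preserves

The act of writing that "Then" line forces the question "topped with what, exactly?" to be asked at the whiteboard rather than answered, repeatedly and wastefully, in the kitchen.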

Do it efficiently


For most organizations the primary costs of developing software are the time spent by developers and testers doing their work, and the effect of any delays caused by errors in the development process.  Anything that wastes time or delays value must be rooted out and corrected.

TDD has a major role to play here. 
  • When tests are written as the specification that guides development, they keep the team focused on what is actually needed. 
  • The tests themselves require precision in our understanding of a requirement and thus lead to code that satisfies the exact need and nothing more.  Traditionally developers have worked in an environment of considerable uncertainty, and thus have spent time writing code that ends up being unnecessary, which wastes their time. 
  • Without TDD, defects in the code will largely be dealt with after development is over, requiring much re-investigation of the system after the fact.  TDD drives the issue to one of bug prevention (much more time-efficient) rather than bug detection.

 

Do it safely


Software must be able to change if it is to remain valuable, because its value comes from its ability to meet a need of an organization or individual.  Since these needs change, software must change. 

Changing software means doing new work, and this is usually done in the context of existing work that was already completed.  One of the concerns that arises when this is done is: will the new work damage the existing system?  When adding a new feature, for example, we need to guard against introducing bugs in the code that existed before we started our work.

TDD has a significant role here, because all of our work proceeds from tests and thus we have test coverage protecting our code from accidental changes.  Furthermore, this test coverage is known to be meaningful because of how it was achieved.

Test coverage that is added after a system is created is guaranteed only to execute the production code; it guarantees nothing about the behavior that results from that execution.  In TDD the coverage is created by writing tests that drive the creation of the behavior, so if they continue to pass we can be assured that the behavior remains the same.

Do it predictably


A big part of success in business is planning effectively, and this includes the notion of predictability.  Every development initiative is either about creating something new, or changing something that already exists (and, in fact, you could say that creating something new is just a form of change: from nothing to something).

One question we seek to answer when planning and prioritizing work is: how long will it take and how many resources will be required?  Although we know we can never perfectly predict these things, we want to reduce the degree of error in our predictions.

TDD has a role to play here:
  • TDD increases design and code quality.  There are many reasons for this, but the shorthand explanation is that bad designs and poor code are very hard to test.  If we start from the testing perspective, we tend to create more quality.  Higher quality creates clarity, and the more clarity you have the better your predictions will be.
  • TDD points out gaps in analysis earlier than traditional methodologies.  These gaps, when discovered late, create unexpected, unplanned-for work, which derails our predictions.
  • TDD provides meaningful code coverage.  This reduces the creation of unexpected errors, and fewer unexpected anything increases predictability.
  • TDD helps us to retain knowledge, and the more you understand a thing the more accurate your predictions will be about changing it.

Do it Sustainably


The team must work in a way that can be sustained over the long haul.  Part of this is avoiding overwork and rework, and making sure the pace of work is humane.  Part of this is allowing time for the team to get appropriate training, and thus to "sharpen the saw" between major development efforts.  Issues like these are the responsibility of management whether the team is practicing TDD or not.

However, this work is called "Sustainable Test-Driven Development" for a reason.  TDD itself can create sustainability problems if maintaining the test suite presents an increasingly significant burden for the team.  Much of our focus overall has been, and will continue to be, avoiding this problem.

In other words, TDD will not create sustainability unless you learn how to do it right.
Next up: How TDD impacts the 8 skills of software development.

Friday, December 11, 2015

Specifying The Negative in TDD

One of the issues that frequently comes up is "how do I write a test about a behavior that the system is specified not to have?"  It's an interesting question given the nature of unit tests.  Let's examine it.

The Decision Tree of Negatives


When it comes to behaviors that the system should not have, there are different ways that this can be specified and ensured for the future:

Inherently Impossible


Some things are inherently impossible, depending on the technology being used.  For example you cannot write to read-only memory.  This is in the nature of the memory and thus does not require a specification (nor a test, since that would be a test that could never fail).  In languages like C# and Java, there exists the concept of “private”, and we know that an attempt to read or write a private value from outside a class will not compile and so will never exist in the executable system. 
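
A minimal C# sketch (the Account class and its balance field are invented purely for illustration):

public class Account
{
    private double balance;  // accessible only within Account

    public Account(double openingBalance)
    {
        balance = openingBalance;
    }
}

// Elsewhere in the system:
// var account = new Account(100.0);
// account.balance = 0.0;  // will not compile: 'balance' is inaccessible

No test is needed here; the compiler enforces the constraint.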

Some things are inherently impossible and cannot be made possible even accidentally.  Read-only memory cannot be made writable.  However other things which are impossible by nature can be made possible if desired.  A good example of this is an immutable object.

Let's say there exists in our system a SaleAmount class that represents an amount of money for a given retail sale in an online environment.  Such a class might exist in order to restrict, validate, or perfect the data it holds.  In this case, however, there is a customer requirement that the value held must be immutable, for reasons of security and consistency in their transactions. 

This brings up the question "how do I specify in a test that you cannot change the value?"
How can we test-drive such an entity when part of what we wish to specify is that the value, once established in an instance of this class, cannot be changed from the outside?  A typical way this question is stated is: "How can I show, in a test, that there is no SetValue() method?  Any test that references such a method simply will not compile, because the method does not exist.  Therefore, I cannot write the test."

Developers will sometimes suggest two different ideas:
  1. Add the SetValue() method, but make it throw an exception if anyone ever calls it.  Write a test that calls this method and fails if the exception is not thrown.[1]  Sometimes other actions are suggested if the method gets called, but an exception is quite common.
  2. Use reflection in the test to examine the object and, if SetValue() is found, fail the test.

The problem with option #1 is that this is not what the requirement says; it is not what was wanted.  The specification should be "you cannot change the value," not "if you change the value, thing x will happen."  So here, the developer is creating his own specification and ignoring the actual requirements.

The problem with option #2 is twofold.  First, reflection is typically sluggish, and in TDD we want our tests to be extremely fast so that we can run them frequently without slowing down our process.  But even if we overcame that somehow, what would we have the test look for?  SetValue()?  PutValue()?  ChangeValue()?  AlterValue()?  The possibilities are vast, and the cost of fully verifying immutability, in this case, would be enormous compared to the value.
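
For illustration only, the reflection approach might look like this sketch; the comments show why it is a losing game:

[TestMethod]
public void TestSaleAmountHasNoSetValueMethod()  // not recommended
{
    // This rules out only the one name we happened to think of;
    // PutValue(), ChangeValue(), AlterValue(), and countless others
    // would all slip through.  And reflection is slow besides.
    var setValueMethod = typeof(SaleAmount).GetMethod("SetValue");

    Assert.IsNull(setValueMethod);
}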

The key to solving this is in reminding ourselves once again that TDD is not initially about testing but about creating a specification.  Developers have always worked from some form of specification; it's just that the form was usually some kind of document.

So think about the traditional specification, the one you're likely more familiar with.  Ask yourself this: Does a specification indicate everything the system does not do?  Obviously not, for this would create a document of infinite length.  Every system does a finite set of things, and then there is an infinite set of things it does not do.

For example, here is an acceptance test for the positive requirement [2]:

Given: A SaleAmount S with value V
When: You ask for the value of S
Then: V is retrieved

This could be made into an executable specification by the following simple test:

[TestClass]
public class SaleAmountTest
{
    [TestMethod]
    public void TestSaleAmountPersistence()
    {
        var initialValue = 10.50d;
        var testAmount = new SaleAmount(initialValue);

        var retrievedValue = testAmount.GetValue();

        Assert.AreEqual(initialValue, retrievedValue);
    }
}


Which would drive the entity and its behavior into existence:

public class SaleAmount
{
    private double myValue;
    public SaleAmount(double aValue)
    {
        myValue = aValue;
    }

    public double GetValue()
    {
        return myValue;
    }
}


Ask yourself the following question:  If we were using the TDD process to create this SaleAmount object, and if the object had a method allowing the value to be changed (SetValue() or whatever), how would it have gotten there?  Where is the test that drove that mechanism into existence?  It's not there because there is a specific requirement that it not be there.  In TDD we never add code to the system without having a failing test first, and we only add the code that is needed to make the test pass, and nothing more. 

Put another way, if a developer on our team added a method that allowed such a change, and did not have a failing test written first, then he would be ignoring the rules of TDD and would be creating a bug as a result.  TDD does not work if you don't do it.  We don't know of any process that does. 

And if we think back to the concept of a specification there is an implicit rule here, which basically has two parts.

1.    Everything the system does, every behavior, must be specified.
2.    Given this, anything that is not specified is by default specified as not a behavior of the system. 

If it is a behavior nonetheless it is a defect.

 

Inherently possible


We don’t have a test that shows the value being changed, so it cannot be changed.  But this does not mean we have a “test for immutability.”  Anything that comes from the customer must be retained; we never want to lose that knowledge.  So if we think of this requirement in terms of acceptance testing we could express it using the ATDD nomenclature:

Given: A SaleAmount S with value V exists in the system
Then: You cannot change V

There is no “When” in this case because this is a requirement that is always true; it is not based on system state.  But this, of course, implies a strongly-typed, compiled language with access-control idioms (like making things "private" and so forth).  What if your technology does not provide this?  What if it is an interpreted language, or one with no enforcement mechanism to prevent access to internal variables?

The first answer is: You have to ask the customer.  You have to tell them that you cannot do precisely what they are asking for, and consider other alternatives in that investigation.   It may well be that we are using the wrong technology.

The second answer is that there will be some occasions where the only way you can ensure that an illegal or unwanted behavior is not added to a system accidentally is through static analysis (a traditional code review, or perhaps a code analysis tool).  This is still “a test” but one that either cannot or should not be automated in all cases.

On the other hand, sometimes we can make an inherently possible thing impossible by adding behaviors.  Such behaviors must, of course, be test driven.

Let's add a requirement to our SaleAmount class.  If the context of this object was, say, an online book store, the customer might have a maximum amount of money that he allows to be entered into a transaction.

We used a double-precision number [3] to hold the value in SaleAmount.  A double can inherently hold an incredibly large value.  In .NET, for example, it can hold a value as high as 1.7976931348623157E+308 [4].  It does not seem credible that any purchase made at our customer's site could total anything like that!  So the requirement is: any SaleAmount object that is instantiated with a value greater than the customer's maximum credible value should raise a visible alarm, because this probably means the system is being hacked or has a very serious calculation bug.

As developers, we know a good way to raise an alarm is to throw an exception.  We can do that, but we also capture the customer's view of what the maximum credible value is, so we specify it.  Let's say he says "nothing over $1,000.00 makes any sense".  But... how much "over"?  A dollar?  A cent?  We have to ask, of course.  Let's say the customer says "one cent".

In TDD everything must be specified, all customer rules, behaviors, values, everything.  So we start with this:

Given: The system
Then: The Maximum value for a Sale Amount is $1000.00

We also have to capture the tolerance in its own specification:

Given: The System
Then: Tolerance for comparing SaleAmount to its Maximum is one cent

These tests establish bits of domain-specific language that can then be used in any number of other specifications (we won’t have to repeatedly define them whenever we make comparisons).

[TestMethod]
public void SpecifyMaximumDollarValue()
{
    Assert.AreEqual(1000d, SaleAmount.MAXIMUM);
}

[TestMethod]
public void SpecifyComparisonTolerance()
{
    Assert.AreEqual(.01, SaleAmount.TOLERANCE);
}


In order to get these to pass we drive the Maximum and the Tolerance into the system.
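
A minimal sketch of what gets driven in (whether these live as constants, static properties, or configured values is a design decision; constants are shown here for simplicity):

public class SaleAmount
{
    public const double MAXIMUM = 1000d;
    public const double TOLERANCE = .01;

    // ...constructor and GetValue() as before...
}
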
Now we can write this test, which will also fail initially of course:

Given: Value S greater than or equal to Maximum + Tolerance
When: An attempt is made to create a SaleAmount with value S
Then: A warning is issued

[TestMethod]
public void TestSaleAmountThrowsSaleAmountValueTooLargeException()
{
    var saleAmountMaximum = SaleAmount.MAXIMUM;
    var tolerance = SaleAmount.TOLERANCE;
    var excessiveAmount = saleAmountMaximum + tolerance;

    try
    {
        new SaleAmount(excessiveAmount);
        Assert.Fail("SaleAmount created with excessive " +
                    "value should have thrown an exception");
    }
    catch (SaleAmountValueTooLargeException)
    { }
}


But now the question is, what code do we write to make this test pass?  The temptation would be to add something like this to the constructor of SaleAmount:

if (aValue >= MAXIMUM + TOLERANCE)
          throw new SaleAmountValueTooLargeException();

But this is a bit of a mistake.  Remember, it's not just "add no code without a failing test", it is "add only the needed code to make the failing test pass."

Your spec is supposed to be your pal.  He's supposed to be there at your elbow saying "don't worry.  I won't let you make a mistake.  I won't let you write the wrong code, I promise."  He's not just your pal, he's your best pal. 

Here, however, the spec is just a mediocre friend because he will let you write the wrong code and say nothing about it.  He’ll let you get in your car when you are in no condition to drive.  He'll let you do this, and let it pass:

throw new SaleAmountValueTooLargeException();

There is no conditional.  We’re just throwing the exception all the time.  That's wrong, obviously.  This behavior has a boundary (as we discussed in our blog about test categories) and every boundary has two sides.  We need a little more specification.  We need something like this:

try
{
    new SaleAmount(SaleAmount.MAXIMUM);
}
catch (SaleAmountValueTooLargeException)
{
    Assert.Fail("SaleAmount created with value at the maximum"+
                "should not have thrown an exception");
}


Now the "anAmount => MAXIMUM + TOLERANCE" part must be added to the production code or your best buddy will let you know you're blowing it.  Friends don’t let friends implement incorrectly. 
...
[1] There are a variety of ways to do this.  We’ll show one way here a bit further on.
[2] [TODO] Link to ATDD blog
[3] If you’re thinking “you used the wrong type, a long would be better” it’s a fair point.  We simply wanted to make the conceptual point that primitives do not impose domain constraints inherently, and the use of the double just makes the idea really clear.
[4] For those who dislike exponential notation, this is:
$179,769,313,486,231,520,616,720,392,992,464,536,472,240,560,432,240,240,944,616,576,160,448,992,408,768,712,032,320,616,672,472,536,248,456,776,672,352,088,672,544,960,568,304,616,280,032,664,704,344,880,448,832,696,664,856,832,848,208,048,648,264,984,808,584,712,312,912,080,856,536,512,272,
952,424,048,992,064,568,952,496,632,264,936,656,128,816,232,688,512,496,536,552,712,648,144,200,160,624,560,424,848,368
...and no cents. :)


Wednesday, November 4, 2015

Structure of Tests-As-Specifications

A big part of our thesis is that TDD is not really a testing activity, but rather a specifying activity that generates tests as a very useful side effect.  For TDD to be a sustainable process, it is important to understand the various implications of this distinction. [1]

Here, we will discuss the way our tests are structured when we seek to use them as the functional specification of the system.

A question we hear frequently is "how does TDD relate to BDD?"  BDD is "Behavior-Driven Development," a term coined by Dan North and Chris Matts in their 2006 article "Introducing BDD" [2].  Many have made various distinctions between TDD, ATDD, and BDD, but we feel these distinctions are largely unimportant.  To us, TDD is BDD, except that we conduct the activity at a level very close to the code, and automation is much more critical.  Also, we contend that “development” includes analysis and design, and thus what TDD enables is more accurately stated to be “behavior-based analysis and design”, or BBAD.

In BBAD, the general idea is that the "unit" of software that is being specified is a behavior.  Software is behavior, after all.  Software is not a noun, it is a verb.  Software’s value lies entirely in what it does, what value the user accrues as a result of its behavior.  In essence, software only exists in any meaningful sense of the word when it is up and running.  The job of a software development team is to take a general-purpose computer and cause it to act in specific, valuable ways.  We call these behaviors.

The nomenclature that North and Matts proposed for specifying each behavior of a system is this: Given-When-Then.  Here's a simple example:

Given:
     User U has a valid account on our system with Username UN and password PW
     The login username is set to UN and the login password is set to PW
When:
    Login is requested
Then:
    U is logged in

Everything that software does, every behavior can be expressed in this fashion.  Each Given-When-Then expression is a specific scenario that is deemed to have business value, and that the team has taken upon itself to implement.

In TDD, when the scenario is interpreted as a test, we strive to make it actionable.  So we think of these three parts of the scenario a little differently: we "verbify" them, converting conditions into activities.

Imagine that you were a manual tester seeking to make sure the system was behaving correctly in terms of the scenario above.  You would not wait around until a user with a valid account happened to browse to the login page, enter his info, and click the "Login" button... you would create a valid user (or identify an existing one) and, as that person, browse to the page, enter the correct username and password, and then click the button yourself.  Then you'd check to see whether your login was successful.  You would do all of these things.

So the Given wasn't simply given; it was done by the tester (you, in this case).  The When was not a moment waited for; it was do it now.  And the Then was not a condition but rather an action: go and see if things are correct.

"Given" becomes "Setup".
"When" becomes "Trigger".
"Then" become "Verify".

We want to structure our tests in such a way that these three elements of the specification are clear and, as much as possible, separate from each other.  Typical programming languages can make this a bit challenging at times, but we can overcome these problems fairly easily.

For example: Let's say we have a behavior that calculates the arithmetic mean of two real numbers, accurate within 0.1.  Most likely this will be a method call on some object that takes two values as parameters and returns the arithmetic mean of those values, accurate within 0.1.

Let’s start with the Given-When-Then:

Given:
     Two real values R1 and R2
     Required accuracy A is 0.1
When:
     The arithmetic mean of R1 and R2 is requested
Then:
     The return is (R1+R2)/2, accurate to A

Let's look at a typical unit test for such a behavior:

(Code samples are in C# with MSTest as the testing framework)

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        Assert.AreEqual(5.5d,
                        MathUtils.GetInstance().ArithmeticMean(7.0d, 4.0d),
                        .1);
    }
}



This test is simple because the behavior is simple.  But this is really not great as a specification.

The Setup (creation of the MathUtils object, the creation of the example doubles 7.0d and 4.0d), the Trigger (the calling of the ArithmeticMean method with our two example doubles), and the Verify (comparing the method's return to the expectation, 5.5d, and establishing the precision as .1) are all expressed together in the assertion.  If we can separate them, we can make the specification easier to read and also make it clear that some of these particular values are not special, that they were just picked as convenient examples.

This is fairly straightforward, but easy to miss:

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {         
        // Setup
        var mathUtils = MathUtils.GetInstance();
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue)/2;

        // Trigger
        var actualMean = mathUtils.ArithmeticMean(anyFirstValue,
                                                  anySecondValue);

        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }
}


Here we have included comments to make it clear that the three different aspects of this behavioral specification are now separate and distinct from each other.   The "need" for comments always seems like a smell, doesn't it?  It means we can still make this better.

But we've also used variable names like "anyFirstValue" to indicate that the number we chose was not a significant value, creating more clarity about what is important here.  Note that tolerance and expectedMean were not named in this way, because their values are specific to the required behavior.

This, now, is using TDD to form a readable specification, which also happens to be executable as a test [2].  Obviously the value of this as a test is very high; we do not intend to trivialize this.  But we write them with a different mindset when we think of them as specifications and, as we'll see, this leads to many good things.

Looking at both code examples above however, some of you may be thinking "what is this GetInstance() stuff?  I would do this: "

        // Setup
        var mathUtils = new MathUtils();

Perhaps.  We have reasons for preferring our version, which we'll set aside for its own discussion.

But the interesting question is: what if you started creating the object one way (using “new”), and then later changed your mind and used a static GetInstance() method, or maybe even some factory pattern?  If, when that change was made, you had many test methods on this class doing it the "old" way, this would require the same change in all of them.

We can do it this way instead:

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var arithmeticMeanCalculator =
                           GetArithmeticMeanCalculator();
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger
        var actualMean = arithmeticMeanCalculator.
                         ArithmeticMean(anyFirstValue,
                                        anySecondValue);
        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }

    private MathUtils GetArithmeticMeanCalculator()
    {
        return MathUtils.GetInstance();
    }
}



Now, no matter how many test methods on this test class needed to access this arithmetic mean behavior (for different scenarios), a change in terms of how you access the behavior would only involve the modification of the single "helper" method that is providing the object for all of them.

Many testing frameworks have their own mechanisms for eliminating redundant object creation, usually in the form of a Setup() or Initialize() method, etc., and these can be used.  But we prefer the helper method because we then gain the ability to decouple the specification from the fact that the behavior we’re specifying happens to be implemented in a class called MathUtils.  We could also change this design detail and the impact would only be on the helper method.  (The fact that C# has a var type is a real plus here… you might be limited a bit in other languages.)
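
For reference, a sketch of that framework mechanism using MSTest's [TestInitialize] hook:

[TestClass]
public class MathTests
{
    private MathUtils mathUtils;

    [TestInitialize]  // MSTest runs this before every test method
    public void Setup()
    {
        mathUtils = MathUtils.GetInstance();
    }

    // ...test methods then use the mathUtils field...
}

Note that this still couples every test in the class to the fact that the behavior lives in MathUtils; the helper method hides exactly that detail.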

But the spec is also not about the particular method you call to get the mean, just how the calculation works, behaviorally.  Certainly an ArithmeticMean() method is logical, but what if we decided to make it more flexible, allowing any number of parameters rather than just two?  The meaning of "arithmetic mean" would not change, but our spec would have to.  Which seems wrong.  So, we could take the idea a little bit farther:

[TestClass]
public class MathTests
{
    [TestMethod]
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var arithmeticMeanCalculator = GetArithmeticMeanCalculator();
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;
        var expectedMean = (anyFirstValue + anySecondValue) / 2;

        // Trigger
        var actualMean = TriggerArithmeticMeanCalculator(
                         arithmeticMeanCalculator, 
                         anyFirstValue, anySecondValue);
        // Verify
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }

    private double TriggerArithmeticMeanCalculator(MathUtils mathUtils, 
                                                  double anyFirstValue, 
                                                  double anySecondValue)
    {
        return mathUtils.ArithmeticMean(anyFirstValue,
            anySecondValue);
    }

    private MathUtils GetArithmeticMeanCalculator()
    {
        return MathUtils.GetInstance();
    }
}


Now if we change the ArithmeticMean() method to take a container rather than discrete parameters, or whatever, then we only change this private helper method and not all the various specification-tests that show the behavior with more parameters, etc...

The idea here is to separate the meaning of the specification from the way the production code is designed.  We talk about the specification being one thing, and the "binding" being another.  The specification should change only if the behavior changes.  The binding (these private helpers) should only change if the design of the system changes.

Another benefit here is clarity, and readability.  Let's improve it a bit more:

[TestClass]
public class MathTests
{
    [TestMethod] 
    public void TestArithmeticMeanOfTwoValues()
    {
        // Setup
        var anyFirstValue = 7.0;
        var anySecondValue = 4.0;
        var tolerance = .1;

        // Trigger
        var actualMean = TriggerArithmeticMeanCalculation(
                                             anyFirstValue,
                                             anySecondValue);

        // Verify
        var expectedMean = (anyFirstValue + anySecondValue) / 2;
        Assert.AreEqual(expectedMean, actualMean, tolerance);
    }

    private double TriggerArithmeticMeanCalculation(
                                double anyFirstValue, 
                                double anySecondValue)
    {
        var arithmeticMeanCalculator = GetArithmeticMeanCalculator();
        return arithmeticMeanCalculator.
                                ArithmeticMean(anyFirstValue, 
                                anySecondValue);
    }

    private MathUtils GetArithmeticMeanCalculator()
    {
        return MathUtils.GetInstance();
    }
}

We have moved the call to GetArithmeticMeanCalculator() into the Trigger, and expectedMean into the Verify [3].  Also, we changed the notion of "trigger the calculator" to "trigger the calculation".  Now, remember the original specification?

Given:
     Two real values R1 and R2
     Required accuracy A is 0.1
When:
     The Arithmetic Mean of R1 and R2 is requested
Then:
     The return is (R1+R2)/2, accurate to A

The unit test, which is our specification, very closely mirrors this Given-When-Then expression of the behavior. Do we really need the comments to make that clear?  Probably not.  We’ve created a unit test that is a true specification of the behavior without coupling it to the specifics of how the behavior is expressed by the system.

Can we take this even further?  Of course... but that's for another entry. :)

[1] It should be acknowledged that Max prefers to say "it is a test which also serves as a specification."  We'll probably beat him into submission :), but for the time being that's how he likes to think of it.  We welcome discussion, as always.

[2] Better Software Magazine, March 2006.

[3] It should also be acknowledged that we're currently discussing the relative merits of using Setup/Trigger/Verify in TDD rather than just sticking with Given/When/Then throughout. See Grzegorz Gałęzowski's very interesting comment below on this (and other things). 

Wednesday, September 23, 2015

TDD and Its (at least) 5 Benefits

Many developers have concerns about adopting test-driven development, specifically regarding:
  • It's more work.  I'm already over-burdened and now you're giving me a new job to do.
  • I'm not a tester.  We have testers for testing, and they have more expertise than I do.  It will take me a long time to learn how to write tests as well as they do.
  • If I write the code, and then test it, the test-pass will only tell me what I already know: the code works.
  • If I write the test before the code the failing of the test will only tell me what I already know: I have not written the code yet.
Here we are going to deal primarily with the first one: it's going to add work.

This is an understandable concern, at least initially, and it is not only the developers who express it.  Project managers will fear that the team's productivity, for which they are accountable, will decrease.  Project sponsors will fear that the cost of the project will go up if the developers end up spending a fair amount of their time writing tests.  After all, the primary cost of creating software is developer time.

The fact is, TDD is not about adding new burdens to the developers, but rather it is just the opposite: TDD is about gaining multiple benefits from a single activity.

In the test-first activity developers are not really writing tests.  They look like tests, but they are not (yet).  They are an executable specification (this is a critical part of our "redefinition of TDD" entry).  As such, they do what specifications do: they guide the creation of the code.  Traditional specifications, however, are usually expressed in some colloquial form, perhaps a document and/or some diagrams.  Communication in this form can be very lossy and easy to misinterpret.  Missing information can go unnoticed.

For example, one team decided to create a poker game as part of their training on TDD.  An enjoyable project is often good when learning, as we tend to retain information better when we're having a good time.  Also, these developers happened to live and work in Las Vegas. :) Anyway, it was a contrived project and so the team came up with the requirements themselves: basically, the rules of poker and the mechanics of the game.  One requirement they came up with was "the system should be able to shuffle the deck of cards into a reordered state."  That seemed like a reasonable thing to require until they tried to write a test for it.  How does one define "reordered"?  One developer said "oh, let's say at least 90% of the cards need to be in a new position after the shuffle completes."  Another developer smiled and said "OK, just take the top card and put it on the bottom.  100% will be in a new position.  Will that be acceptable?"  They all agreed it would not.  This seemingly simple issue ended up being more complicated than anyone had anticipated.
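
To see why, here is a hedged sketch of the naive "90%" specification (the Deck class, its Cards list, and Shuffle() are invented here for illustration; the team's actual code is not shown):

[TestMethod]
public void TestShuffleReordersDeck()  // the flawed "90%" version
{
    var deck = new Deck();                      // hypothetical class
    var before = new List<string>(deck.Cards);  // remember the order

    deck.Shuffle();

    // Count how many cards ended up in a new position.
    var movedCount = 0;
    for (var i = 0; i < before.Count; i++)
        if (before[i] != deck.Cards[i]) movedCount++;

    // Moving just the top card to the bottom puts 100% of the cards
    // in new positions, so this assertion passes... and proves nothing.
    Assert.IsTrue(movedCount >= 0.9 * before.Count);
}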

In TDD we express the specification in actual test code, which is very unforgiving.  One of the early examples of this for us was the creation of a Fahrenheit-to-Celsius temperature conversion routine.  The idea seemed simple: take a measurement in Fahrenheit (say 212 degrees, the boiling point of water at sea level), and convert it to Celsius (100 degrees).  That statement seems very clear until you attempt to write a unit test for it, and realize you do not know how accurate the measurements should be.  Do we include fractional degrees?  To how many decimal places?  And of course the real question is: what is this thing going to be used for?  This form of specification will not let you get away with not knowing, because code is exacting.

Put another way, a test would ask "how accurate is this conversion routine?"  A specification asks "how accurate does this conversion routine need to be" which is of course a good question to ask before you attempt to create it.
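
A sketch of the unit test that forces those questions (TemperatureConverter and the 0.1 tolerance are assumptions made for illustration; the real answers must come from asking what the routine is for):

[TestMethod]
public void TestFahrenheitToCelsiusAtBoilingPoint()
{
    // Writing this assertion forces us to commit to an accuracy.
    var tolerance = 0.1;  // an assumed answer, not a given one

    var celsius = TemperatureConverter.ToCelsius(212.0);  // hypothetical API

    Assert.AreEqual(100.0, celsius, tolerance);
}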

The first benefit of TDD is just this: it provides a very detailed, reliable form of something we need to create anyway, a functional specification.

Once the code-writing begins, this test-as-specification serves another purpose.  Once we know what needs to be written, we can begin to write it with a clear indication of when we will have gotten it done.  The test stands as a rubric against which we measure our work.  Once it passes, the behavior is correct.  Developers quickly develop a strong sense of confidence in their work once they experience this phenomenon, and of course confidence reduces hesitancy and tends to speed us up.

The second benefit of TDD is that it provides clear, rapid feedback to the developers as they are creating the product code.

At some point, we finish our work.  Once this happens the suite of tests that we say are not really tests (but specifications) essentially "graduate" into their new life: as tests, in the traditional sense.  This happens with no additional effort from the developers.  Tests in the traditional sense are very good to have around and provide three more benefits in this new mode...

First, they guard against code regression when refactoring.  Sometimes code needs to be cleaned up either because it has quality issues (what we call "olfactoring"[1]), or because we are preparing for a new addition to the system and we want to re-structure the existing code to allow for a smooth introduction of the enhancement.  In either case, if we have a set of tests we can run repeatedly during the refactoring process, then we can be assured that we have not accidentally introduced a defect.  Here again, the confidence this yields will tend to increase productivity.

The third benefit is being able to refactor existing code in a confident and reassured fashion.

But also, they provide this same confirmation when we actually start writing new features to add to an existing system.  We return to test-as-specification when writing the new features, with the benefits we've already discussed, but the older tests (as they continue to pass) also tell us that the new work we are doing is not disturbing the existing system.  Here again, this allows us to be more aggressive in how we integrate the newly-wanted behavior.

The fourth benefit is being able to add new behavior in this same way.

But wait, there's more!  Another critical issue facing a development team is preventing the loss of knowledge.  Legacy code often has this problem: the people who designed and wrote the systems are long gone, and nobody really understands the code very well.  A test suite, if written with this intention in mind, can capture knowledge, because we can consider it, at any time, to be "the spec" and read it as such. 

There are actually three kinds of knowledge we need to retain.
  1. What is the valuable business behavior that is implemented by the system?
  2. What is the design of the system?  Where are things implemented?
  3. How is the system to be used?  What examples can we look at? 
All of this knowledge is captured by the test suite, or perhaps more accurately, the specification suite.  It has the advantage over traditional documentation of being able to be run against the system to ensure it is still correct.

So the fifth benefit is being able to retain knowledge in a trustworthy form.

Up to this point we've connected TDD to several critical aspects of software development:
  1. Knowing what to build (test-first, with the test failing)
  2. Knowing that we built it (turning the test green)
  3. Knowing that we did not break it when refactoring it (keeping the test green)
  4. Knowing that we did not break it when enhancing/tuning/extending/scaling it (keeping the test green)
  5. Knowing, even much later, what we built (reading the tests after the fact)

All of this comes from one effort, one action.

And here's a final, sort of fun one:  Have you ever been reviewing code that was unfamiliar to you... perhaps written by someone else, or even by you a long time ago... and come across a line of code that you cannot figure out?  "Why is this here?   What is it for?  What does it do?  Is it needed?"  One can spend hours poring over the system, or trying to hunt down the original author, who may herself not remember.  It can be very annoying and time-consuming.

If the system was created using TDD, this problem is instantly solved.  Don't know what a line of code does?  Break it, and run your tests.  A test should fail.  Go read that test.  Now you know.

Just don't forget to Ctrl-Z. :)

But what if no test fails?  Or more than one test fails?  Well, that's why you're reading this blog.  For TDD to provide all these benefits, you need to do it properly...

[1] We'll add a link here when we've written this one