
Tuesday, January 20, 2015

TDD and Defects

We've said all along that TDD is not really about "testing" but rather about creating an executable form of specification that drives development forward.  This is true, and important, but it does not mean that TDD does not have a relationship to testing.  One interesting issue where there is significant synergy is in our relationship to defects.

Two important issues we'll focus on are: when/how a defect becomes known to us, and the actions we take at that point.

Time and Development


In the cyclic nature of agile development, we repeatedly encounter various points in time when we may discover that something is not right.  First, as we are writing the source code itself, most modern tools can let us know that something is not the way we intended it to be.  For example, when you end a method with a closing curly brace, a good IDE will underline or otherwise highlight any temporary method variables that you created but never used.  Obviously, if you created a variable you intended to use it, so you must have done something other than what you meant to do.  Or, if you type an object reference name and then hit the dot, many IDEs will bring up a list of methods available for you to call on that type.  If the list does not appear, then something is not right.
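As a tiny, invented illustration (the class and variable names here are mine, not from any real project), this is the kind of slip a good IDE flags the moment the method is closed:

import java.util.List;

public class RosterScan {
    public int countPitchers(List<String> positions) {
        int count = 0;
        int runningTotal = 0;  // declared but never used -- a good IDE underlines this immediately
        for (String position : positions) {
            if ("Pitcher".equals(position)) {
                count++;
            }
        }
        return count;
    }
}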

When compiling the source into the executable, we encounter a number of points in time where the technology can check our work: the preprocessor (macros, #ifdefs, #defines), the compiler, the linker (resolving dependencies), and so forth.

And there are run-time checks too: the class loader, generic type constraints, assertions of preconditions and postconditions, etc.  Various languages and technologies provide different levels of these services, and they all can be "the moment" where we realize that we made an error that has resulted in a defect.
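As a rough Java sketch (the method name and messages are invented for illustration), precondition and postcondition checks of this kind might look like the following; notice that they catch the mistake only when the code actually runs:

import java.util.Objects;

public class RuntimeChecks {

    // Precondition checks: the mistake is caught, but only at run time.
    public static String jerseyLabel(String playerName, int jerseyNumber) {
        Objects.requireNonNull(playerName, "playerName must not be null");
        if (jerseyNumber < 0) {
            throw new IllegalArgumentException("jerseyNumber must not be negative: " + jerseyNumber);
        }

        String label = playerName + " #" + jerseyNumber;

        // Postcondition check, active only when assertions are enabled (java -ea).
        assert label.contains("#") : "label should always contain the jersey marker";
        return label;
    }

    public static void main(String[] args) {
        System.out.println(jerseyLabel("Some Pitcher", 21)); // fine
        System.out.println(jerseyLabel(null, 21));           // blows up here, at run time
    }
}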

Detection vs. Prevention


Defects are inevitable, and so we have to take action to either detect them or prevent them.  Let's say, for example, that you have a method that takes as its parameters the position of a given baseball player on a team, and his jersey number, and then adds the player to a roster somewhere.  If you use an integer to represent the position (1 = Pitcher, 2 = Catcher, and so forth) then you will have to decide what to do if another part of the system incorrectly calls this method with something below 1 or above 9.  That would be a defect that the IDE/compiler/linker/loader would not find, because an int is type-safe for all values from minint to maxint [1].  So if the method was called with a 32, you'd have to put something in the code to deal with it: 32 mod 9 to determine what position that effectively is (Third Base, if you're curious), correct the data (anything above 9 is reduced to 9, below 1 becomes 1), return null, throw an IllegalPositionException to raise the alarm... something.  Whatever the customer wants.  Then you'd write a failing test first to drive it into the code.
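Here is one possible sketch of that run-time-detection approach in Java, driven test-first; the names (Roster, addPlayer, IllegalPositionException) are assumptions made up for this example, not code from the post:

import org.junit.Test;
import static org.junit.Assert.assertTrue;

public class RosterTest {

    // This test is written first and fails until the guard clause below exists.
    @Test(expected = IllegalPositionException.class)
    public void rejectsPositionAboveNine() {
        new Roster().addPlayer(32, 17);   // 32 is not a legal position code
    }

    @Test
    public void acceptsLegalPosition() {
        Roster roster = new Roster();
        roster.addPlayer(1, 17);          // 1 = Pitcher
        assertTrue(roster.contains(17));
    }
}

class IllegalPositionException extends RuntimeException {
    IllegalPositionException(int position) {
        super("Illegal position code: " + position);
    }
}

class Roster {
    private final java.util.Set<Integer> jerseyNumbers = new java.util.HashSet<>();

    // The int parameter is type-safe for any value from minint to maxint,
    // so the 1..9 rule has to be enforced here, at run time.
    public void addPlayer(int position, int jerseyNumber) {
        if (position < 1 || position > 9) {
            throw new IllegalPositionException(position);
        }
        jerseyNumbers.add(jerseyNumber);
    }

    public boolean contains(int jerseyNumber) {
        return jerseyNumbers.contains(jerseyNumber);
    }
}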

If, however, you chose not to use an int, but rather to create your own type with its own constraints... for example, an enumeration called POSITION with members PITCHER, CATCHER, SHORTSTOP, etc... then a defect elsewhere that attempted to pass in POSITION.QUARTERBACK would not compile and therefore would never make it into production.  We can think of this as defect prevention even though it isn't really; it's just very early detection.  But that vastly decreases the cost of repair.
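A minimal sketch of that alternative might look like the following (again with invented names); the bad call cannot even be written:

import java.util.HashSet;
import java.util.Set;

public class TypedRoster {

    // Only the nine legal baseball positions can be expressed at all.
    public enum Position {
        PITCHER, CATCHER, FIRST_BASE, SECOND_BASE, THIRD_BASE,
        SHORTSTOP, LEFT_FIELD, CENTER_FIELD, RIGHT_FIELD
    }

    private final Set<Integer> jerseyNumbers = new HashSet<>();

    // No range check needed: a defective caller cannot construct an illegal Position.
    public void addPlayer(Position position, int jerseyNumber) {
        jerseyNumbers.add(jerseyNumber);
    }

    public static void main(String[] args) {
        TypedRoster roster = new TypedRoster();
        roster.addPlayer(Position.PITCHER, 17);        // compiles and runs
        // roster.addPlayer(Position.QUARTERBACK, 9);  // would not compile: no such member
    }
}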

Cost of Delays


The earlier you find the bug, the cheaper it is to fix.  First of all, the issue is fresher in your mind, and thus you don't have to recapitulate the thought process that got you there.  It's less likely that you'll have more than one bug to deal with at a time (late detection often means that other bugs have arisen during the delay, sometimes bugs which involve each other), which means you can focus.  Also, if you're in a very short cycle then the defect is something you just did, which makes it more obvious.

The worst time to find out that a defect exists, therefore, is the latest time: when the system is actually operating, either in the QA department's testing process or, especially, when in use by a customer.  When QA finds the bug, it's a delayed find.  When a customer finds the defect, it's further delayed, but it also means:
  1. The customer's business has suffered
  2. The product's reputation is tarnished
  3. Your organization's reputation is tarnished
  4. It is personally embarrassing to you
  5. And, as we said, the cost to fix will be much higher
In a perfect world this would never happen, of course, but the world is complex and we are prone to errors.

TDD and Time


In TDD we add another point in time when we can discover an error: test time.  Not QA's testing, but developer test time: tests we run ourselves, thereby creating our own non-delayed moment of run time.  Tests execute the system, so they have the same "experience" as QA or a customer, but since we run them very frequently they represent a faster and more granular defect indication.

You would prefer to prevent all defects from making it into runtime, of course.  But you cannot.  So a rule in TDD is this: any defect that cannot be prevented from getting into production must have a specification associated with it, and thus a test that will fail if the spec is not followed.

Since we write the tests as part of the code-writing process, and if we adhere perfectly to the TDD rule that says "code is never put into the source without a failing test that requires it"... and if we see the test fail until the code that makes it pass is added... then we should never have code that is not covered (and meaningfully so [2]) by tests.  But here we're going to make mistakes too.  Our good intentions will fall afoul of the forces they always do: fatigue, misunderstandings, things we forget, bad days and interruptions, the fat-fingered gods of chaos.

With TDD as your process, certainly far fewer defects will make it into the product, but it will still happen from time to time.  What that means, however, will be different.

TDD and Runtime Defects


Traditionally, a bug report from outside the team is placed into a tracking system and addressed in order of priority, severity, the order it was entered, or something along those lines.  But traditionally "addressed" means "fixed."  This is not so in TDD.

In TDD a bug reported from production is not really a bug... yet.  Because if all of our tests are passing and if our tests are the specification of the system, this means the code is performing as specified.  There is no bug.  But it is not doing what the customer wants so it is the specification that must be wrong: we have a missing test.

Therefore fixing the problem is not job #1; adding the missing test is.  In fact, we want the defect in place so that when we 1) figure out what the missing test was and 2) add it to the suite, we can 3) run it and see it fail.  Then and only then do we fix the bug and watch the new test go green, completely proving the connection between the test and the code, and also proving that the defect in question can never make it into production again.
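As a hedged illustration of that sequence, reusing the invented Roster sketch from above: suppose a customer reports that a negative jersey number is being accepted.  The first thing added is the missing specification, not the fix:

import org.junit.Test;

public class RosterDefectTest {

    // Steps 1 and 2: capture the missing specification as a test and add it to the suite.
    // Step 3: run it against the unchanged Roster and watch it fail.
    @Test(expected = IllegalArgumentException.class)
    public void rejectsNegativeJerseyNumbers() {
        new Roster().addPlayer(1, -1);  // the behavior reported from production
    }

    // Only after seeing this test go red do we add the guard clause to addPlayer()
    // and watch it go green -- and the test stays in the suite forever.
}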

That's significant.  The effort engaged in traditional bug fixing is transitory; you found it and fixed it for now, but if it gets back in there somehow you'll have to find it and fix it again.   In TDD the effort is focused more on adding the test, and thus it is persistent effort.  You keep it forever.

Special Cases


One question that may be occurring to you is "what about bad behavior that gets into the code that really is not part of the spec and should never be?"  For example, in the case of our baseball-player-accepting method above, what if a developer on the team adds some code that says "if the method gets called with POSITION.PITCHER and a jersey number of exactly 23, then add the player to the roster twice."  Let's further stipulate that no customer asked for this; it's simply wrong.

Could I write a test to guard against that?  Sure; the given-when-then is pretty clear:

Given: a pitcher with jersey number 23
       an empty roster

When: the pitcher is passed into method X once

Then: a pitcher with jersey number 23 will appear once in the roster

But I shouldn't.  First of all, the customer did not say anything about this scenario, and we don't create our own specifications.  Second, where would that end?  How many scenarios like that could you potentially dream up?  Combinations and permutations abound. [3]

The real issue for a TDD team in the above example is: how did that code get into the system in the first place?  There was no failing test that drove it.  In TDD, adding code to the system without a failing test is a malicious attack by the development team on their own code.  If that's what you're about, then nothing can really stop you.

So the answer to this conundrum is... don't do that.  TDD does not work, as a process, if you don't follow its rules in a disciplined way.  But then again, what process would?

-S-

[1] You might, in fact, have chosen to do this because the rules of baseball told you to:
http://en.wikipedia.org/wiki/Baseball_positions

[2] What is "non-meaningful coverage"?  I refer you to:
http://www.sustainabletdd.com/2011/12/lies-damned-lies-and-code-coverage.html

[3] I am not saying issues never arise with special cases, or that it's wrong to speculate; sometimes we discover possibilities the customer simply didn't think of.  But the right thing to do when this happens is go back to the customer and ask what the desired behavior of the system should be under circumstance X before doing anything at all.  And then write the failing test to specify it.
