Net Objectives


Tuesday, January 20, 2015

TDD and Defects

We've said all along that TDD is not really about "testing" but rather about creating an executable form of specification that drives development forward.  This is true, and important, but it does not mean that TDD does not have a relationship to testing.  One interesting issue where there is significant synergy is in our relationship to defects.

Two important issues we'll focus on are: when/how a defect becomes known to us, and the actions we take at that point.

Time and Development


In the cyclic nature of agile development, we repeatedly encounter points in time when we may discover that something is not right.  First, as we are writing the source code itself, most modern tools can let us know that something is not the way we intended it to be.  For example, when you end a method with a closing curly brace, a good IDE will underline or otherwise highlight any temporary method variables that you created but never used.  Obviously, if you created a variable you intended to use it, so you must have done something other than what you meant to.  Or, if you type an object reference name and then hit the dot, many IDEs will bring up a list of methods available for you to call on that type.  If the list does not appear, then something is not right.

When compiling the source into the executable we encounter a number of points in time where the technology can check our work: the pre-compiler (macros, if-defs, #defines), the compiler, the linker (resolving dependencies), and so forth.

And there are run-time checks too: the class loader, generic type constraints, assertions of preconditions and postconditions, etc.  Various languages and technologies provide different levels of these services, and any of them can be "the moment" when we realize that we made an error that has resulted in a defect.

Detection vs. Prevention


Defects are inevitable, and so we have to take action to either detect them or prevent them.  Let's say, for example, that you have a method that takes as its parameters the position of a given baseball player on a team and his jersey number, and then adds the player to a roster somewhere.  If you use an integer to represent the position (1 = Pitcher, 2 = Catcher, and so forth) then you will have to decide what to do if another part of the system incorrectly calls this method with something below 1 or above 9.  That would be a defect that the IDE/compiler/linker/loader would not find, because an int is type-safe for all values from minint to maxint [1].  So if the method were called with a 32, you'd have to put something in the code to deal with it: 32 mod 9 to determine what position that effectively is (Third Base, if you're curious), correct the data (anything above 9 is reduced to 9, below 1 becomes 1), return a null, throw an IllegalPositionException to raise the alarm... something.  Whatever the customer wants.  Then you'd write a failing test first to drive it into the code.
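Such a guard might look like this minimal sketch (Java here; the names Roster and IllegalPositionException follow the article's example but are hypothetical, and we assume the customer chose the raise-the-alarm option):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical exception type, as the article suggests: raise the alarm
// when a caller passes an impossible position.
class IllegalPositionException extends RuntimeException {
    IllegalPositionException(int position) {
        super("Illegal position: " + position);
    }
}

class Roster {
    private final List<String> entries = new ArrayList<>();

    // Positions are 1 (Pitcher) through 9 (Right Fielder).
    void addPlayer(int position, int jerseyNumber) {
        if (position < 1 || position > 9) {
            throw new IllegalPositionException(position);  // the customer's choice
        }
        entries.add(position + ":" + jerseyNumber);
    }

    int size() { return entries.size(); }
}
```

The failing test comes first, of course: a test that calls addPlayer(32, ...) and expects the exception, written before the guard exists.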

If, however, you chose not to use an int, but rather to create your own type with its own constraints... for example, an enumeration called PLAYER with members PITCHER, CATCHER, SHORTSTOP, etc... then a defect elsewhere that attempted to pass in PLAYER.QUARTERBACK would not compile and therefore would never make it into production.  We can think of this as defect prevention even though it isn't really; it's just very early detection.  But that vastly decreases the cost of repair.
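A minimal sketch of that enum approach (Java; the enum name PLAYER follows the article's example, and the member list is the standard nine positions):

```java
import java.util.ArrayList;
import java.util.List;

// A type with its own constraints: invalid positions cannot even be
// expressed, so the bad call fails at compile time, not at run time.
enum PLAYER {
    PITCHER, CATCHER, FIRST_BASEMAN, SECOND_BASEMAN, THIRD_BASEMAN,
    SHORTSTOP, LEFT_FIELDER, CENTER_FIELDER, RIGHT_FIELDER
}

class TypedRoster {
    private final List<String> entries = new ArrayList<>();

    void addPlayer(PLAYER position, int jerseyNumber) {
        // No range check needed: every PLAYER value is a legal position.
        entries.add(position + ":" + jerseyNumber);
    }

    int size() { return entries.size(); }
}

// TypedRoster r = new TypedRoster();
// r.addPlayer(PLAYER.QUARTERBACK, 12);  // does not compile: no such member
```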

Cost of Delays


The earlier you find the bug, the cheaper it is to fix.  First of all, the issue is fresher in your mind, and thus you don't have to recapitulate the thought process that got you there.  It's also less likely that you'll have more than one bug to deal with at a time (late detection often means that other bugs have arisen during the delay, sometimes bugs that interact with each other), which means you can focus.  Also, if you're in a very short cycle then the defect is something you just did, which makes it more obvious.

The worst time to find out a defect exists, therefore, is the latest time: when the system is operating, either in the QA department's testing process or, especially, when actually in use by a customer.  When QA finds the bug, the find is delayed.  When a customer finds the defect it's further delayed, but it also means:
  1. The customer's business has suffered
  2. The product's reputation is tarnished
  3. Your organization's reputation is tarnished
  4. It is personally embarrassing to you
  5. And, as we said, the cost to fix will be much higher
In a perfect world this would never happen, of course, but the world is complex and we are prone to errors.

TDD and Time


In TDD we add another point in time when we can discover an error: test time.  Not QA's testing, but developer test time: tests we run ourselves, creating our own non-delayed moment of run time.  Tests execute the system, so they have the same "experience" as QA or a customer, but since we run them very frequently they represent a faster and more granular defect indication.

You would prefer to prevent all defects from making it into runtime, of course.  But you cannot.  So a rule in TDD is this: any defect that cannot be prevented from getting into production must have a specification associated with it, and thus a test that will fail if the spec is not followed.

Since we write the tests as part of the code-writing process, and if we adhere perfectly to the TDD rule that says "code is never put into the source without a failing test that requires it"... and if we see the test fail until the code is added that then makes it pass... then we should never have code that is not covered (and meaningfully so [2]) by tests.  But here we're going to make mistakes too.  Our good intentions will fall afoul of the forces they always do: fatigue, misunderstandings, things we forget, bad days and interruptions, the fat-fingered gods of chaos.

With TDD as your process, certainly far fewer defects will make it into the product, but it will still happen from time to time.  What that means, however, will be different.

TDD and Runtime Defects


Traditionally, a bug report from outside the team is placed into a tracking system and addressed in order of priority, or severity, or in the order the reports were entered: something along those lines.  But traditionally, "addressed" means "fixed."  This is not so in TDD.

In TDD a bug reported from production is not really a bug... yet.  Because if all of our tests are passing and if our tests are the specification of the system, this means the code is performing as specified.  There is no bug.  But it is not doing what the customer wants so it is the specification that must be wrong: we have a missing test.

Therefore fixing the problem is not job #1; adding the missing test is.  In fact, we want the defect in place so that when we 1) figure out what the missing test was and 2) add it to the suite we can 3) run it and see it fail.  Then and only then we fix the bug and watch the new test go green, completely proving the connection between the test and the code, and also proving that the defect in question can never make it into production again. 
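As a sketch, the missing test might look something like this (plain Java rather than any particular test framework; TeamRoster and the duplicate-add defect are hypothetical stand-ins for whatever the production bug was):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example: production reported that a player sometimes appears
// twice in the roster.  Shown here is the fixed version; the defective
// version added the jersey twice.
class TeamRoster {
    private final List<Integer> jerseys = new ArrayList<>();

    void addPlayer(int jersey) { jerseys.add(jersey); }

    int count(int jersey) {
        return Collections.frequency(jerseys, jersey);
    }
}

class MissingSpecTest {
    // Written first: fails against the defective code, passes after the fix,
    // and stays in the suite so the defect can never return unnoticed.
    static void playerAddedOnceAppearsOnce() {
        TeamRoster roster = new TeamRoster();
        roster.addPlayer(23);
        if (roster.count(23) != 1) {
            throw new AssertionError("player should appear exactly once");
        }
    }
}
```

Run it first against the defective code and watch it fail; then fix the bug and watch it go green.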

That's significant.  The effort engaged in traditional bug fixing is transitory; you found it and fixed it for now, but if it gets back in there somehow you'll have to find it and fix it again.   In TDD the effort is focused more on adding the test, and thus it is persistent effort.  You keep it forever.

Special Cases


One question that may be occurring to you is "what about bad behavior that gets into the code that really is not part of the spec and should never be?"  For example, in the case of our baseball-player-accepting method above, what if a developer on the team adds some code that says "if the method gets called with PLAYER.PITCHER and a jersey number of exactly 23, then add the player to the roster twice"?  Let's further stipulate that no customer asked for this; it's simply wrong.

Could I write a test to guard against that?  Sure; the given-when-then is pretty clear:

Given: a pitcher with jersey number 23
            an empty roster

When: the pitcher is passed into method X once

Then: a pitcher with jersey number 23 will appear once in the roster

But I shouldn't.  First of all, the customer did not say anything about this scenario, and we don't create our own specifications.  Second, where would that end?  How many scenarios like that could you potentially dream up?  Combinations and permutations abound. [3]

The real issue for a TDD team in the above example is how did that code get into the system anyway?  There was no failing test that drove it.  In TDD adding code to the system without a failing test is a malicious attack by the development team on their own code.  If that's what you're about then nothing can really stop you.

So the answer to this conundrum is... don't do that.  TDD does not work, as a process, if you don't follow its rules in a disciplined way.  But then again, what process would?

-S-

[1] You might, in fact, have chosen to do this because the rules of baseball told you to:
http://en.wikipedia.org/wiki/Baseball_positions

[2] What is "non-meaningful coverage"?  I refer you to:
http://www.sustainabletdd.com/2011/12/lies-damned-lies-and-code-coverage.html

[3] I am not saying issues never arise with special cases, or that it's wrong to speculate; sometimes we discover possibilities the customer simply didn't think of.  But the right thing to do when this happens is go back to the customer and ask what the desired behavior of the system should be under circumstance X before doing anything at all.  And then write the failing test to specify it.

Monday, January 19, 2015

Welcome Max Guernsey

Max has joined Net Objectives, as some of you may know, as a trainer, coach, and mentor.  We've been friends with Max for a long while, and he has been a contributor to this blog and to the progress of our thinking in general.

So, we're adding him to the official authorship here and when (if ever :)) we get this thing written, he will be a co-author with Amir and me.

I know this has been terribly slow going, but hopefully with another hand at the oars we can pick up the pace.

-Scott-

Tuesday, October 21, 2014

TDD and Asynchronous Behavior: Part 2



In part 1, we discussed the benefits of separating the code that ensures mutual exclusion (in this case, using thread locks) from the code that provides core behavior, using a Synchronization Proxy.  The core behavior can be tested in a straightforward, single-threaded way.  What remains, in terms of TDD and asynchronous behavior, is how to effectively specify/test the Synchronization Proxy itself.
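As a reminder, the shape of a Synchronization Proxy can be sketched like this (Java rather than the project's C#, with hypothetical Target names): the proxy takes the lock and delegates, while RealTarget stays single-threaded and easy to test.

```java
// Sketch of the Synchronization Proxy idea: locking lives in the proxy,
// core behavior lives in RealTarget.
interface Target {
    void setX(int x);
    int getX();
}

class RealTarget implements Target {
    private int x;
    public void setX(int x) { this.x = x; }  // core behavior, no locking
    public int getX() { return x; }
}

class SynchronizingProxy implements Target {
    private final Target delegate;

    SynchronizingProxy(Target delegate) { this.delegate = delegate; }

    // Take the lock, then delegate; nothing else.
    public synchronized void setX(int x) { delegate.setX(x); }
    public synchronized int getX() { return delegate.getX(); }
}
```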

Testing the Synchronization Proxy


You might be saying “the proxy class is so simple, I’m not sure I’d need to drive its behavior from a specification/test.  All it does is take the lock and delegate.”  The level of rigor in your specifications is always a judgment call, so we’ll set aside whether a given proxy behavior needs a test. We’re going to focus on how to write such a test in the case where you wish to. In other words: if you decide not to include it in the specification, we want it to be because you decided not to, not because you didn’t know how.[3]

The given-when-then layout of the specification would be something along these lines:

Given:

    Threads A and B are running
    Thread A is running code T

When:

    Thread B attempts to run code T

Then:

    Thread B will wait until Thread A is done: the accesses will be serial, not parallel.

The key here is the word “until”. What the test needs to drive/specify/ensure is that the timing is right, that Thread B writes *after* Thread A even if Thread A takes a long time. Let’s look at an implementation sequence diagram.



Client A and Client B are inner classes of the test, created just to exercise the proxy, each in its own thread.  If the proxy did not add the synchronization behavior, the writes to Target would be 2, and then 1, because we tell the Target to wait 10 seconds before writing the state for Client A, but only 1 second for Client B.  If the proxy prevents this (proper behavior) then the writes will be 1, and then 2, because Client B couldn’t get the access until Client A was finished.

This is a partial solution, but it raises a few questions.
  1. How does the test get RealTarget to wait these different amounts of time?
  2. How does the test assert the sequence of these writes is 1, 2?
  3. If the RealTarget “waits 10 seconds” won’t the test execution be horribly slow?
The first two questions are answered by replacing RealTarget with a Mock Object[4]. Remember, we are not specifying RealTarget here, we are specifying the proxy’s behavior, therefore RealTarget must be controlled by the test. A mock allows this.

What about the time issue? Well, time is in scope and we certainly are not testing that time works. So we have to control it in the test as well.



Here’s the implementation sequence diagram with the mock object in place of RealTarget, and another object that replaces time.

Time is a simulator, which can be told by the test to “be” at any time we want. MockTarget basically calls on Time and says “let me know when we’ve reached or passed second x”. We use the Observer Pattern [5] to implement this. The first time the mock is called it will ask Time to notify it when 10 seconds have passed. The second time, it will ask for a 1 second notification. We do this with a simple conditional.

Furthermore MockTarget maintains a log of all calls made to it, in order, which the test can ask for to determine the sequence of setX() calls and assert that it is 1, 2 rather than 2, 1.

10 and 1 are not significant numbers, so as you’ll see in the code we made constants “longWait” and “shortWait” to be used in the test. It’s only important that the first thread waits longer than the second, and since time itself is being simulated anyway the “actual” lengths of time are unimportant. We can pretend they are one year and one hundred years if you want. It’s nice to control time. :)
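Under those assumptions, the Time simulator and the mock's call log might be sketched like this (Java rather than the project's C#; all names and signatures are our guesses at the described design, not the downloadable code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Observer interface: Time calls back when the requested moment arrives.
interface TimeObserver {
    void timeReached(int seconds);
}

// Simulated time: the test "moves" the clock; no real waiting occurs.
class Time {
    private final TreeMap<Integer, List<TimeObserver>> waiters = new TreeMap<>();

    void notifyAtOrAfter(int seconds, TimeObserver observer) {
        waiters.computeIfAbsent(seconds, k -> new ArrayList<>()).add(observer);
    }

    void advanceTo(int now) {
        // Fire every observer whose moment has been reached or passed,
        // earliest first.
        while (!waiters.isEmpty() && waiters.firstKey() <= now) {
            Map.Entry<Integer, List<TimeObserver>> due = waiters.pollFirstEntry();
            for (TimeObserver o : due.getValue()) {
                o.timeReached(due.getKey());
            }
        }
    }
}

// The mock logs every setX() call so the test can assert on the order.
class MockTarget {
    final List<Integer> callLog = new ArrayList<>();

    void setX(int x) { callLog.add(x); }
}
```

The test registers the long wait (Client A) and the short wait (Client B) with Time, advances the clock, and asserts that MockTarget's call log reads 1, 2 rather than 2, 1.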

MockTarget, Time, and the ClientA and ClientB objects are all part of the test, and so a good practice is to make them private inner classes of the test. Also the Observer interface and all constants used in this test are similarly part of the test itself. Remember, a test tests everything which is in scope but which the test does not control. The only thing not controlled by the test is the Synchronization Proxy.

We’ve coded all this up in C#. Click here to download the Visual Studio project.


[3] Perhaps later we’ll make our argument about whether you should or not. :)
[4] If you don't know about the Mock Object Pattern, visit this link:
http://www.netobjectives.com/PatternRepository/index.php?title=TheMockObjectPattern
[5] For more details on the Observer Pattern, visit this link:
http://www.netobjectives.com/PatternRepository/index.php?title=TheObserverPattern