Net Objectives

Wednesday, January 11, 2012

Testing Best Practices: Test Categories, Part 1

Successfully adopting and practicing TDD in a sustainable way involves many distinctions, best practices, caveats, and so forth.  One way to make such information accessible is to put it into a categorized context.  The Design Patterns, for instance, are often categorized into behavioral, structural, and creational.[1]  Here we will do a similar thing with the executable specifications (“tests”) we write when doing TDD.

We have identified four categories of unit tests, namely: functional, constant specification, creational, and work-flow.  We’ll take them one at a time.

Functional

The first unit test a developer ever writes is often an assertion against the return of a method call.  This is because systems often operate by taking in parameters and producing some kind of useful result.  For example, we might have a class called InterestCalculator with a method called CalcInterest() that takes some parameters (a value, a rate, a term, and perhaps the month to calculate for) and then returns the proper interest to charge or pay, depending on the application context.

The primary way of creating useful behavior in software is, in fact, in writing such methods.  However, how we test them will depend on the nature of the behavior.  We can, therefore, further sub-divide the ‘Functional’ category into the following types:

1. Static behavior

This is the simplest.  If a method produces a simple, non-variant behavior, then we simply need to pick some parameters at random, call the method, and assert that the result is correct.  For example:

// pseudocode
class Calculator {
    public int add(int x, int y) {
        return x + y;
    }
}

// pseudotest
class CalculatorTest {
    public void testAddBehavior() {
        int anyX = 6;
        int anyY = 5;
        int expectedReturn = 11;

        Calculator testCalculator = new Calculator();
        int actualReturn = testCalculator.add(anyX, anyY);
        assertEqual(expectedReturn, actualReturn);
    }
}  

Adding two numbers always works the same way, so all we need is a single assertion to demonstrate the behavior in order to specify it.  Note that we have named our temporary variables in the test anyX and anyY to make it clear that these particular values (6 and 5, respectively) are not in any way important, that the test is not about these values in particular.  The test is about the addition behavior, as implemented by the add() method.  We simply needed some input parameters in order to get the method to work, and so we picked arbitrary (any) values for our test. [2]

This is important, because we want it to be very easy for someone reading the test to be able to focus on the important, relevant part of the test and not on the “just had to do this” parts.  Here again, thinking of this as a specification leads us to this conclusion.

Static behavior is the same for all values of all parameters passed.  For example, f() here takes a single parameter, while g() takes two. But for all values of these parameters, the behavior is the same and so we pick "any" values to demonstrate this.

2. Singularity

If a behavior is always the same (static) except for one particular condition where it changes, we call this condition a singularity.

The classic example is divide-by-zero.  In division, the behavior is always the same unless the divisor is zero, in which case we need to report an error condition.  Here we’d need two assertions.  The first, like the one for static behavior, picks ‘any’ two numbers where the second is non-zero and shows the division.  The second shows the error report when the second number is zero.
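The two assertions might be sketched as follows.  This is only an illustration under assumptions: the Divider class, its divide() method, and the choice of IllegalArgumentException as the error report are hypothetical names that do not come from the post, and plain checks stand in for a test framework so that the sketch runs on its own.

```java
// Hypothetical class for illustration; not part of the original post.
class Divider {
    // Integer division; a zero divisor is reported as an error.
    static int divide(int dividend, int divisor) {
        if (divisor == 0) {
            throw new IllegalArgumentException("divisor must not be zero");
        }
        return dividend / divisor;
    }
}

class DividerTest {
    // Plain checks are used so the sketch needs no test framework.
    static void testDivisionFailsOnlyForZeroDivisor() {
        int anyDividend = 12;
        int anyNonZeroDivisor = 4;
        final int ZERO = 0;
        int expectedQuotient = 3;

        // Static side: any non-zero divisor gives an ordinary quotient.
        if (Divider.divide(anyDividend, anyNonZeroDivisor) != expectedQuotient) {
            throw new AssertionError("expected ordinary division");
        }

        // Singularity: a zero divisor must be reported as an error.
        try {
            Divider.divide(anyDividend, ZERO);
            throw new AssertionError("division by zero should be reported");
        } catch (IllegalArgumentException expected) {
            // The error report is the specified behavior.
        }
    }

    public static void main(String[] args) {
        testDivisionFailsOnlyForZeroDivisor();
        System.out.println("singularity specified");
    }
}
```

As with anyX and anyY earlier, the names anyDividend and anyNonZeroDivisor signal that those values are arbitrary, while ZERO is clearly the special case.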

It does not, of course, have to be a mathematical issue: it could be a business rule.   Let’s say, for example, that we charge a fee of $10 for shipping unless it is the first day of the month when we ship for free.  We’re trying to encourage sales at the beginning of the month.  Thus, the first day would be the singularity, and we’d write this test [3]:

public void testShippingIsFreeOnTheFirstDayOfTheMonth() {
    ShippingCalc shippingCalc = new ShippingCalc();
    int anyDateOtherThanTheFirst = 5;
    const int FIRST_DAY_IN_MONTH = 1;
    double expectedStandardFee = ShippingCalc.STANDARD_FEE;
    const double FREE = 0.00;

    // Static side: any other day of the month charges the standard fee
    Assert.AreEqual(expectedStandardFee,
      shippingCalc.getFee(anyDateOtherThanTheFirst));
    // Singularity: shipping is free on the first day of the month
    Assert.AreEqual(FREE,
      shippingCalc.getFee(FIRST_DAY_IN_MONTH));
}

Note the use of the term “any” for the dates we don’t care about because they all behave the same, captured in the name anyDateOtherThanTheFirst, and the fact that FIRST_DAY_IN_MONTH is clearly special.

Another example would be choosing a specific behavior for one element in a set.  For example, some function might be legal for only one type of user, and all other types should get an exception:


enum User { REGULAR, ADMIN, GUEST, SENIOR, JUNIOR, PET }


public void testOnlyAdminCanGetCoolStuff() {
    StuffGetter getter = new StuffGetter();
    Stuff stuff;

    User anyNonAdmin = User.REGULAR;
    try {
        stuff = getter.getCoolStuff(anyNonAdmin);
        Assert.Fail("Cool stuff should go to ADMIN only"); 
    } catch (PresumptionException) {}
    stuff = getter.getCoolStuff(User.ADMIN);
    Assert.True(stuff.IsCool());
}

Two examples: f(), with its single parameter, provides the same behavior for all values but one, the point indicated.  With the two parameters g() takes, the singularity may involve them both, creating a point, or it may pertain to only one, creating a line.  For instance, if x is "altitude" and y is "temperature", then a point might indicate "the same behavior for all values except 3000 feet and 121 degrees".  The line might indicate "the same behavior for all values except 2000 feet at any temperature".

3. Behavior with a boundary

Sometimes the behavior of a method is not always uniform, but changes based on the specific parameters it is passed.  For example, let’s say that we have a method that applies a bonus for a salesperson, but the bonus is only granted if the sale is above a certain minimum value; otherwise it is zero.  Further, the customer tells us that pennies don’t count: the sale must be an entire dollar over the minimum sales value.

In this case there exists a special sales amount, which affects the behavior of the applyBonus() function.  We need to specify this boundary -- the place where the behavior changes -- and since every boundary has two sides, we need to explicitly specify these values and relate them:

class SalesApplicationTest {
  public void testBonusOnlyAppliesAboveMinimumSalesForBonus() {
    // The two sides of the boundary, one penny apart
    double maxNotEligibleAmount =
      SalesApplication.BONUS_THRESHOLD + .99;
    double minEligibleAmount =
      SalesApplication.BONUS_THRESHOLD + 1.00;
    double expectedBonus = minEligibleAmount *
      SalesApplication.CURRENT_BONUS;
    SalesApplication testSalesApp = new SalesApplication();

    AssertEqual(0.00,
      testSalesApp.applyBonus(maxNotEligibleAmount));
    AssertEqual(expectedBonus,
      testSalesApp.applyBonus(minEligibleAmount));
  }
}

This specifies, to the reader, that the point of change between no bonus and the bonus being applied is at the BONUS_THRESHOLD value, and also (per the customer) that the sale must be a full dollar above the minimum before the bonus will be granted.  This is called the epsilon, the atom of the change, and you’ll note that we are clearly demonstrating it as one penny, the penny that takes us from 99 cents over the minimum to 1 full dollar over it.

One might be tempted to assert against other values, like 200 dollars over the minimum or 32 cents above it, to loop through all possible values above and below the transition point, or to pick “any” value above and “any” value below.  The point is that 99 cents and 1 dollar over the minimum are significant amounts: they matter to the customer, and so we need to specify exactly these values.

We also want our tests to run fast, and so looping through all possible values is not only unnecessary, it is counter-productive.


Two points define the boundary where behavior changes, and we also demonstrate the epsilon (or atom) of change.


4. Behavior within a range

There can be, of course, multiple boundaries that change behavior.  If these boundaries are independent of each other, then we call this a range.

For example, let us say that the acceptable temperature of an engine manifold must be between 32.0 and 212.0 degrees Fahrenheit (too cold and the engine freezes, too hot and it overheats).  These limits are not related to each other (we could install anti-freeze to make lower temperatures acceptable while the upper limit might not change, or vice versa using coolant), and so each would be specified with two asserts, one at the boundary and one just beyond it.

But let’s not forget the epsilon!  How much is “over” or “under”?  One degree?  One tenth of a degree?  Ten degrees?  How sensitive should this system be?  Here again, this is a problem domain specification, and thus we have to know what the customer wants before we can create the test.
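A range specification might be sketched like this.  Everything here is an assumption made for illustration: the ManifoldMonitor class, the isAcceptable() method, the decision that the limits themselves are acceptable, and above all the epsilon of one tenth of a degree, which in a real project would have to come from the customer.

```java
// Hypothetical class for illustration; not part of the original post.
class ManifoldMonitor {
    static final double LOWER_LIMIT = 32.0;   // degrees Fahrenheit
    static final double UPPER_LIMIT = 212.0;  // degrees Fahrenheit
    static final double EPSILON = 0.1;        // assumed sensitivity

    // Assumption: the limits themselves count as acceptable.
    static boolean isAcceptable(double temperature) {
        return temperature >= LOWER_LIMIT && temperature <= UPPER_LIMIT;
    }
}

class ManifoldMonitorTest {
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Lower boundary: two asserts, one epsilon apart.
    static void testTemperatureMustNotFallBelowLowerLimit() {
        double minAcceptable = ManifoldMonitor.LOWER_LIMIT;
        double maxTooCold = ManifoldMonitor.LOWER_LIMIT - ManifoldMonitor.EPSILON;

        check(ManifoldMonitor.isAcceptable(minAcceptable), "lower limit is acceptable");
        check(!ManifoldMonitor.isAcceptable(maxTooCold), "below lower limit is too cold");
    }

    // Upper boundary: specified separately, since it can change independently.
    static void testTemperatureMustNotRiseAboveUpperLimit() {
        double maxAcceptable = ManifoldMonitor.UPPER_LIMIT;
        double minTooHot = ManifoldMonitor.UPPER_LIMIT + ManifoldMonitor.EPSILON;

        check(ManifoldMonitor.isAcceptable(maxAcceptable), "upper limit is acceptable");
        check(!ManifoldMonitor.isAcceptable(minTooHot), "above upper limit is too hot");
    }

    public static void main(String[] args) {
        testTemperatureMustNotFallBelowLowerLimit();
        testTemperatureMustNotRiseAboveUpperLimit();
        System.out.println("range specified");
    }
}
```

Keeping the two boundaries in separate tests reflects the point that they are unrelated: a change to the lower limit should touch only the first test.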

Also, note that whereas for integers the natural epsilon is 1, for floating-point numbers the epsilon depends on the magnitude of the base number: the larger the number, the larger the epsilon needs to be.  Constants such as Double.Epsilon only indicate the smallest positive value that can be represented, not the smallest discernible difference between values.
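A small Java sketch can make this concrete.  It assumes Java rather than .NET, so Double.MIN_VALUE stands in for Double.Epsilon, and it uses the standard Math.ulp() method, which returns the gap between a double and the next larger representable value.  The class and method names are hypothetical.

```java
// Hypothetical demo class for illustration; not part of the original post.
class FloatingPointEpsilonDemo {
    // True when adding delta to base produces a different double.
    static boolean isDiscernible(double base, double delta) {
        return base + delta != base;
    }

    public static void main(String[] args) {
        double small = 1.0;
        double large = 1.0e9;

        // The spacing between adjacent doubles grows with magnitude.
        System.out.println("ulp near 1.0:   " + Math.ulp(small));
        System.out.println("ulp near 1.0e9: " + Math.ulp(large));

        // The smallest positive double is far too small to change 1.0e9.
        System.out.println(isDiscernible(large, Double.MIN_VALUE));

        // One ulp at that magnitude is the smallest step that does.
        System.out.println(isDiscernible(large, Math.ulp(large)));
    }
}
```

This spacing is only a lower bound on what the machine can distinguish.  The epsilon to specify in a test is still a problem-domain decision, and it cannot be smaller than the spacing at the magnitudes involved.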

Two boundaries, with epsilons for each.  Note the boundaries of a simple range are not related to each other.

[1] In point of fact, we don’t actually completely agree with this method of categorizing the Design Patterns, but it does serve as a reasonable example of categorization in general.

[2] There are other ways to do this.  In another blog we will discuss the use of an “Any” class to make these “I don’t care” values even more obvious.

Continued in Test Categories, Part 2






14 comments:

  1. What's especially hard is to convince anyone to do that kind of "constant specification". The main arguments raised are that "this is an academic discussion" and "I've never seen this part go wrong in a way that would make such a test fail". Also, some people tend to say that this is as much overspecification as stating in the documentation: "we're supporting installation on hard drives and the letters can be C, D, E, F, G...". It would be really helpful if you gave an example from your experience showing that this is not overspecification and that, putting ideas and concepts aside, it really paid off.

    Thanks for the podcast! I really enjoyed listening to it!

  2. Hi astral,

    we will address the fine details of 'constant specification' in (probably) the third installment of this blog thread. Nonetheless, I'll quickly answer your question here.

    The benefits of specifying constants include:
    1. The specification is cohesive: all the information that makes up the specification is in one place, and there's no need to look in the code.
    2. It answers the question - "where in the code is this specified, and what is the value?"
    3. It ensures that we have a discipline. You just do it.
    4. It allows us to safely evolve from constants to config files to database info.
    5. It ensures that the constant, rather than its value, is used in the code.

    Note that the definition of "constant" includes enumerations and types (which are (or should be)) specialized instances of types.

    This reply and the comment will be removed from this location once the relevant blog is published, thanks!

  3. Thanks, Amir, this sums up to quite a few advantages! I'm wondering about point 4 whether I understand what the real safety is, since moving constants to configuration or database cuts us from the actual values (AFAIK we want to be decoupled from the actual database or config file or any other storage, so the specs are going to be probably rewritten). Maybe it's that it holds me off from jumping into the wild and making the migration to more complex mechanism on a "There, this should work" basis. So in such case, the specs are going to fail and make me really think that I need to specify this new behavior instead of thinking "Well, this never had tests, no point in writing them now"?

    This makes me look forward to your future blogs even more. It's really fun drawing from your experience, since for us beginners, TDD is usually like reading a cooking recipe with steps like "add few spoons of sugar" and we constantly get stuck on what the 'few' really means :-).

    Replies
    1. A quick example:

      Assume that in your code you have:

      Assert.AreEqual(16, Ride.MinimumAge);

      This tells us that the minimum age for the Ride is 16.
      It is provided by a constant called MinimumAge.
      The constant is owned by the Ride class.

      Now, let's assume that we want the minimum age to be configurable through a config file or a database so that the values can be seen elsewhere.

      As far as anyone is concerned, the Ride is still responsible for this value. All we need to do is to make it a property (in .Net) or a getter, thus:

      Assert.AreEqual(16, Ride.MinimumAge());

      Now, whichever way MinimumAge() is implemented:
      - returning a constant
      - parsing and reading a config file
      - running an sql query
      it will still require that the value retrieved is 16.

      Later on, if we change it to:
      Assert.AreEqual(15, Ride.MinimumAge());

      we can rest assured that whichever way we've implemented it, it will be changed in the right place.

    2. How about decoupling from the database or config file in such case? Are you rather for or against using files and databases in your tests? Because if we're decoupled, we've nowhere to get this value of 15 from.

      I'm sure you're going to mention this in your future blog post, so If you don't want to end up publishing whole post as an answer to my question :-), I can just wait for the answer to show up together with a post on the blog.

      Thanks!

    3. For the purpose of THIS discussion, it really doesn't matter. The access to the database, config file or registry is encapsulated by the access method. It is the access method's responsibility to provide the right value.

      Naturally, it would be good design to further hide these details by decoupling the storage details from the access method, using a facade, for example.

    4. Ok, I think I get it now. So, in such a case, the "constant test" would be helpful in migrating, let's say, to a database, as a temporary step: first, I migrate the mechanism and even put the database temporarily on my machine just to make sure that the mechanism works (in other words, I'm temporarily creating something that some people would call 'an integration test'), and when I have ensured it does, THEN I can do further decoupling as I described.

      Is my understanding correct?

    5. One more thing that I think I did not understand fully from your podcast until now - the systems I'm usually looking at are already built, with lots of constants already defined, however, when I'm creating a system or a new component, I don't really know what's going to end up as a configuration item and it really makes sense to delay this decision. So, while developing the system, I may put many constants that will evolve into configuration as new requirements emerge in the process of development.

  4. Oh, by the way, I remember you guys mentioning such thing as "triangulation tests". I hope you include in your book a section or chapter on how to use and evolve them.

  5. Hi Amir & Scott,

    Good stuff as always. A couple thoughts and a question.

    - The more I push toward "single assert per test," the more I find the tests easier to follow. AAA also becomes an easy standard to apply, and many tests can shoot for the ideal goal of 3 lines or fewer (particularly if you externalize constant declarations). I don't always have a single assert, but I do prefer to split declared case (i.e. what the test name says) and opposing case into two tests.

    - I too have failed to see the value in the categories for design patterns--I can't ever remember actively thinking about whether a pattern was structural or not, for example (and I've written about them all).

    With that in mind, perhaps you could expand on your intro discussion and offer a bit more thought about how you've found it valuable (other than making information accessible) to categorize unit tests in a similar fashion.

    Regards,
    Jeff

  6. >>> With that in mind, perhaps you could expand on your intro discussion and offer a bit more thought about how you've found it valuable (other than making information accessible) to categorize unit tests in a similar fashion

    - Valid question, and I'm happy to answer it.
    If you follow good design principles, one of the qualities you pay attention to is COHESION.

    Cohesion is about single purposeness. We achieve cohesion by techniques like Programming by Intention, and Separation of Use from Construction. The single purpose is usually indicated by the fact that the operations which the purpose comprises are operating on the same state. As long as the state of an entity (e.g., a class) remains encapsulated you can continue to strip code away (e.g., by moving methods to another class).

    That said, good design models the problem domain, and captures its behavior. Therefore, the single purposeness that we discussed above is not a pure code construct, but exists in the problem domain. Not surprisingly, it also exists in the tests that help drive these cohesive software entities.

    Therefore, the categories that we have used categorize the atomic, cohesive behaviors modeled in the code. This is unlike the high level categorization of design patterns.

    We have found that this basic categorization, and the rules on how to "test" them are very useful in creating clarity and guiding the analysis process which is "test first".

  7. Hi guys, I think there might be a typo in this code snippet. It looks like the try statement never has a "}":

    public void testOnlyAdminCanGetCoolStuff() {
    StuffGetter getter = new StuffGetter();
    Stuff stuff;

    int anyNonAdmin = Users.REGULAR;
    try {
    stuff = getter.getCoolStuff(anyNonAdmin);
    Assert.Fail("Cool stuff should go to ADMIN only");

    catch (PresumptionException) {}
    stuff = getter.getCoolStuff(User.ADMIN);
    Assert.True(stuff.IsCool());
    }
