Net Objectives

Thursday, August 3, 2017

TDD: Testing Behavior in Abstract Classes



Interfaces vs. Abstract Classes

In languages like Java and C#, developers can use either an interface or an abstract class to create object polymorphism.  It’s a common question in technical training: “Is it best to use an interface, or an abstract class?”

Furthermore, many teams adopt the “I” naming convention for interfaces; namely, that an interface’s name should start with a capital I, whereas other classes (including abstract classes) should not.  The problem with this convention is that it creates design coupling.  Client objects that hold references to a service must be changed whenever a simple, concrete service class evolves into an abstraction, if an interface is used to model it.  Should client objects care whether a service is a concrete class, an abstract class, or an interface? No.  This alone would seem to argue against the naming convention in the first place.

But the real problem stems from the fact that the “interface” type is commonly used for two very different purposes: to create polymorphism, and to mark a class as a valid participant in a framework process.  For example, a class can implement “ISerializable”, not for casting purposes per se, but so that it can be serialized by .Net or a similar framework.  This may be a tangential issue to the class’ core responsibility.  On the other hand, 10 different versions of a tax calculation algorithm, implemented by 10 different tax calculation classes, can all implement “ITaxCalc” so that they can be upcast and dealt with in the same way by various client classes.  This creates polymorphism around the central responsibility of all the classes involved: calculating taxes.  If we had started with a single algorithm in a concrete class called TaxCalc, referred to across the system by that name, then when the system evolves to support different algorithms and the type becomes an interface, the type name would change (if the “I” convention is used) and all client code would have to be maintained.
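To make the coupling concrete, here is a minimal sketch in C# (the client class and the CalculateTax() method are invented for illustration; they are not from the original example):

```csharp
// The original concrete class, referred to by name across the system.
public class TaxCalc
{
    public decimal CalculateTax(decimal subtotal)
    {
        return subtotal * 0.05m;   // hypothetical single algorithm
    }
}

// A typical client, written against the concrete type by name.
public class InvoiceGenerator
{
    private TaxCalc myCalc;   // direct reference to the type "TaxCalc"

    public InvoiceGenerator(TaxCalc aCalc)
    {
        myCalc = aCalc;
    }

    public decimal TotalWithTax(decimal subtotal)
    {
        return subtotal + myCalc.CalculateTax(subtotal);
    }
}

// Later, TaxCalc becomes an abstraction. Under the "I" convention it
// must be renamed ITaxCalc, so the field and constructor above (and
// every declaration like them across the system) must be edited, even
// though no client behavior has changed.
```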

Different Purposes, Different Approaches

It seems like a bad idea to use one idiom for two unrelated purposes.
Personally, I prefer to create polymorphism using abstract classes, and to mark a class for participation in a framework process using interfaces.

Part of my argument is this:  when many different classes have a conceptual relationship, such as the tax calculators mentioned above, it is likely they will also contain some code in their implementations that is the same.  This yields redundancy that creates maintenance problems when requirements change due to tax laws and regulations (for example).  An abstract class can implement common functionality, whereas an interface cannot.  Even if a set of related classes contains no redundant implementation today, redundancies can emerge over time.  Abstract classes make this problem easy to solve whenever it arises.
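Continuing the tax example as a sketch (the method names and rounding rule are my own, for illustration):

```csharp
using System;

// The abstraction implements the common functionality once.
public abstract class TaxCalc
{
    // Each version supplies its own algorithm.
    public abstract decimal CalculateTax(decimal subtotal);

    // Shared behavior lives in one place; if the rounding rule
    // changes, it changes here only.
    protected decimal RoundToCents(decimal amount)
    {
        return Math.Round(amount, 2, MidpointRounding.AwayFromZero);
    }
}

// One of many derived algorithms, reusing the common behavior.
public class StateTaxCalc : TaxCalc
{
    public override decimal CalculateTax(decimal subtotal)
    {
        return RoundToCents(subtotal * 0.06m);
    }
}
```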

Also, if I limit the use of interfaces to process flags, then the “I” convention is less of an issue.  I do not create design coupling within my system if I use it because, for example, “IComparable” is not my interface; it allows instances of a class to be sorted by the framework.  It belongs to that framework and is highly unlikely to be changed, due to the chaos this would create in everyone’s code if it were.  In any case, I don’t control its name.
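For instance, a minimal sketch (the Employee class is invented for illustration):

```csharp
using System;
using System.Collections.Generic;

public class Employee : IComparable<Employee>
{
    public string Name { get; set; }
    public int Seniority { get; set; }

    // Implemented only so the framework can sort Employees;
    // this is a process flag, not domain polymorphism.
    public int CompareTo(Employee other)
    {
        if (other == null) return 1;
        return Seniority.CompareTo(other.Seniority);
    }
}

// Usage: List<T>.Sort() discovers CompareTo() through the interface.
// var staff = new List<Employee>();
// staff.Sort();
```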

TDD and Common Behaviors

If an abstract class is used to create polymorphism, and if there is indeed some common functionality in the base class, then the question arises: how do I test that behavior?  One cannot instantiate an abstract class, and thus its behavior cannot be triggered by a test unless that behavior is in a static method.  Static methods are disfavored for a number of reasons (I’ll deal with those in another blog), and I certainly would not make the behavior static just for testing purposes.  So what should a TDD practitioner, or a traditional tester, do about testing instance behavior that is implemented in an abstract class?

Here is a completely generic example:
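Something like the following, sketched in C# (the structure is reconstructed from the names used below):

```csharp
public abstract class Service
{
    // Implemented differently by each concrete subclass.
    public abstract void VaryingFunction();

    // Common behavior, implemented once in the abstract base.
    protected void CommonFunction()
    {
        // ...shared logic available to all subclasses...
    }
}

public class ConcreteService1 : Service
{
    public override void VaryingFunction() { /* version 1 */ }
}

public class ConcreteService2 : Service
{
    public override void VaryingFunction() { /* version 2 */ }
}
```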

Each “ConcreteService” version would have its own test for its implementation of the “VaryingFunction()” method.  But how would one write a test for the “CommonFunction()” method, given that it is an instance method and one cannot create an instance of the abstract class?

Initially you might say, “well, just pick any of the subclasses, create an instance of it, and test the common function there, since they all have access to it.”  The problem is that this couples the test to the concrete service class that you arbitrarily chose.  If you happened to pick “ConcreteService1”, for instance, and later that class were retired due to changing requirements, then the test of the common function would break even though that function is working fine.  Similarly, if “ConcreteService1” were at some point changed to override the “CommonFunction()” method, this would also break the test.  We want tests to fail only for the reasons they were written.

Another Use for a Mock Object

Mock objects[1] are used to break and control dependencies in testing.  Here we can use a mock to eliminate coupling from the test of the common function to any of the concrete production classes.

This mock, like any subclass, has access through inheritance to the common function, but unlike other subclasses the mock:
  1. Is not part of the production code, but actually part of the test namespace/package/etc…
  2. Is never eliminated due to a changing requirement.  It is really part of the test.
  3. Is not a public class.  It is only visible to the tests.
Another advantage of this approach is that it makes it easier to test base-class behavior that is not exposed to the system in general (not public).

This is a pattern, the “Testing Class Adapter”[2].  It works because the test holds the mock service by its concrete type, not in an upcast, and thus a new accessor method, which is public, can be called to reach the protected method in the base class, as sketched below.  Again, this mock is not part of production, and thus does not break encapsulation in general, only for testing.
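Continuing the generic Service example, a minimal sketch (the test framework attribute and assertion are placeholders; any xUnit-style framework would do):

```csharp
// Lives in the test namespace/package, not in production code.
internal class MockService : Service
{
    // The abstract method must be overridden, but its body is
    // irrelevant to testing the common function.
    public override void VaryingFunction() { }

    // The Testing Class Adapter: a public accessor that exposes the
    // protected base-class behavior to the test.
    public void CallCommonFunction()
    {
        CommonFunction();
    }
}

// The test holds the mock by its concrete type, never upcast to Service:
//
// [Test]
// public void CommonFunctionBehavesAsSpecified()
// {
//     var mock = new MockService();
//     mock.CallCommonFunction();
//     // ...assert on the observable results of the common behavior...
// }
```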

Summary

I prefer to use abstract classes to create polymorphism, and interfaces to flag classes as participants in framework services.  When you do this, you can easily eliminate functional redundancies in derived classes by pushing them up into the base class.  To test this otherwise-redundant functionality, use a mock object/testing class adapter to access it.
---
[1] For more on Mock Objects see:

[2] For more on the Adapter Pattern see:

8 comments:

  1. Hi, thanks for the post. A quick question: doesn't using abstract classes for polymorphism rule out interface segregation in languages such as Java and C#? For example, we have some kind of cache, and some pieces of code write to the cache while others read from it. With interfaces, I could create interfaces like DataSource and DataDestination and make my cache implement both of them. This would give me the benefit that if I were, e.g., to synchronize all writes to the cache, I could create a synchronizing proxy implementing only the DataDestination interface. Would you use interfaces in such a case, or solve it somehow using abstract classes?

    By the way, interfaces can (to a degree) implement common functionality. In the case of Java, by means of default methods; in the case of C#, by using extension methods (I know extension methods are just static methods under the hood, but the only difference between them and real "methods inside interfaces" that I can think of is that extension methods cannot be overridden, which may not even be required if we are talking about common logic).

    Replies
    1. I do believe in interface segregation, but I prefer a different method to achieve it.

      If a class has an interface that supports multiple operations (like the read/write capability you're referring to), this is a "capability" interface. If different clients need a different view of this capability, I prefer to use adapters to provide it. These are "needs" interfaces, and should be kept separate.

      My favorite example is a bank service for transferring funds. The capabilities interface would allow transferring funds from any account to any other account, and any amount. The adapter for the ATM would use that interface, but only allow funds to be transferred from a given user's account. Another adapter, for the bank teller application, would allow any account to any account, but only under a limited cap amount, etc...

    2. The example you gave about a bank service was one where a capability interface is shaped differently from the need interface, which warrants an adapter. I find it a bit different from the case of a "read interface" and a "write interface" for the cache, as both of these interfaces are merely subsets of the set of methods that the cache has.

      Of course, one could argue that this is merely a special case where the only benefit from using a need interface is to provide a subset of functionality. We could further say that seeing "programming language interface" always as a need interface and its implementation as a capability interface leads us to a conclusion that "programming languages interfaces" are nothing more than a language-provided mechanism for generating pass-through adapters. However, I don't find that way of looking at "programming language interfaces" particularly useful. I find the need-capability distinction useful when there is more than one client trying to use the same capability (which is what your example of a bank service is about if I am not mistaken). When I think about a class that fulfills two separate needs (in other words, all clients' needs are merely subsets of its capabilities), I'd rather think that it plays two roles and I use interfaces to model these roles. That still leaves me the freedom of introducing adapters later if I want to.

    3. This comment has been removed by the author.

    4. On the more practical side, abstract classes are a lot like interfaces and could be treated as completely interchangeable if:

      - no class would need to implement more than one interface
      - I would not need to put any implementation into the abstract class

      The rest is discipline. Still, I find this "discipline part" quite important. I find that using abstract classes for polymorphism and putting common logic there often leads to the following things happening in the code:

      1) By moving common code to an abstract class, we may also move a troublesome dependency there. The clients of the abstract class will not see this dependency, but will become transitively coupled to it. This can lead to library dependency issues if the dependency is a 3rd-party library, or to circular dependencies between packages/modules/assemblies. It can also hurt mobility: when we try to extract the clients of the abstract class into another library, we have to take the abstract class with us, which in turn requires taking along the dependencies introduced by the "common logic".

      2) Implementation inheritance tends to create implicit two-way coupling between a superclass and a subclass. Sure, this is coupling through abstraction, as an abstract class does not know its implementers, but still. My English is not perfect, so I'd better try explaining this with an example. When I have an abstract class A and its subclass B, then both of these are true: a) A can call methods on B (the "template method" approach) and b) B can call methods from A (the "common util" approach). If I were to translate this into composition instead of inheritance, I would get a class A that has a reference to B (or rather "IB", as this is not a direct dependency) and a B that has a reference to A. Now, this doesn't look too bad yet. After all, I sometimes introduce such composition constructs myself (and some patterns demand it, e.g. Visitor or State). There are, however, two things that I have many times seen lead to a mess: a) the assumption that each time I have some common logic, the right way to pull it is upwards in the hierarchy, and b) the ability to have more than two levels of inheritance. I have seen hierarchies that look like this: A <|- B <|- C, where A called a method from B that called a method from A that called a method from C that called a method from B. This is because of the assumption about the upwards direction of pulling "common logic". Translating this to composition, it would be similar to a situation where we have A<->B<->C (or more precisely, A->IB<|-B->IC<|-C + C->B->A) and somehow assume that when we have common logic, we move it towards class A. This has in turn many times led to a complete displacement of responsibilities, and when I wanted to extract any meaningful logic from these classes, e.g. to reuse it somewhere else, I would find myself jumping up and down the hierarchy, collecting bits of code and resolving circular calls.

      Sorry for the bad narration and ugly ASCII art; anyway, these are the two reasons why I prefer shallow hierarchies and (almost) strict interface inheritance. In fact, if I try to remember when I last wrote an abstract class in anything other than legacy code, it was probably several years ago :-).

  2. Another question. You wrote: "But the real problem stems from the fact that the “interface” type is commonly used for two very different purposes: to create polymorphism and to mark a class as a valid participant in a framework process". The question is: how is that not the case for abstract classes? I think I've seen at least several examples of frameworks where your class needs to inherit from an "abstract web controller" or "abstract window", making this class automatically discoverable by the framework's reflection-based scanning mechanism. Would you call something like this participation in a framework process, or do you consider it a different beast than ISerializable?

    Replies
    1. I have seen these too, and I think it is a poor way to create a framework. When you use inheritance for this, you eliminate it for any other purpose (in single-root languages like Java and C#). It also means that existing behavior might not be able to be incorporated into a new framework.

      I prefer the "pluggable adapter" approach, where the framework provides an interface that the application developer can write an adapter to. Thus, any behavior can be incorporated into a framework, even if it was created before the framework existed.

      This also follows the general "Gang of Four" advice of preferring composition over inheritance.

    2. My preference for dealing with this kind of "smell" is similar, and I agree that this is rather poor framework design. My point was that I have a feeling abstract classes tend to suffer from the same ambiguity of usage, so if I were to rule out interfaces on this basis, I would have to reject abstract classes for the same reason.

      Looking at it more widely, I can hardly find a programming language concept that has one strict usage. Even classes themselves have different usages. For example, when creating a fluent interface/internal DSL, classes are sometimes used solely for the purpose of "enforcing grammar" (e.g. we may have a method called When() in our internal DSL that returns a different object only to stop us from writing When().When().When().When(), which in our "DSL grammar" would be an invalid sentence).
