DaedTech

Stories about Software

Testable Code is Better Code

It seems pretty well accepted these days that unit testing is preferable to not unit testing. Logically, this implies that most people believe a tested code base is better than a non-tested code base. Further, by the nature of testing, a tested code base is likely to have fewer bugs than a non-tested code base. But I’d like to go a step further and make the case that, even given the same number of bugs and setting aside the judgment as to whether it is better to test or not, unit-tested code is generally better code, in terms of design and maintainability, than non-unit-tested code.

More succinctly, I believe that unit testing one’s code results in not just fewer bugs but in better code. I’ll go through some of the reasons that I believe that, and none of those reasons are “you work out more bugs when you test.”

It Forces Reasoning About Code

Let’s say that I start writing a class and I get as far as the following:

public class Customer
{
    public int Id { get; set; }

    public string LastName { get; set; }

    public string FirstName { get; set; }
}

Pretty simple, but there are a variety of things right off the bat that can be tested. Can you think of any? If you don’t write a lot of tests, maybe not. But what you’ve got here is already a testing gold mine, and you have the opportunity to get off to a good start. What does Id initialize to? What do you want it to initialize to? How about first and last name? Already, you have at least three tests that you can write, and, if you favor TDD and don’t want nulls or zeroes, you can start with failing tests and make them pass.
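
To make that concrete, here is what those first few tests might look like (a minimal sketch, assuming MSTest; treating empty string as the desired default is just one choice for illustration):

using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class CustomerTest
{
    // Zero happens to be what an auto-property int gives you anyway,
    // so this one documents the default and passes immediately.
    [TestMethod]
    public void Id_Initializes_To_Zero()
    {
        var customer = new Customer();
        Assert.AreEqual(0, customer.Id);
    }

    // Against the class as written, these two fail (auto-property strings
    // default to null), which is exactly the TDD starting point described above.
    [TestMethod]
    public void FirstName_Initializes_To_Empty_String()
    {
        var customer = new Customer();
        Assert.AreEqual(string.Empty, customer.FirstName);
    }

    [TestMethod]
    public void LastName_Initializes_To_Empty_String()
    {
        var customer = new Customer();
        Assert.AreEqual(string.Empty, customer.LastName);
    }
}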

It Teaches Developers about the Language

A related point is that writing unit tests tends to foster an understanding of how the language, libraries, and frameworks in play work. Consider our previous example. A developer may go through his programming life in C# not knowing what strings initialize to by default. This isn’t particularly far-fetched. Let’s say that he develops for a company with a coding standard of always initializing strings explicitly. Why would he ever know what strings are by default?

If, on the other hand, he’s in the practice of immediately writing unit tests on classes and then getting them to pass, he’ll see and be exposed to the failing condition. The unit test result will say something like “Expected: String.Empty, Was: null”.

And that just covers our trivial example. The unit tests provide a very natural forum for answering idle questions like “I wonder how x works…” or “I wonder what would happen if I did y…” If you’re working on a large application where build time is significant and getting to a point in the application where you can verify an experiment is non-trivial, most likely you leave these questions unanswered. It’s too much of a hassle, and the alternative, creating a dummy solution to test it out, may be no less of a hassle. But sticking an extra assert in an existing unit test is easy and fast.

Unit Tests Keep Methods and Classes Succinct and Focused

public void ChangeUiCar(object sender, RoutedEventArgs e)
{
    try
    {
        Mouse.OverrideCursor = Cursors.Wait;
        MenuItem source = e.OriginalSource as MenuItem;

        if (source == null) { return; }

        ListBox ancestor = source.Tag as ListBox;

        if (ancestor == null) return;

        CarType newType = (CarType)Enum.Parse(typeof(CarType), ancestor.Tag.ToString());

        var myOldCar = UIGlobals.Instance.GetCurrentCar();
        var myNewCar = UIGlobals.Instance.GetNewCar(newType);

        if (myNewCar.Manufacturer == "Toyota" || myNewCar.Manufacturer == "Hyundai" || myNewCar.Manufacturer == "Fiat")
        {
            myNewCar.IsForeign = true;
        }
        else if (myNewCar.Manufacturer == "Ford" ||
            (myNewCar.Manufacturer == "Jeep" && myNewCar.WasMadeInAmerica)
            || (myNewCar.Manufacturer == "Chrysler" && myNewCar.IsOld))
        {
            myNewCar.IsForeign = false;
        }

        try
        {
            UpdateUiDisplay(myNewCar, true, false, 12, "dummy text");
        }
        catch
        {
            RevertUiDisplay(myOldCar, true, false, 0, "dummy text");
        }

        if (myNewCar.HasSunRoof || myNewCar.HasMoonRoof || myNewCar.HasLeatherSeating || myNewCar.HasGps ||
            myNewCar.HasCustomRims || myNewCar.HasOnBoardComputer)
        {
            bool isLuxury = CarGlobals.Instance.DetermineLuxury(myNewCar);
            if (isLuxury)
            {
                if (myNewCar.IsForeign && myNewCar.IsManualTransmission)
                {
                    myNewCar.DisplaySportsCarImage = true;
                }
                if (myNewCar.IsForeign)
                {
                    myNewCar.DisplayAmericanFlag = false;
                    if (myNewCar.HasSpecialCharacters)
                    {
                        myNewCar.DisplaySpecialCharacters = UIGlobals.Instance.CanDisplayHandleSpecialCharacters();
                        if (myNewCar.DisplaySpecialCharacters)
                        {
                            UpdateSpecialCharacters(myNewCar);
                        }
                    }
                    else
                    {
                        UIGlobals.Instance.SpecialCharsFlag = "Off";
                    }
                }
            }
        }

    }
    finally
    {
        Mouse.OverrideCursor = null;
    }
}

This is an example of a method you would never see in an actively unit-tested code base. What does this method do, exactly? Who knows… probably not you, and most likely not the person or people that ‘wrote’ (cobbled together over time) it. (Full disclosure–I just made this up to illustrate a point.)

We’ve all seen methods like this. Cyclomatic complexity off the charts, calls to global state sprinkled in, mixed concerns, etc. You can look at it without knowing the most common path through the code, the expected path through the code, or even whether or not all paths are reachable. Unit testing is all about finding paths through a method and seeing what is true after (and sometimes during) execution. Good luck here figuring out what should be true. It all depends on what global state returns, and, even if you somehow mock the global state, you still have to reverse engineer what needs to be true to proceed through the method.

If this method had been unit-tested from its initial conception, I contend that it would never look anything like this. The reasoning is simple. Once a series of tests on the method becomes part of the test suite, adding conditionals and one-offs will break those tests. Therefore, the path of least resistance for the new requirements becomes creating a new method or class that can, itself, be tested. Without the tests, the path of least resistance is often handling unique cases inline–a shortsighted practice that leads to the kind of code above.
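
To illustrate where that path of least resistance leads, here is a sketch of the foreign/domestic classification from the method above pulled out into something small and testable. The names and the deliberately simplified rules are mine, for illustration only, and the tests assume MSTest as before:

public static class CarClassifier
{
    // A deliberately simplified version of the manufacturer logic above.
    public static bool IsForeign(Car car)
    {
        return car.Manufacturer == "Toyota"
            || car.Manufacturer == "Hyundai"
            || car.Manufacturer == "Fiat";
    }
}

[TestClass]
public class CarClassifierTest
{
    [TestMethod]
    public void IsForeign_Returns_True_For_Toyota()
    {
        var car = new Car() { Manufacturer = "Toyota" };
        Assert.IsTrue(CarClassifier.IsForeign(car));
    }
}

Each new rule then shows up as a new test against this small method rather than as another conditional buried in a UI event handler.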

Unit Tests Encourage Inversion of Control

In a previous post, I talked about reasoning about code in two ways: (1) command and control and (2) building and assembling. Most people have an easier time with and will come to prefer command and control, left to their own devices. That is, in my main method, I want to create a couple of objects and I want those objects to create their dependencies and those dependencies to create their dependencies and so on. Like the CEO of the company, I want to give a few orders to a few important people and have all of the hierarchical stuff taken care of to conform to my vision. That leads to code like this:

class Engine
{
    Battery _battery = new Battery();
    Alternator _alternator = new Alternator();
    Transmission _transmission = new Transmission();
}

class Car
{
    Engine _engine = new Engine();
    Cabin _cabin = new Cabin();
    List<Tire> _tires = new List<Tire>() { new Tire(), new Tire(), new Tire(), new Tire() };

    Car()
    {

    }

    void Start()
    {
        _engine.Start();
    }
}

So, in command and control style, I just tell my classes that I want a car, and my wish is their command. I don’t worry about what engine I want or what transmission I want or anything. Those details are taken care of for me. But I also don’t have a choice. I have to take what I’m given.

Since my linked post addresses the disadvantages of this approach, I won’t rehash it here. Let’s assume, for argument’s sake, that dependency inversion is preferable. Unit testing pushes you toward dependency inversion.

The reason for that is well illustrated by thinking about testing Car’s “start” method. How would we test this? Well, we wouldn’t. There’s only one line in the method and it references something completely hidden from us. But, if we changed Car and had it receive an engine through its constructor, we could easily create a friendly/mock engine and then make assertions about it after Car’s start method was called. For example, maybe Engine has an “IsStarted” property. Then, if we inject Engine into Car, we have the following simple test:

var myEngine = new Engine();
var myCar = new Car(myEngine);
myCar.Start();

Assert.IsTrue(myEngine.IsStarted);
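
For reference, the constructor-injected Car (and a plausible Engine) that make this test possible might look something like the following. This is a rough sketch based on the example above, not code pulled from anywhere real:

public class Engine
{
    public bool IsStarted { get; private set; }

    public virtual void Start()
    {
        IsStarted = true;
    }
}

public class Car
{
    private readonly Engine _engine;

    // The engine is handed in rather than created here, so a test can
    // supply its own engine and inspect it after Start() is called.
    public Car(Engine engine)
    {
        _engine = engine;
    }

    public void Start()
    {
        _engine.Start();
    }
}

Notice that nothing in Car news up an Engine; it just uses whatever it is given, so a later requirements change can be absorbed by handing Car a different Engine subclass.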

After you spend some time unit testing regularly, you’ll find that you come to look at the new keyword with suspicion that you never did before. As I write code, if I find myself typing it, I think “either this is a data transfer object or else there better be a darned good reason for having this in my class.”

Dependency-inverted code is better code. I can’t say it any plainer. When your code is inverted, it becomes easier to maintain and requirements changes can be absorbed. If Car takes an Engine instead of making one, I can later create an inheritor from Engine when my requirements change and just give that to Car. That’s a code change of one modified line and a new class. If Car creates its own Engine, I have to modify Car any time something about Engine needs to change.

Unit Testing Encourages Use of Interfaces

By their nature, interfaces tend to be easier to mock than concrete classes, even ones with virtual members. While I can’t speak to every mocking framework out there, it does seem to be a rule that the easiest things to mock are interfaces. So when you’re testing your code, you’ll tend to favor interfaces when all other things are equal, since that will make test writing easier.

I believe that this favoring of interfaces is helpful for the quality of code as well. Interfaces promote looser coupling than any other way of maintaining relationships between objects. Depending on an interface instead of a concrete implementation allows decoupling of the “what” from the “how” question when programming. Going back to the engine/car example, if I have a Car class that depends on an Engine, I am tied to the Engine class. It can be sub-classed, but nevertheless, I’m tied to it. If its start method cannot be overridden and throws exceptions, I have to handle them in my Car’s start method.

On the other hand, depending on an engine interface decouples me from the engine implementation. Instead of saying, “alright, specific engine, start yourself and I’ll handle anything that goes wrong,” I’m saying, “alright, nameless engine, start yourself however it is you do that.” I don’t necessarily need to handle exceptions unless the interface contract allows them. That is, if the interface contract stipulates that IEngine’s start method should not throw exceptions, those exceptions become Engine’s responsibility and not mine.
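
Sketched out in the car example, that arrangement might look like this (the IEngine name and its members are assumed for illustration):

public interface IEngine
{
    bool IsStarted { get; }

    // Contract (by convention here): implementations start themselves
    // however they do that and do not let exceptions escape Start().
    void Start();
}

// Car's constructor from the earlier sketch now takes IEngine instead of
// Engine, and any mocking framework (or a hand-rolled fake) can stand in.
public class Car
{
    private readonly IEngine _engine;

    public Car(IEngine engine) { _engine = engine; }

    public void Start() { _engine.Start(); }
}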

Generally speaking, depending on interfaces is very helpful in that it allows you to make changes to existing code bases more easily. You’ll come to favor addressing requirements changes by creating new interface implementations rather than by going through and modifying existing implementations to handle different cases.

Regularly Unit Testing Makes You Proactive Instead of Reactive About the Unexpected

If you spend a few months unit testing religiously, you’ll find that a curious thing starts to happen. You’ll start to look at code differently. You’ll start to look at x.y() and know that, if there is no null check for x prior to that call, an exception will be thrown. You’ll start to look at if(x < 6) and know that you’re interested in seeing what happens when x is 5 and x is 6. You’ll start to look at a method with parameters and reason about how you would handle a null parameter if it were passed in, based on the situation. These are all examples of what I call “proactive,” for lack of a better term. The reactive programmer wouldn’t consider any of these things until they showed up as the cause of a defect.

This doesn’t happen magically. The thing about unit tests that is so powerful here is that the mistakes you make while writing the tests often lead you to these corner cases. Perhaps when writing the test, you pass in “null” as a parameter because you haven’t yet figured out what, exactly, you want to pass in. You forget about that test, move on to other things, and then later run all of your tests. When that one fails, you come back to it and realize that when null is passed into your method, you dereference it and generate an unhandled exception.
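
A typical instance of that feedback loop, sketched with invented names (the Customer class is the one from earlier):

public class CustomerProcessor
{
    // The guard clause that the failing test eventually drives you to write.
    public void Process(Customer customer)
    {
        if (customer == null)
        {
            throw new ArgumentNullException("customer");
        }
        // ... actual processing elided
    }
}

[TestClass]
public class CustomerProcessorTest
{
    // Written with null as a placeholder argument, forgotten, and then
    // rediscovered when the test run turns red.
    [TestMethod]
    [ExpectedException(typeof(ArgumentNullException))]
    public void Process_Rejects_Null()
    {
        new CustomerProcessor().Process(null);
    }
}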

As this goes on over the course of time, you start to recognize code that looks like it would be fragile in the face of accidental invocations or deliberate unit test exercise. The unit tests become more about documenting your requirement and guarding against regression because you find that you start to be able to tell, by sight, when code is brittle.

This is true of unit tests because the feedback loop is so tight and frequent. If you’re writing some class without unit tests, you may never actually use your own class. You write the class according to what someone writing another class is going to pass you. You both check in your code, never looking at what happens when either of you deviates from the expected communication. Then, three months later, someone comes along and uses your class in another context and delivers his code. Another three months after that, a defect report lands on your plate, you fire up your debugger, and figure out that you’re not handling null.

And, while some learning will occur in this context, it will be muted. You’re six months removed from writing that code. So while you learn in principle that null parameters should be handled, you aren’t getting immediate feedback. It’s essentially the difference between having someone slap a dieter’s hand when he reaches for a cookie and having someone weigh him six months later and tell him that he shouldn’t have eaten that cookie six months ago. One is likely to change habits, while the other is likely to result in a sigh and a “yeah, but what are ya gonna do?”

Conclusion

I can probably think of other examples as well, but this post is already fairly long. I sincerely believe that the simple act of writing tests and getting immediate feedback on one’s code makes that person a better programmer more quickly than going without the tests. And, if you have a department where your developers are all writing tests, they’re becoming better designers and programmers and adopting good practices while doing productive work and raising the confidence level in the software that they’re producing.

I really cannot fathom any actual disadvantage to this practice. To me, the “obviously” factor of this is now on par with whether or not wearing a seat belt is a good idea.

Inverting Control

I imagine that inversion of control is a relatively popular concept to talk or blog about, particularly in object-oriented circles, so rather than do a garden-variety explanation of the term followed by a pitch for using it, I thought I’d take a slightly different approach. I’m going to talk about the reason that there is often resistance to the concept of inversion of control, why that resistance is understandable, and why that understandable resistance is misguided. This involves a foray into the history of programming and general concepts of human reasoning.

In the Beginning

Nobody starts out writing object oriented programs, nor do they start out understanding concepts like lambda expressions, function evaluation, etc. People start out, almost invariably, with the equivalent of batch scripts. That is, they learn how to automate small procedures and they build from there. This is a natural and understandable progression in terms of individual learning and also in terms of our journey as a programming collective. The earliest programs were sequences of instructions. They had abbreviated syntax, control structures, and goto statements that automated a task from beginning to end.

An example is something like the following (in pseudo code):

start:
file = "numbers.txt"
if not file exists
touch "numbers.txt"
goto exit
x = ""
open "numbers.txt" > x
exit:

The logic is simple and easy enough to follow. Look for a file called numbers.txt and create it if it doesn’t exist. Otherwise, read it. Now, as you want to add things to this program, it gets longer and probably harder to follow. There may be some notion of looping, handled with a loop construct, or, if sufficiently primitive in terms of time or level of code (i.e. if we’re at the chip level), with goto statements to process the contents of the file.

Procedural Code to the Rescue

As things evolved, the notion of subroutines was introduced to help alleviate complexity and make programs more readable and the concept of procedural or structural programming was born. I believe Dijkstra famously declared that the evolution of this paradigm should make it such that the goto statement was never used again. Structural/procedural programming involves creating abstractions of commonly used routines so that they can be reused and so that the program is more readable and less error prone.

//functions elided
int main(int argc, char* argv[])
{
    char* file;
    file = get_filename();
    if(!file_exists(file))
    {
        create_file(file);
        return 0;
    }
    read_file(file);
    return 0;
}

First off, pardon my C syntax. I did not compile this and haven’t written actual C code in a while. But, you get the idea. Here we have an implementation where details are mercifully abstracted away into functions. We want the name of the file, so we call “get_filename()” and let someone else handle it. We want to know if the file exists, so we abstract that as well. Same goes for creating or reading the file. The main routine is much more legible, and, better yet, other people can also call these methods, so you don’t need to copy and paste code or fix errors in multiple places if there are problems.

Procedural programming is powerful, and it can be used to produce very clean, readable code. Many of those who use it do just that. (Though many also don’t and pride themselves instead on packing the most conceptual C functionality into a single line of hidden pointer arithmetic, ternary operators, and assignments in control structures, but I digress.) And because of its power and long history of success, it imprinted itself very clearly on the minds of people who used it for years and got used to its idioms and nuances.

Let’s think about how a procedural programmer tends to reason about code. That is, there is some main function, and that main function calls a sub-routine/function to handle some of its workload. It delegates to a more abstract function to handle things. Unlike the first example, as procedural code grows, it doesn’t get longer and harder to read. Instead, it grows into more files and libraries. The functions called by main are given their own functions to refer to, and the structure of the program grows like a tree, rather than a beanstalk to the heavens. Main is the root, and it branches out to the eventual leaves which are library functions.

Another way to think of this is command and control. Main is like the White House. It makes all of the big decisions and it delegates to the Cabinet for smaller but still important things. Each Cabinet member has his or her own department and makes all of the big decisions, delegating to underlings the details of smaller picture operations. This continues all the way down the chain of government until someone somewhere is telling someone how much bleach to use when cleaning the DMV floor. The President doesn’t care about this. It’s an inconsequential detail that should be handled without his intervention. Such is the structure of the procedural program as well. It mirrors life in that it mirrors a particular metaphor for accomplishing tasks – the command and control method of delegation.

The reason I go into all of this detail is that I want you to get inside the mind of someone who may be resistant to the concept of inversion of control. If you’re used to procedural programming and the command and control metaphor, then you’re probably nodding along with me. If you’re a dyed-in-the-wool OO programmer who uses Spring framework or some other IOC container, you’re probably getting ready to shout that your code isn’t the US government. That’s fine. We’ll get to that. But for now, think about your procedural-accustomed peer and realize that what you’re suggesting to him or her seems like the equivalent of asking the President of the US to run out to buy bleach for the guy at the DMV to clean the floor. It’s preposterous without the proper framing.

A New Way of Thinking

So, what is the proper framing? Well, after procedural code was well-established, the idea of object-oriented programming came into existence. On its face, this was a weird experiment, and there was no shortage of people that saw this as a passing fad. OOP flew completely in the face of established practice and, frankly, common sense. Instead of having procedures that delegated, we would now have objects that encapsulated properties and functionality. It sounds completely reasonable now, but this was a relatively revolutionary concept.

In the early stages of OOP, people did some things that were silly in retrospect. People got object-happy and in a rush to the latest, greatest thing, created objects for everything. No doubt there were for loop and while loop objects and someone had something like Conditional.If(x == 5).Then().Return(x); On the opposite end of the spectrum, there were some people who had been writing great software with procedural code for 20 years and they weren’t about to stop now, thank-you-very-much. And C++, the most popular early OOP language, put out places at the table for both camps. C++ allowed C programmers to write C and compile it with the C++ compiler, while it allowed OOP fanatics to pursue their weird idioms before eventually settling down into a good rhythm.  The publication of books about patterns and anti-patterns helped OOP fans continue their progress.

As these groups coexisted, the early-adopters blazed a trail and the late-adopters grudgingly adopted when they had to. The problem was in how they went about adopting. To a lot of people in this category, a “class” was roughly equivalent to a procedural library file with a group of associated functions. And a class can certainly serve this purpose, despite the point of OOP being completely missed. Anybody who has seen classes with names like “FileUtils” or “FinancialConversions” knows what I’m talking about. These are the calling cards of procedural programmers ordered to write in an object-oriented language without real introduction to object-oriented concepts.

Mixed Metaphors

So what? Well, the end-game here is that this OOP/procedural hybrid is firmly entrenched in countless legacy applications and even ones being developed today by people who learned procedural thinking and never really had that “Aha!” moment with object-oriented thinking. It isn’t simply that classes in these applications function as repositories for loosely related methods, but that the entire structure of the program follows the command and control metaphor in an object-oriented world.

And, what is the object-oriented world? I personally think a good metaphor for it is Legos. If you’re a kid with a bunch of Lego kits and parts at your disposal and you want a scene of a bunch of pirate ships or space ships doing battle, you build all of the little components for a ship first. Then, with the little components assembled, you snap them together into larger and larger components until you’ve built your first ship. Sometimes, you prototype a component and use it in multiple places. You then repeat this as necessary and assemble the ships into some grand imitation of adventure on the high seas. This is the fundamental nature of object-oriented programming. There is no concept of delegation in the command and control sense — quite the opposite — the details are assembled first into ever-larger pieces. An equally suitable and less childlike metaphor may be the construction of a building.

As a procedural “President,” you would be ill at ease in this world. You stand there, demanding that non-existent ships assemble themselves by having their hulls assemble themselves by having their internal pieces assemble themselves. You’re yelling a lot, but nothing’s happening and there is, actually, nobody or no thing there to yell at.

Of course, the procedural folks aren’t actually that daft, so what they do instead is force the Lego world to be a command and control world. They lay out the ship’s design and architecture, but they also add a functionality to the ship where it constructs its own internals. That is to say, they start with small stuff like our object-oriented folk, but the small stuff is all designed to give the big things the ability to create the small things themselves. They do this at every level, giving every object extra responsibility and self-awareness so that at the end, they can have a neat, clean soapbox from which to say:

int main(int argc, char* argv[])
{
    Ship ship1 = new Ship();
    ship1.Build();
    Ship ship2 = new Ship();
    ship2.Build();
    ....
}

No fuss, no muss (after all the setup overhead of teaching your ships how to build themselves). You simply declare a ship, and tell it to build itself, at which point the ship creates a hull, which it tells to build itself, and so on down the line.

I realize that this sounds insane, probably even to those procedural programmers out there. But it only sounds that way because I’ve changed the metaphor. It made a lot more sense when the President was telling someone to have his people call their people. And, with that metaphor, the object-oriented approach sounded insane, as we’ve covered with the President buying bleach at the corner store for a janitor to clean a floor somewhere.

Getting it Right

So, back to inversion of control (IOC). IOC only makes sense in the Lego metaphor. If you eventually want to build a pirate ship, you start to think about what you need for a pirate ship. Well, you need a crow’s nest and a hull and a plank, and — well, stop right there. We’re getting ahead of ourselves. Before we can build a ship, we clearly need some other stuff. So, let’s focus on the crow’s nest. Well, we need some pieces of wood and that’s about it, so maybe that’s a good place to start. We’ll think about the properties of a piece of wood and define it. It’s probably got three spatial dimensions and a weight and a type, so we can start there. Once we’ve defined and described our wood, we can figure out how to assemble it into a crow’s nest, and so on.

Object-oriented programming is about creating small components to be assembled into larger components. And, inversion of control follows naturally from there — much more naturally than command and control. And it should follow into your programming.

If you’re creating a presentation tier class that drives some form in your UI, one of the first things that’s going to occur to you is that you need to obtain some data from somewhere. In the procedural world, you would say, “Aha! I need to define some service class that goes out and gets that data so I’ll have my presentation tier class instantiate this service and…” But stop. You don’t want to do that. You want to think in the Lego metaphor. As soon as you need something that you don’t have, it’s time to stop working on that class and move to the class that yours needs (if it doesn’t already exist).

But, before you leave, you want to document for posterity that your presentation tier class is useless without a service to provide data. What better way to do that than to make it impossible to create an instance of your class without passing in the service you need? That puts the handwriting on the wall, in 8000 point font, that your presentation tier class needs a service or it won’t work. Now you can go create the service, or find it if it exists, or ask the guy who works on the service layer to write it.
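
A sketch of what that might look like, with invented names (any IOC container, or plain old constructor calls at the assembly point, can supply the service later):

public interface ICustomerService
{
    string GetCustomerName(int id);
}

public class CustomerScreenPresenter
{
    private readonly ICustomerService _service;

    // The handwriting on the wall: no service, no presenter.
    public CustomerScreenPresenter(ICustomerService service)
    {
        if (service == null)
        {
            throw new ArgumentNullException("service");
        }
        _service = service;
    }
}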

But where does the class calling yours get the service? Who cares. I say that with a period instead of a question mark because it’s declarative. That isn’t your problem, and it isn’t your presentation tier class’s problem. And the reason for that is that your pirate ship isn’t tasked with building its own hull and crow’s nest. Somebody else builds those things and hands them over for ship construction when they’re completed.

Back to the Code

That was a long and drawn out journey, but I think it’s important to drive home the different metaphors and their implications for the sake of this discussion. Without that, procedural and OOP folks are talking at each other and not understanding one another. If you’re trying to sell IOC to someone who isn’t buying it, you’re much better served understanding their thinking. And if you’re one of those resisting buying it, it’s time to get out your wallet because the debate is over. IOC produces cleaner code that is easier to maintain, test, and extend. Procedural coding has its uses, but if you’re already using an OO language, you don’t have a candidate for those uses.

So, what are the actual coding implications of inversion of control? I’ll enumerate some here to finish up.

1. Classes have fewer reasons to change

One of the guiding principles of clean code and keystone member of SOLID is the “Single Responsibility Principle.” Your classes should have only one reason to change because this makes them easier to reason about, and it makes changes to the code base produce less violence and upheaval that triggers regressions. If you use the procedural style to create classes, your classes will always have at least two reasons to change: (1) their actual function changes; and (2) the way they create their sub-components changes.

2. Classes are easier to unit test

If you’re looking to unit test a command and control pirate ship, think about everything that happens in the PirateShip’s constructor. It news up a hull, crow’s nest, etc, which all new up their internals, and so on, recursively down the structure of the application. You cannot unit test PirateShip at all. It encapsulates all of the functionality of your program. In fact, you can’t unit test anything except the tree leaves of functionality. Pretty much all of your tests are system/integration tests. If you invert control, it’s easy. You just pass in whatever you want to the class to poke it and prod it and see how it behaves.
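
For instance, here is a sketch (all types invented for illustration) of poking and prodding an inverted pirate ship:

public interface IHull
{
    void AbsorbRecoil();
}

public class FakeHull : IHull
{
    public bool RecoilAbsorbed { get; private set; }
    public void AbsorbRecoil() { RecoilAbsorbed = true; }
}

public class PirateShip
{
    private readonly IHull _hull;

    public PirateShip(IHull hull) { _hull = hull; }

    public void FireCannons() { _hull.AbsorbRecoil(); }
}

[TestClass]
public class PirateShipTest
{
    [TestMethod]
    public void FireCannons_Tells_The_Hull_To_Absorb_The_Recoil()
    {
        var hull = new FakeHull();
        var ship = new PirateShip(hull);   // nothing newed up behind our backs

        ship.FireCannons();

        Assert.IsTrue(hull.RecoilAbsorbed);
    }
}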

3. No global variables or giant method signatures

Imagine that your crow’s nest wants to communicate with the ship’s rudder for some reason. These classes reside at opposite ends of the program and are both tree leaves. In command and control style, you have two options. The first is to have all of the nodes in between the root and those leaves pass the message along in constructors or mutators. As the amount of overhead resulting from this gets increasingly absurd, most procedural programmers turn to option 2: the global variable (or its gussied-up object-oriented counterpart loved by procedural programmers everywhere, the singleton). I’ll save that for another post, as it certainly deserves its own treatment in depth, but let’s just say, for argument’s sake and for the time being, that this is undesirable. Not every class in the application needs to see the personal business of those two and how they communicate.

In the IOC model, this is a simple prospect. Because you’re building all of the sub-components and assembling them into increasingly large components, there is one place that has reference to everything. From the class performing all of that assembly, it’s trivial to link those two leaf nodes or, really, any classes. I can give one a reference to the other. I can create a third object to which both refer or give them both references to it. There are any number of options that don’t involve hideous method signatures or globals.
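
Here is a rough sketch of that single assembly point; the types and the wiring are invented purely for illustration:

// Minimal stand-ins to make the wiring concrete.
public class SignalFlags { }

public class CrowsNest
{
    private readonly SignalFlags _flags;
    public CrowsNest(SignalFlags flags) { _flags = flags; }
}

public class Rudder
{
    private readonly SignalFlags _flags;
    public Rudder(SignalFlags flags) { _flags = flags; }
}

public class Ship
{
    private readonly CrowsNest _crowsNest;
    private readonly Rudder _rudder;

    public Ship(CrowsNest crowsNest, Rudder rudder)
    {
        _crowsNest = crowsNest;
        _rudder = rudder;
    }
}

public static class ShipAssembler
{
    // The one place with a reference to everything.
    public static Ship Assemble()
    {
        // Both leaf classes get a reference to the same third object,
        // so they can communicate without globals or bloated signatures.
        var flags = new SignalFlags();
        var crowsNest = new CrowsNest(flags);
        var rudder = new Rudder(flags);

        return new Ship(crowsNest, rudder);
    }
}

Swapping out how any piece is built, which is the next point, also happens right here and nowhere else.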

4. Changing the way things are constructed is easy

Speaking of having your assembly all in one place, swapping out components becomes simple. If I want the crow’s nest to use balsa wood instead of pine, I just go to the place that crow’s nest is instantiated and pass it something else. I don’t have to weed through my code base looking for the class and then trace the instantiation path to where I need it. All of the instantiation logic happens centrally. This makes it way easier to introduce conditions for using different types of construction logic that don’t clutter up your classes and that don’t care how their components are constructed. In fact, if you use Spring or some IOC container, you can abstract this out of your program altogether and have it reside in some configuration file, if you’re feeling particularly “enterprisey” (to borrow a term from an amusing site I like to visit).

5. Design by contract becomes easy as well

This is another thing to take at face value for the time being, but having your components interact via interface is much easier this way. Interface programming is a good idea in general. But, if you’re not inverting control, it’s kind of pointless. If all of your object creation is hard-coded throughout the application, interfacing is messy. If your PirateShip is creating a CrowsNest and you want a CrowsNest interface with command and control, you’d have to pass some enumeration (or use some global) into PirateShip to tell it what kind of CrowsNest to instantiate. This, along with some of the other examples, demonstrates the first point about code bloat and the SRP. As I’m introducing these new requirements (which generally happen sooner or later), our procedural classes get bigger, more complicated, and more difficult to maintain. Now they not only need to instantiate their components, but also make decisions about how to instantiate them based on additional variables that you need to put in them. And, I promise, there will be more.

Fin

So, I hope some of you reading find this worthwhile. I’m not much of a proselytizer for or true adherent to any one way of doing things. Procedural programming and other styles of programming have their places (I literally just used ‘goto’ today in writing a batch script for executing some command line utilities). But, if you’re in the OO world, using the OO metaphors and surrounded by OO people, it is clearly in your best interest to adapt instead of fight.

Static Analysis — Spell Check for Code

A lot of people have caught onto certain programming trends: some agility in the process generally makes things better, unit testing a code base tends to make it more reliable, etc. One thing that, in my experience, seems to lag behind in popularity is the use of static checking tools. If these are used at all, it’s usually for some reason such as enforcing capitalization schemes of variables or some other such thing that guarantees code that is uniform in appearance.

I think this non-use or under-use of such tools is a shame. I recently gave a presentation on using some tools in C# and Visual Studio 2010 for static analysis, and thought I’d share my experience with some of the tools and the benefits I perceive here. In this development environment, there are six tools that I use, all to varying degrees and for different purposes. They are MS Analysis, StyleCop, CodeRush, Code Contracts, NDepend, and Nitriq.

Before I get into that, I’ll offer a little background on the idea of static analysis. The age-old and time-tested way to write code is that you write some code, you compile it, and then you run it. If it does what you expected when you run it, then you declare yourself victorious and move on. If it doesn’t, then you write some more code and repeat.

This is all fine and good until the program starts getting complicated — interacting with users, performing file I/O, making network requests, etc. At that point, you get a lot of scenarios. In fact, you get more scenarios than you could have anticipated in one sitting. You might run it and everything looks fine, but then you hand it to a user who runs it, unplugs the computer, and wonders why his data wasn’t saved.

At some point, you need to be able to reason about how components of your code would behave in various scenarios, even if you might not easily be able to recreate these scenarios. Unit testing is helpful with this, but unit testing is just an automated way of saying, “run the code.” Static analysis automates the process of reasoning about the code without running it. It’s like you looking at the code, but exponentially more efficient and much less likely to make mistakes.

Doing this static analysis is adding an extra step to your development process. Make no mistake about that. It’s like unit testing in that the largest objection is going to be the ‘extra’ time that it takes. But it’s also like unit testing in that it saves you time downstream because it makes defects less likely to come back to bite you later. These two tasks are also complementary and not stand-ins for one another. Unit testing clarifies and solidifies requirements and forces you to reason about your code. Static analysis lets you know if that clarification and reasoning has caused you to do something that isn’t good.

As I said in the title, it’s like a spell checker for your code. It prevents you from making silly and embarrassing mistakes (and often costly ones). To continue the metaphor, unit testing is more like getting someone bright to read your document. He’ll catch some mistakes and give you important feedback for how to improve the document, but he isn’t a spell checker.

So, that said, I’ll describe briefly each one and why I use and endorse it.

MS Analysis

MS Analysis encapsulates FX Cop for the weightier version of Visual Studio 2010 (Premium and up, I think). It runs a series of checks, such as whether or not parameters are validated by methods, whether or not you’re throwing Exception instead of SomeSpecificException, and whether your classes have excessive coupling. There are probably a few hundred checks in all. When you do a build with this enabled, it populates the error list with violations in the form of warnings.

On the plus side, this integrates seamlessly with Visual Studio since it’s a Microsoft product, and it catches a lot of stuff. On the down side, it can be noisy, and customizing it isn’t particularly straightforward. You can turn rules on and off, but if you want to tweak existing ones or create your own, things get a little more complicated. It also isn’t especially granular. You configure it per project and can’t get more fine grained feedback than that (i.e. per namespace, class, or method).

My general use of it is to run it periodically to see if my code is violating any of the rules that I care about. I usually turn off a lot of rules, and I have a few different rulesets that I plug in and out so that I can do more directed searches.

Style Cop

StyleCop is designed to be run between writing and building. Instead of using the VM/Framework to reflect on your code, it just parses the source code file looking for stylistic concerns (are all of your fields camel cased and documented and are you still using Hungarian notation and, if so, stop) and very basic mistakes (like, do you have an empty method). It’s lightning fast, and it runs on a per-class basis, which is cool.

On the downside, it can be a little annoying and invasive, but the designers are obviously aware of this. I recall reading some kind of caveat stating that the nature of these types of rules tends to be arbitrary and get opinionated developers into shouting matches.

I find it useful for letting me know if I’ve forgotten to comment things, if I’ve left fields as anything other than private, and if I have extra parentheses somewhere. I run Style Cop occasionally, but not as often as others. Swapping between the rule sets is a little annoying.

CodeRush

CodeRush is awesome for a lot of things, and its static analysis is really an ancillary benefit. It maintains an “issues list” for each file and highlights these issues in real time, right in the IDE. A few of them are a little bizarre (suggesting to always use “var” keyword if it is possible), but most of them are actually really helpful and not suggested by the MS Tools or anything else I use. It does occasionally false flag dead code and get a few things wrong, but it’s fairly easy to configure it to ignore issues on a per file, per namespace, per solution basis.

The only real downside here is that CodeRush has a seat licensing cost and that and the other overhead of CodeRush make it a little overkill-ish if you’re just interested in Static Analysis. I fully endorse getting CodeRush in general, however, for all of its features.

Code Contracts

Like CodeRush, this tool is really intended for something else, and it provides static analysis as a side effect. Code Contracts is an academically developed tool that facilitates design by contract. Pre- and post-conditions as well as class invariants can be enforced at build time. Probably because of the nature of doing this, it also just so happens to offer a feature wherein you can have warning squigglies pop up anywhere you might be dereferencing null, violating array bounds, or making invalid arithmetic assumptions.

To me, this is awesome, and I don’t know of other tools that do this. The only downside is that, on larger projects, this takes a long time to run. However, getting an automatic check for null dereferences is worth the wait!

I use it explicitly for the three things I mentioned, though, if I get a chance and enough time, I’d like to explore its design by contract properties as well.
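
For a feel of what that looks like, here is a minimal sketch using the System.Diagnostics.Contracts API; the account class itself is invented for illustration:

using System.Diagnostics.Contracts;

public class Account
{
    private decimal _balance;

    public void Deposit(decimal amount)
    {
        // Precondition: callers must deposit a positive amount.
        Contract.Requires(amount > 0);

        // Postcondition: a deposit never shrinks the balance.
        Contract.Ensures(_balance >= Contract.OldValue(_balance));

        _balance += amount;
    }

    // Class invariant, checked after every public method.
    [ContractInvariantMethod]
    private void BalanceIsNeverNegative()
    {
        Contract.Invariant(_balance >= 0);
    }
}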

NDepend

NDepend is really kind of an architectural tool. It lets you make assessments about different dependencies in your code, and it provides you with all kinds of neat graphs that report on them. But my favorite feature of NDepend is the static analysis in the form of Code Querying. It exposes SQL-like semantics that let you write queries against your code base, such as “SELECT ALL Methods WHERE CyclomaticComplexity > 25” (paraphrase). You can tweak these things, write your own, or go with the ones out of the box. They’re all commented, too, in ways that are helpful for understanding and modifying.

There is really no downside to NDepend aside from the fact that it costs some money. But if you have the money to spare, I highly recommend it. I use this all the time for querying my code bases in whatever manner strikes my fancy.

Nitriq

I think that Nitriq and NDepend are probably competitors, but I won’t do any kind of comparison evaluation because I only have the free version of Nitriq. Nitriq has the same kind of querying paradigm as NDepend, except that it uses LINQ semantics instead of SQL. That’s probably a little more C# developer friendly, as I suppose not everyone that writes C# code knows SQL (although it strikes me that all programmers ought to have at least passing familiarity with SQL).

In the free version you can only assess one assembly at a time, though I don’t consider that a weakness. I use Nitriq a lot less than NDepend, but when I do fire it up, the interface is a little less cluttered and it’s perhaps a bit more intuitive. Though, for all I know, the paid version may get complicated.

Conclusion

So, that’s my pitch for static analysis. The tools are out there, and I suspect that they’re only going to become more and more common. If this is new to you, check these tools out and try them! If you’re familiar with static analysis, hopefully there’s something here that’s new and worth investigating.

Addicted to Unit Testing

Something interesting occurred to me the other day when I posted sample code for a DXCore plugin that I created. In the code that I uploaded, I added a unit test project with a few unit tests as a matter of course. Apparently, the process of unit testing has become so ingrained in me that I didn’t think anything of it until later. This caused me to reflect a bit on my relationship, as a developer, to unit testing.

I’ve worked in settings where unit tests have been everything from mandated to tolerated to scoffed at and discouraged. And I’ve found that I do some form of unit testing in all of these environments. In environments that require unit testing, I simply include them the way everyone else does, conforming to standards. In environments that discourage unit testing, I implement them and keep them to myself. In that case, the tests aren’t always conventional or pretty, but I always find some way to automate verification. To me, this is in my blood as a programmer. I want to automate everything and I don’t see why verifying my code’s functionality should be any different.

But I realize that it’s now something beyond just a desire to automate, tinker, and take pride in my work. The title of this post is tongue-in-cheek, but also appropriate. When I write code and don’t do some verification of it, I start to feel edgy and off of my game. It’s hard for me to understand how people can function without unit testing their code. How do they know it works? How do they know they’ve handled edge cases and bad input to methods? How do they feel comfortable building new classes that depend on the correct functioning of the ones that came first? And, most importantly, how do they know they’re not running in place when adding functionality — breaking an existing requirement for each new one they satisfy?

I exhibit some of the classic signs of an addict. I become paranoid and discombobulated without the thing upon which I depend. I’ll go to various lengths to test my code, even if it’s not factored into the implementation time (work longer hours, create external structure, find free tools, etc.). I reach for it without really thinking about it, as evidenced by the unit tests in my uploaded code.

But I suppose the metaphor ends there. Because unlike the vices about which recovering addicts might speak this way — drugs, alcohol, gambling, etc. — I believe this ‘addiction’ makes me better as a software engineer. I think most people would agree that tested code is likely to be better code, and strong unit test proponents would probably argue that the habit makes you write better code whether or not you unit test a given class at a given time. For example, when I create a dependency-injected class, the first code I tend to write, automatically and out of habit, is an “if” statement in the constructor that checks the injected references for null and throws an exception if they are. I write this because the first unit test that I write is one to check how the class behaves when injected with null.
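
That habit, sketched out with invented names (MSTest assumed again):

public interface IReportDataService { }

public class ReportGenerator
{
    private readonly IReportDataService _service;

    // Written out of habit, because the test below is written first.
    public ReportGenerator(IReportDataService service)
    {
        if (service == null)
        {
            throw new ArgumentNullException("service");
        }
        _service = service;
    }
}

[TestClass]
public class ReportGeneratorTest
{
    [TestMethod]
    [ExpectedException(typeof(ArgumentNullException))]
    public void Constructor_Rejects_Null_Service()
    {
        new ReportGenerator(null);
    }
}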

And, to me, that’s really the most core benefit of unit testing. Sure, it makes refactoring easier, it enumerates your requirements better than a requirements analysis document could, it inspires confidence in the class behavior, and all of the other classic properties of unit testing as a process stalwart. But I think that, at the core, it changes the way you think about classes and how you write code. You write code knowing that it should behave in a certain way with certain inputs and that it should provide certain outputs. You think in terms of what properties your classes have and what they should initialize to in different instantiation scenarios. You think of your classes as units, and isn’t that the ultimate goal — neat, decoupled code?

One common theme that I see in code and I’ve come to think of as highly indicative of a non-unit testing mentality is a collection of classes that distribute their functionality in ad-hoc fashion. That is, some new requirement comes in and somebody writes a class to fulfill it — call it “Foo.” Then a few more requirement riders trickle in as follow ups, and since “Foo” is the thing that satisfies the original, it should also satisfy the new ones. It’s perfectly fine to call it “Foo” because it represents a series of user requests and not a conceptual object. Some time later, more requirements come in, and suddenly the developer needs two different Foos to handle two different scenarios. Since Foo is getting kind of large, and large classes are bad, the solution is to create a “FooManager” that knows all about the internals of “Foo” and to spread functionality across both. If FooManager needs internal Foo field “_bar”, “_bar” is made into a property “Bar” (C#) or an accessor “GetBar()” (Java/C++), and the logic proceeds. Foo does some things to its former private member “Bar” and then “FooManager” also does some things to Foo’s Bar, and before you know it, you have a Gordian knot of functional and temporal coupling.

I don’t think this kind of code would ever exist in a world populated only by unit testing addicts. Unit testing addiction forces one to consider upfront a class’s reason for existing and carefully deliberate its public interface. The unit testing addict would not find himself in this precarious situation. His addiction would save him from this “rock bottom” of software development.