The Outhouse Anti Pattern

Source Control Smells?

I was reading through some code the other day, and I came across a class that made me shudder a little. Interestingly, it wasn’t a code smell or a name smell. I wasn’t looking at the class itself, and it had an innocent enough name. It was a source control smell.

I was looking through source control for something unrelated, and for some reason this class caught my eye. I took a look at its version history and saw that it had been introduced and quickly changed dozens of times by several people. Even assuming that you like to check code in frequently, it’ll take you what, one or two checkins to write a class, and then maybe a few more to tweak its interface or optimize its internals, if need be? Why were there all these cooks in this kitchen, and what were they doing in there? The changes to the class continued like that over the course of time, with periods of few changes, and then bouts of frenzied, distributed activity. How strange.

I opened the class itself to take a look, and the source control checkin pattern immediately made sense to me. The class was a giant static class that did nothing but instantiate ICommand implementations and expose them as public fields. I haven’t tagged this post with C# for a reason — this anti-pattern transcends languages — so, I’ll briefly explain for non-C#-savvy readers. ICommand is an interface that defines Execute and CanExecute methods – sort of like a command pattern command, but without forcing implementation of UnExecute() or some such thing.

So, what was going on here was that this class was an already-large and robustly growing dumping ground for user actions in the application. Each field in the class was an instance of an ICommand implementation that allowed specifying delegates for the Execute and CanExecute methods, so they looked something like this:


public class ApplicationCommands
{
    public static readonly ICommand FooCommand = new Command(Execute, CanExecute); //Command accepts delegate ctor params

    public static void Execute()
    {
        //Do some foo thing, perhaps with shared state (read: global variables) in the class
    }

    public static bool CanExecute()
    {
        //Return a boolean based on some shared state (read: global variables) in the class
        return true;
    }

    //Rinse, repeat, dozens and dozens, and probably eventually hundreds of times.
}

So, the ‘pattern’ here is that if you want to define some new user action, you open this class and find whatever shared global state will inform your command, and then add a new command instance field here.

A Suitable Metaphor

I try to stay away from the vulgar in my posting, so I beg forgiveness in advance if I offend, but this reminds me most of an outhouse at a campground. It’s a very public place that a lot of people are forced to stop by and… do their business. The only real reason to use it is that some people would probably be upset and offended if they caught you not using it (e.g. whoever wrote it), as there is no immediately obvious benefit provided by the outhouse beyond generally approved decorum. Like the outhouse, over time it tends to smell worse and worse and degenerate noticeably. If you’re at all sensitive to smells, you probably just go in, hold your nose, and get out as quickly as possible. And, like the high-demand outhouse, there are definitely “merge conflicts” as people are not only forced to hold their noses, but to wait in line for the ‘opportunity’ to do so.

There are some periodic attempts to keep the outhouse clean, but this is very unpleasant for whoever is tasked with doing so. And, in spite of the good intentions of this person, the smell doesn’t exactly go away. Basic etiquette emerges in the context of futile attempts to mitigate the unpleasantness, such as signs encouraging people to close the lid or comments in the code asking people to alphabetize, but people are generally in such a hurry to escape that these go largely unheeded.

At the end of the day, the only thing that’s going to stop the area from being a degenerative cesspool is the removal of the outhouse altogether.

Jeez, What’s So Bad About This?

Taking my metaphor as a literal equivalence exaggerates my tone a bit, but I certainly think of this as an anti-pattern. Here’s why.

  1. What is this class’s single responsibility? The Single Responsibility Principle (SRP) wisely advises that a class should have exactly one reason to change. This class has as many reasons as it has commands. And, you don’t get to cop out by claiming that its single responsibility is to define and store information about everything a user might ever want to do (see God Class).
  2. This class is designed to be modified each and every time the GUI changes. That is, this thing will certainly be touched with every release, and it’ll probably be touched with every patch. The pattern here is “new functionality — remember to modify ApplicationCommands.cs”. Not only is that a gratuitous violation of (even complete reversal of) the Open/Closed Principle, but if you add a few more outhouses to the code base, people won’t even be able to keep track of all the changes that are needed to alter some small functionality of the software.
  3. Dependency inversion is clearly discouraged. While someone could always access the static property and inject it into another class, the entire purpose of making them public fields in a static class is to encourage inline use of them as global ‘constants’, from anywhere. Client code is not marshaling all of these at startup and passing them around, but rather simply using them in ad-hoc fashion wherever they are needed.
  4. Besides violating S, O, and D of SOLID, what if I want a different command depending on who is logged in? For example, let’s say that there is some Exit command that exits the application. Perhaps I want the Exit command to exit immediately if a read-only user is logged in, but to prompt with “are you sure” for a read-write user. Because of the static nature of the property accessors here, I have to define two separate commands and switch over them depending on which user is logged in. Imagine how nasty this gets if there are many kinds of users needing many different commands. The rate of bloat of this class just skyrocketed from linear in functionality to proportional to F*R*N, where F is the number of functionality items, R is the number of roles, and N is the nuance of behavior within roles.
  5. And, the rate of bloat of the client classes just went up a lot too, as they’ll need to figure out who is logged in and thus which of the commands to request.

  6. And, of course, there is the logistical problem that I noticed in the first place. This class is designed to force multiple developers to change it, often simultaneously. That’s a lot like designing a traffic light that’s sometimes green for everyone.

Alternative Design

So, how to work toward a more SOLID, flexible design? For a first step, I would make this an instance class, and hide the delegate methods. This would force clients to be decoupled from the definition, except in one main place. The second step would be to factor the commands with the more complex delegates into their own implementations of ICommand and thus their own classes. For simpler commands, these could simply be instantiated inline with the delegate methods collapsed to lambda expressions. Once everything had been factored into a dedicated class or flagged as declarable inline, the logic for generating these things could be moved into some sort of instance factory class that would create the various kinds of commands.

Now, we’re getting somewhere. The command classes have one reason to change (the command changes) and the factory has one reason to change (the logic for mapping requests to command instances changes). So, we’re looking better on SRP. The Open/Closed Principle is looking more attainable too, since the only class that won’t be closed for modification is the factory class itself as new functionality is added. Dependency inversion is looking up too, since we’re now invoking a factory that provides ICommand instances, so clients are not going out and searching for specific concrete instances on their own.
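To make that direction concrete, here’s a minimal sketch of where the refactoring might land, assuming WPF’s ICommand. The names here (SaveDocumentCommand, CommandFactory, IDocumentStore) are hypothetical, with IDocumentStore standing in for whatever shared state the commands actually need:

using System;
using System.Windows.Input;

// Hypothetical dependency standing in for what used to be shared global state.
public interface IDocumentStore
{
    bool HasUnsavedChanges { get; }
    void Save();
}

// The command's logic now lives in a small, dedicated class...
public class SaveDocumentCommand : ICommand
{
    private readonly IDocumentStore _store;

    public SaveDocumentCommand(IDocumentStore store)
    {
        _store = store;
    }

    public event EventHandler CanExecuteChanged { add { } remove { } } // change notification omitted in this sketch

    public bool CanExecute(object parameter) => _store.HasUnsavedChanges;

    public void Execute(object parameter) => _store.Save();
}

// ...and the factory is the one place that maps requests to command instances.
public class CommandFactory
{
    private readonly IDocumentStore _store;

    public CommandFactory(IDocumentStore store)
    {
        _store = store;
    }

    public ICommand CreateSaveCommand()
    {
        return new SaveDocumentCommand(_store);
    }
}

Clients now depend on the factory and the ICommand interface rather than reaching into a static grab bag.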

The one wrinkle is the shared global state from the previous commands static class. This would have to be passed into the factory and injected into the commands, as needed. If that starts to look daunting, don’t be discouraged. Sure, passing 20 or 30 classes into your factory looks really ugly, but the ugliness was there all along — you were just hiding it in the outhouse. Now that the outhouse is gone, you’ve got a bit of work to do shoveling the… you get the idea. Point is, at least you’re aware of these dependencies now and can create a strategy for managing them and getting them into your commands.

If necessary, this can be taken even further. To satisfy complaint (4), if that is really an issue, an abstract factory can be used. Alternatively, one could use reflection to create a store of all instances of ICommand in the application, keyed by their names. This is a convention-over-configuration solution that would allow clients to access commands by knowing only their names at runtime, and nothing about them beyond their interface definition. In this scheme, Open/Closed is followed religiously, as no factory exists to modify with each new command. One can add functionality by changing a line of markup in the GUI and adding a new class. (I would advise against this strategy for the role/nuance-based command scheme.) Another alternative would be to read the commands in from metadata, depending on the simplicity or complexity of the command semantics and how easily they could be stored as data. This is perhaps the most flexible solution of all, though it may be too flexible and problematic if the commands are somewhat involved.
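As a rough sketch of the reflection-based approach (CommandRegistry is a hypothetical name, and this version assumes each command class has a parameterless constructor):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Windows.Input;

public class CommandRegistry
{
    private readonly Dictionary<string, ICommand> _commands;

    public CommandRegistry()
    {
        // Discover every concrete ICommand in this assembly and key it by class name.
        _commands = typeof(CommandRegistry).Assembly.GetTypes()
            .Where(t => typeof(ICommand).IsAssignableFrom(t)
                        && !t.IsAbstract
                        && t.GetConstructor(Type.EmptyTypes) != null)
            .ToDictionary(t => t.Name, t => (ICommand)Activator.CreateInstance(t));
    }

    // Clients know commands only by name and by the ICommand interface;
    // adding a command means adding a class, not modifying a factory.
    public ICommand this[string name]
    {
        get { return _commands[name]; }
    }
}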

These are just ideas about possible directions for a solution. There is no magic bullet. I’d say the important thing, in general, is to recognize when you find yourself building an outhouse, and to be aware of what problems it’s going to cause you. From there, my suggestions are all well and good in the abstract, but it will be up to you to determine how to proceed in transitioning to a better design.


Writing Maintainable Code Demands Creativity

Writing maintainable code is for “Code Monkeys”?

This morning, I read an interesting blog post from Erik McClure. The premise of the post is that writing maintainable code is sufficiently boring and frustrating as to discourage people from the programming profession. A few excerpts from the post are:

There is an endless stream of rhetoric discussing how dangerous writing clever code is, and how good programmers don’t write clever code, or fast code, but maintainable code, that can’t have clever for loop statements or fancy tricks. This is wrong – good codemonkeys write maintainable code.

and

What I noticed was that it only became intolerable when I was writing horrendously boring, maintainable code (like unit tests). When I was programming things I wasn’t supposed to be programming, and solving problems in ways I’m not supposed to, my creativity had found its outlet. Programming can not only be fun, but real programming is, itself, an art, a solution to a problem that itself embodies a certain elegance. Society in general seems to be forgetting what makes us human in the wake of a digital revolution that automatizes menial tasks. Your creativity is the most valuable thing you have.

When you write a function that does things the right way, when you refactor a clever subroutine to something that conforms to standards, you are tearing the soul out of your code. Bit by bit, you forget who you are and what the code means. The standards may enforce maintainable and well-behaved code, but it does so at the cost of your individuality. Coding becomes middle-school math, where you have to solve the same, reworded problem a hundred times to get a good grade so you can go do something else that’s actually useful. It becomes a means to an end, not an adventure in and of itself.

Conformity for the Sake of Maintainability

There seem to be a few different themes here. The first one I see is one with which I have struggled myself in the past: chafing at being forced to change the way you do things to conform to a group convention. I touched on that here and here. The reason that I mention this is the references to “[conforming] to standards” and the apparent justification of those standards being that they make code “maintainable”. The realpolitik of this situation is such that it doesn’t really matter what justification is cited (appeal to maintainability, appeal to convention, appeal to anonymous authority, appeal to named authority, appeal to threat, etc). In the end, it boils down to “because I said so”. I mention this only insofar as I will dismiss this theme as not having much to do with maintainability itself. Whatever Erik was forced to do may or may not have actually had any bearing whatsoever on the maintainability of the code (i.e. “maintainability” could have been code for “I just don’t like what you did, but I should have an official sounding reason”).

Maintainable Code is Boring Code

So, on to the second theme, which is that writing maintainable code is boring. In particular Erik mentions unit tests, but I’d hazard a guess that he might also be talking about small methods, SRP classes, and other clean coding principles. And, I actually agree with him to an extent. Writing code like that is uneventful in some ways that people might not be used to.

That is, say that you don’t perform unit tests, and you write large, coupled, complex classes and methods. When you fire up your application for the first time after a few hours of coding, that’s pretty exciting. You have no idea what it’s going to do, though the smart money is on “crash badly”. But, if it doesn’t, and it runs, that’s a heady feeling and a rush — like winning $100 in the lottery. The work is also interesting because you’re going to be spending lots of time in the debugger, writing bunches of local variables down on paper to keep them straight. Keeping track of all of the strands of your code requires full concentration, and there’s a feeling of incredible accomplishment when you finally track down that needle-in-a-haystack bug after an all-nighter.

On the flip side, someone who writes a lot of tests and conforms to the clean code/craftsmanship mantra has a less exciting life. If you truly practice TDD, the first time you fire up the application, you already know it’s going to work. The lottery-game excitement of longshot odds with high payoff is replaced by a dependable salary. And, as for the maddening all-nighter bugs, those too are gone. You can pretty much reproduce a problem immediately, and solve it just as quickly with an additional failing test that you make pass. The underdog’s miracle comeback after trailing all game is replaced by a game where you’re winning handily from wire to wire. All of the roller-coaster highs and lows, with their panicked all-nighters and miracle finishes, are replaced by you coming in at 9, leaving at 5, and shipping software on time or ahead of schedule.

Making Code Maintainable Is Brainless

The third main theme that I saw was the idea that writing clever code and writing maintainable code are mutually exclusive, and that the latter is brainless. Presumably, this is colored to some degree by the first theme, but on its own, the implication is that maintainable code is maintainable because it is obtuse and insufficient to solve the problem. That is, instead of actually solving the problems that they’re tasked with, the maintainability-focused drones oversimplify the problem and settle for meeting most, if not all, of the requirements of the software.

I say this because of Erik’s vehement disagreement with the adage that roughly says “clever code is bad code”. I’ve seen this pithy expression explained in more detail by people like Uncle Bob (Robert Martin) and I know that it requires further explanation because it actually sounds discouraging and draconian stated simply. (Though, I think this is the intended, provocative effect to make the reader/listener demand an explanation). But, taken at face value I would agree with Erik. I don’t relish the thought of being paid a wage to keep quiet and do stupid things.

Maintainability Reconsidered

Let’s pull back for a moment and consider the nature of software and software development. In his post, Erik bleakly points out that software “automatizes[sic] menial tasks”. I agree with his take, but with a much more optimistic spin — I’m in the business of freeing people from drudgery. But, either way, there can be little debate that the vast majority of software development is about automating tasks — even game development, which could be said to automate pretend playground games, philosophically (kids don’t need to play “cops and robbers” on the playground using sticks and other makeshift toys when games about detectives and mobsters automate this for them).

And, as we automate tasks, what we’re doing is taking tasks that have some degree of intrinsic complexity and falling on the grenade of that complexity so that completing the task is simple, intuitive, and even pleasant (enter user experience and graphic design) for prospective users. So, as developers, we deal with complexity so that end users don’t have to. We take complicated processes and model them, and simplify them without oversimplifying them. This is at the heart of software development and it’s a task so complex that all manner of methodologies, philosophies, technologies, and frameworks have been invented in an attempt to get it right. We make the complex simple for our non-technical end users.

Back in the days when software solved simpler problems than it does now, things were pretty straightforward. There were developers and there were users. Users didn’t care what the code looked like internally, and developers generally operated alone or perhaps in small teams for any given task. In this day and age, end-users still don’t care what the code looks like, but development teams are large, sometimes enormous, and often distributed geographically, temporally, and according to specialty. You no longer have a couple of people that bang out all necessary code for a project. You have library writers, database people, application coders, graphic designers, maintenance programmers etc.

With this complexity, an interesting paradigm has emerged. End-users are further removed, and you have other, technical users as well. If you’re writing a library or an IDE plugin, your users are other programmers. If you’re writing any code, the maintenance programmer that will come along later is one of your users. If you’re an application developer, a graphic designer is one of your users. Sure, there are still end-users, but there are more stakeholders now to consider than there used to be.

In light of this development, writing code that is hard to maintain and declaring that this is just how you do things is a lot like writing a piece of code with an awful user interface and saying to your users “what do you want from me — it works, doesn’t it?” You’re correct, and you’re destined to go out of business. If I have a programmer on my team who consistently and proudly writes code that only he understands and only he can decipher, I’m hoping that he’s on another team as soon as possible. Because the fact of the matter is that anybody can write code that meets the requirements, but only a creative, intelligent person can do it in such a way that it’s quickly understood by others without compromising the correctness and performance of the solution.

Creativity, Cleverness and Maintainability

Let’s say that I’m working and I find code that sorts elements in a list using bubble sort, which is conceptually quite simple. I decide that I want to optimize, so I implement quick sort, which is more complex. One might argue that I’m being much more clever, because quick sort is a more elegant solution. But, quicksort is harder for a maintenance programmer to grasp. So, is the solution to leave bubble sort in place for maintainability? Clearly not, and if someone told Erik to do that, I understand and empathize with his bleak outlook. But then, the solution also isn’t just to slap quicksort in and call it a day either. The solution is to take the initial implementation and break it out into methods that wrap the various control structures and have descriptive names. The solution is to eliminate one method with a bunch of incrementers and decrementers in favor of several with more clearly defined scope and purpose. The solution is, in essence, to teach the maintenance programmer quicksort by making your quicksort code so obvious and so readable that even the daft could grasp it.
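For illustration, here is a sketch of what that factoring might look like; the method names are mine, not from any particular codebase:

using System.Collections.Generic;

public static class Sorter
{
    public static void QuickSort(List<int> items)
    {
        SortRange(items, 0, items.Count - 1);
    }

    private static void SortRange(List<int> items, int first, int last)
    {
        if (first >= last)
            return; // a range of zero or one elements is already sorted

        int pivotIndex = PartitionAroundPivot(items, first, last);
        SortRange(items, first, pivotIndex - 1);
        SortRange(items, pivotIndex + 1, last);
    }

    // Moves everything smaller than the pivot to its left and returns the
    // pivot's final resting place.
    private static int PartitionAroundPivot(List<int> items, int first, int last)
    {
        int pivot = items[last];
        int boundary = first;

        for (int current = first; current < last; current++)
        {
            if (items[current] < pivot)
                Swap(items, boundary++, current);
        }

        Swap(items, boundary, last);
        return boundary;
    }

    private static void Swap(List<int> items, int left, int right)
    {
        int temporary = items[left];
        items[left] = items[right];
        items[right] = temporary;
    }
}

Each method is small enough that its name, more or less, is its documentation.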

That is not easy to do. It requires creativity. It requires knowing the subject matter not just well enough to get it right through trial and error, not just well enough to know it cold and not just well enough to explain it to the brightest pupil, but well enough that your explanation shines through the code and is unmistakable. In other words, it requires creativity, mastery, intelligence, and clarity of thought.

And, when you do things this way, unit tests and other ‘boring’ code become less boring. They let you radically alter the internal mechanism of an algorithm without changing its correctness. They let you conduct time trials on the various components as you go to ensure that you’re not sacrificing performance. And, they document further how to use your code and clarify its purpose. They’re no longer superfluous impositions but tools in your arsenal for solving problems and excelling at what you do. With a suite of unit tests and refactored code, you’re able to go from bubble sort to quicksort knowing that you’ll get immediate feedback if something goes wrong, allowing you to focus exclusively on a slick algorithm. They’ll even allow you to go off tilting at tantalizing windmills like an unprecedented linear-time sort — hey, if the tests run in N time and all go green, you’re up for some awards and speaking engagements. For all that cleverness, you ought to get something.
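As a sketch of the kind of test that buys this freedom (xUnit here, but any framework works), a single pinning test exercises the Sorter from the sketch above and doesn’t care which algorithm is behind it:

using System.Collections.Generic;
using Xunit;

public class SorterTests
{
    [Fact]
    public void Sorts_an_unordered_list_into_ascending_order()
    {
        var items = new List<int> { 5, 3, 8, 1 };

        Sorter.QuickSort(items);

        // The test pins down observable behavior, so the internals can change freely.
        Assert.Equal(new List<int> { 1, 3, 5, 8 }, items);
    }
}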

So Why is “Cleverness” Bad?

What do they mean that cleverness is bad, anyway? Why say something like that? The aforementioned Bob Martin, in a video presentation I once watched, said something like “you know you’re looking at good code when someone reading it is saying ‘uh-huh’, ‘uh-huh’, ‘yep’, ‘makes sense’, ‘uh-huh'”. Contrast this with code that you see where your first reaction is “What on Earth…?!?” That is often the reaction to non-functional or incorrect code, but it is just as frequently the reaction to ‘clever’ code.

The people who believe in the idea of avoiding ‘clever’ code are talking about Rube-Goldbergian code, generally employed to work around some hole in their knowledge. This might refer to someone who defines 500 functions containing 1 through 500 if statements because he isn’t aware of the existence of a “for” loop. It may be someone who defines and heavily uses something called IndexedList because he doesn’t know what an array is. It may be this, this, this or this. I’ve seen this in the wild, where I was looking at someone’s code and I said to myself, “for someone who didn’t know what a class is, this is a fairly clever oddball attempt to replicate the concept.”

The term ‘clever’ is very much tongue-in-cheek in the context of clean coding and programming wisdom. It invariably means a quixotic, inferior solution to a problem that has already been solved and whose solution is common knowledge, or it means a needlessly complex, probably flawed way of doing something new. Generally speaking, the only person who thinks that it is actually clever, sans quotes, is the person who did it and is proud of it. If someone is truly breaking new ground, that solution won’t be described as ‘clever’ in this sense, but probably as “innovative” or “ground-breaking” or “impressive”. Avoiding ‘clever’ implementations is about avoiding pointless hacks and bad reinventions of the wheel — not avoiding using one’s brain. If I were coining the phrase, I’d probably opt for “cute” over “clever”, but I’m a little late to the party. Don’t be cute — put the considerable brainpower required for elegant problem solving to problems that are worth solving and that haven’t already been solved.


“Classworthiness” and Poor Cohesion

Finding Seams in Code

The other day, someone asked me about how to “get at” certain things that he wanted to unit test in a class that he was writing. That is, he had some code that he wanted to extract from a public method and test, but his only solution was to expose the new helper method as public in order to do this. My response to this strategy is that it doesn’t make sense to alter the public interface of your class to accommodate testing. Your class has some ‘natural’ set of inputs and outputs, and those are what you want to test – elevated visibility, reflection, and other trickery are not the way to test the class, since that’s not how it will eventually be used.

When asked what I would do instead, I replied that for simple helper methods, I just let the tests of their callers stand. For more complex ones, I pull functionality out into a new class and give the old class a private reference to the new class. This is where I got an interesting response — he didn’t think that there was “enough” to justify a new class.
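As a hypothetical illustration of that second approach, with invented names: the complex helper logic moves into its own class, where its natural inputs and outputs can be tested directly, and the original class keeps a private reference to it:

public class OrderTotalCalculator
{
    // This logic now has a natural, public seam for unit testing.
    public decimal CalculateTotal(decimal subtotal, decimal taxRate)
    {
        return subtotal * (1 + taxRate);
    }
}

public class OrderProcessor
{
    private readonly OrderTotalCalculator _calculator = new OrderTotalCalculator();

    public decimal Process(decimal subtotal, decimal taxRate)
    {
        // ...validation, persistence, and the rest of the public method's work would happen here...
        return _calculator.CalculateTotal(subtotal, taxRate);
    }
}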

I asked what constituted “enough”, and prefaced the question by saying that his was a common and understandable sentiment. Over the course of time, I’ve grown used to people having certain heuristics for when code is “classworthy”. Perhaps if it’s somewhere between 3 methods and 20 methods. Or, maybe it’s a matter of lines of code — 50 lines is too little, but 1000 is too many. With enough of these heuristics in place, we could say that all methods should be 20 lines and all classes should be 10 methods and therefore 200 lines.

His response was what people’s response always seems to be (paraphrased): I can’t describe “classworthiness”, but I know it when I see it, and this isn’t it. I countered by offering a different kind of heuristic for “classworthiness”, but I’ll get to that later.

Cleaving Classes in Two

The next day, I found myself looking at the following code structure:

public class MyClass : IDoSomething, IDoSomethingUnrelated
{
    #region IDoSomething
    //IDoSomething fields
    //IDoSomething properties
    //IDoSomething methods

    #endregion

    #region IDoSomethingUnrelated
    //IDoSomethingUnrelated fields
    //IDoSomethingUnrelated properties
    //IDoSomethingUnrelated methods

    #endregion
}

What we had here was two classes inexplicably crammed into a single class. But, that wasn’t what was concerning me (though I wasn’t thrilled about it). My task was to add the IDoSomethingUnrelated interface implementation to the following:

public class MySecondClass : IDoSomething
{
    #region IDoSomething
    //IDoSomething fields
    //IDoSomething properties
    //IDoSomething methods

    #endregion
}

My first thought was the same thing that I hope yours is. Since there were two completely separate concerns in the first class, I would simply extract one of them to its own class and take it from there. Both classes still had to implement both interfaces (this was a non-negotiable constraint imposed on me), but at least I wouldn’t need to duplicate code, and I could make the internals of the class cleaner. The Boy Scout Rule for clean coding is an excellent one to follow.
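For what it’s worth, the shape I was aiming for looks something like this (UnrelatedWorker is a name I’m inventing for illustration):

public class UnrelatedWorker : IDoSomethingUnrelated
{
    //IDoSomethingUnrelated fields, properties, and methods live here now
}

public class MyClass : IDoSomething, IDoSomethingUnrelated
{
    private readonly UnrelatedWorker _unrelated = new UnrelatedWorker();

    //IDoSomething members implemented directly

    //IDoSomethingUnrelated members forwarded to _unrelated
}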

However, there was a snag. Over the course of time, a handful of the fields had bled between the two classes masquerading as one. Not enough to give you the sense that it was anything but an accident and certainly not enough to make the concerns even remotely cohesive. It appeared to have happened by accident – expediency, one flag serving two purposes, laziness, that kind of thing.

As I was scowling at this and realizing that my exorcism of the possessed class was going to be more involved than I thought, a relationship between this task and my earlier conversation struck me.

A Root Cause of Poor Cohesion

So, how did this state of affairs come to be? Surely nobody looked at this code and thought, as they added the most recent handful of lines, “this is clean and awesome — go me!” Instead, somebody probably thought, “yuck, this is a pretty big class, but I’ll hold my nose and pile on just a touch” (the class was in the 1500-line neighborhood). Was I the first boy scout to get here and say “enough!”? Apparently, I was, or it wouldn’t have looked like that.

But, surely, someone else must have considered it. My guess is that they considered extracting a class, but that they decided against it when they found, as I had, that the mess wasn’t quite as easy to detangle as it appeared. This class had become just cohesive enough that refactoring was a pain, but not cohesive enough to be considered anything other than “almost completely lacking cohesion”.

It wasn’t always like that. At some point, this thing was easily separable (if at no other time, at least when it was mashed together initially). But, between then and now, the regions were added, and fields/flags were carelessly and haphazardly re-appropriated depending on context. So, the real question is why someone didn’t see this as two separate classes right from the get-go. I traced the history of it back some and saw that when the two-classes-as-one paradigm was introduced, the class was relatively small compared to the rest of the code base. It has since grown steadily.

To think of it another way, when the class was first conceived, neither of its main responsibilities was “classworthy” on its own. Not enough methods in the interfaces, or not enough lines of code, or something. Better to reduce the class footprint and combine them, and we can always separate them when the chimera gets too big. Of course, the best laid plans of mice and men… the person who thought that was probably not the same person that started introducing unnecessary cross-coupling, and neither of those people was me, looking at this class, thinking that it should be two classes, and realizing that performing the separation surgery wasn’t going to be as easy as one might think.

Getting “Classworthy” Wrong

To return to my conversation about when to break out a new class, I responded that, to me, “classworthiness” had nothing to do with the number of methods, properties or fields, nothing to do with too many or too few lines of code, and nothing to do with how many classes exist in the solution at the moment. These are all red herrings as far as the decision as to what a class should be. A concept is “classworthy” to me if it encapsulates behavior and (usually) state, does one thing and does that thing well, and has only one reason to change (see the Single Responsibility Principle).

If you read about the SRP, you’ll notice that there’s nothing in there about how many lines of code or how many methods are in the class. Don’t get me wrong — I’m very leery of classes that are 500+ lines of code and have 20+ methods, but this is a symptom rather than the illness. Those classes are (usually) bad not because of any number, but because something with that much code and that many methods generally isn’t focused and cohesive — it’s generally not doing one thing and doing it well. It’s a Swiss Army Knife Class.


[Image from Derick Bailey at Los Techies.]

I’ve never really understood the reluctance to create a new class. It’s as if classes were a limited resource like oil, and people want to conserve them. This isn’t limited to the person that I was talking to — it’s fairly pervasive, in my experience. I do understand that you can take my sentiment to an absurd extreme and create a class for every line of code or some such thing (though this would create different metrics problems by raising coupling), but that isn’t what I’m talking about. I’m talking about reluctance to factor out a class because it will only have 70 lines of code or it will only have two methods.

If you find yourself thinking this way, pull back for a minute and ask yourself whether the code you’re considering pulling out is a separate behavior or not. If you have a class that receives data transfer objects from the GUI and validates them, the “and” tells you that you are, in fact, doing two separate things. Perhaps the validation logic just checks two fields to see if they’re empty or not. Is it really worth a separate class for that?

Yes! Why not decouple validation from staging? You’re not going to deplete the world’s class reserves with that one class creation. But, what you will get is a scheme where the validation of your object is decoupled from the staging of that object and where, if one of those things changes, the other will not be affected. And, to tie this all back together, you also get a nice, new seam for unit testing.
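Sketched out with hypothetical names, that separation might look like this:

using System;

public class CustomerDto
{
    public string Name { get; set; }
    public string Email { get; set; }
}

public class CustomerDtoValidator
{
    // "Only" two checks, but it's a distinct behavior with its own reason
    // to change, and it now has its own seam for unit testing.
    public bool IsValid(CustomerDto dto)
    {
        return !string.IsNullOrEmpty(dto.Name) && !string.IsNullOrEmpty(dto.Email);
    }
}

public class CustomerStager
{
    private readonly CustomerDtoValidator _validator = new CustomerDtoValidator();

    public void Stage(CustomerDto dto)
    {
        if (!_validator.IsValid(dto))
            throw new ArgumentException("Customer is missing required fields.", nameof(dto));

        // ...stage the validated object, with no knowledge of the validation rules...
    }
}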


Making The Bad Impossible

Don’t Generate Compiler Warnings

I was doing some work this morning when I received a general email to a group of developers for a large project that sort of pleaded with, and sort of demanded, that members of the group not check in code that generated compiler warnings. And, I agree with that, although, with the way that particular project is structured, it is easier said than done in some cases. There are dozens and dozens of assemblies in it, not all of which are re-compiled with a build, so someone may generate a warning, not notice it, and then not see it in subsequent runs if that assembly isn’t recompiled. Habitually doing a rebuild instead of a build is impractical here.

Something bothered me about this request and I couldn’t put my finger on it. I mean, the request is reasonable and correct. But, the email seemed sort of inappropriate in the way that talking about one’s salary or personal politics seems sort of inappropriate. It made me uncomfortable. The “why” of this kicked in after a few minutes. The email made me uncomfortable because it should have no need to exist. No email like that should ever be sent.

Don’t Do X

Some time back, I blogged about tribal knowledge, and why I view it as a bad thing. Tribal knowledge is a series of things that accrue over time in a project or group that are not manifest and can only be learned and memorized by experienced team members:

You ask for a code review, and a department hero takes one look at your code and freaks out at you. “You can never call MakeABar before MakeABaz!!! If you do that, the application will crash, the computer will probably turn off, and you might just flood the Eastern Seaboard!”

Duly alarmed, you make a note of this and underline it, vowing never to create Bars before Bazes. You’ve been humbled, and you don’t know why. Thank goodness the Keeper of The Tribal Knowledge was there to prevent a disaster. Maybe someday you’ll be the Keeper of The Tribal Knowledge.

We’ve all experienced things like this when it comes to code. Sometimes, they’re particulars of the application in question, such as in the tribal knowledge examples. Sometimes, they’re matters of organizational or other aesthetics, such as alphabetizing variable declarations, or maybe some other scheme. Sometimes, they’re coding conventions or standards in general. But, they’re always things that a person or group decide upon and then enforce by asking, telling, demanding, cajoling, threatening, begging, shaming, etc.

As a project or collaborative effort in general proceeds, the number of these items tends to increase, and the attendant maintenance effort increases along with it. One rule is simple and easy to remember. When the number gets up to five, people sometimes start to forget. As it balloons from there, team members become increasingly incapable of using a mental checklist, and the checklist finds its way to a spreadsheet or, perhaps, a wall.

When this happens, the time spent conforming to it starts to add up. There is the development time, and then the post-development checklist validation time. Now, theoretically, a few times through the checklist and developers start tending to write code that conforms to it, so the lost time is, perhaps, somewhat mitigated, but the check itself is still more or less required. If someone gets very good at compliance, but makes one typo or careless error, a public shaming will occur and re-up the amount of time being spent on The List.

And, The List probably grows and changes over time, so no one is ever really comfortable with it. The List isn’t just tribal knowledge – it’s “meta-tribal knowledge”.

Stop the Madness

Is The List really necessary? People who enjoy dealing with the DMV or other bureaucracies may think so, and most developers may resign themselves to this sort of thing as an occupational hazard, but why are we doing this? Why do we have to remember to do a whole bunch of things and not do a whole bunch of other things? We’re programmers – can’t we automate this?!?

I believe that the answer to that question is “absolutely, and you should.” To circle back to my introductory example, rather than send out emails about checking for compiler warnings, why not set the build to treat warnings as errors? Then, nobody has to remember anything – generating a compiler warning will impede immediate progress and thus be immediately rectified. No emails, no post-its, and no entry in The List.
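In a .NET project, for example, this is a single MSBuild property in the .csproj file:

<PropertyGroup>
  <!-- Promote every compiler warning to a build-breaking error. -->
  <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
</PropertyGroup>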

Alphabetized variables? Write a script on the build machine that parses your checked-in files. Casing conventions? Write another parsing script, or use something like DevExpress to automate it. Dependency management? Organize code into assemblies/libraries that make it impossible for developers to include the wrong namespaces or use the wrong classes.

And that’s the meat of the issue. Don’t tell people not to do bad things – make doing them impossible via immediate feedback. If I’m expected not to generate compiler warnings, make it so that I can’t run my tests or application until I clean up my mess. That sort of scheme retains the learning and feedback paradigm, but without being a drain on productivity or requiring others to get involved. Constructor injection is based on this principle, when you really think about it. Using that injection scheme, I don’t need to wonder what I have to do before instantiating a Foo – the compiler tells me by not building if I don’t give Foo what it wants. I don’t have to go ask anyone. I don’t have to find out at a code review. I can just see it for myself and learn on my own.
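A minimal sketch of that idea, with a hypothetical IBarDependency:

using System;

public interface IBarDependency
{
    void Help();
}

public class Foo
{
    private readonly IBarDependency _bar;

    // You simply cannot construct a Foo without giving it what it wants;
    // the compiler enforces the rule, so no checklist entry is required.
    public Foo(IBarDependency bar)
    {
        _bar = bar ?? throw new ArgumentNullException(nameof(bar));
    }

    public void DoWork()
    {
        _bar.Help(); // guaranteed non-null by construction
    }
}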

Obviously, it’s not possible to make all bad things impossible. If you can do that, you probably have better things to do than code, like running the world. But, I would advocate that we go after the low-hanging fruit. This is really just an incarnation of the fail-early concept, but that concept doesn’t just describe code behavior at runtime. It can also describe the feedback loop for development standards and preferences. Static analysis tools, auto-checkers, rules engines, etc. are our friends. They automate code checking so that it need not be done by asking, telling, demanding, cajoling, threatening, begging, shaming, etc.


What’s In A Name? Probably the Quality of Your Code.

Name Smells Revisited

A long time ago, I blogged about a concept I dubbed “name smells”. Among other things that I described making me wary of a piece of code just by its name were the words “Helper” and “Manager” in a class name. My issue with this was and is that this tends to be indicative of a class badly coupled to whatever it’s “managing” or “helping”.

Today, I was adding some code to a large, long-standing code base, and I ran the Microsoft code metrics on the large assembly to which I was adding it. I then went looking for my class to see its metrics, for reporting purposes in an implementation document I was constructing. Out of curiosity, I decided to sort the classes by Maintainability Index to see how mine stacked up. I was pleased to see that it had a decent enough index of 88 and was relatively near the top.

From there, I became curious to see which classes were near the bottom, so I scrolled up, since they were in ascending order. As I got up into the 60’s, 50’s, and 40’s, I noticed an unmistakable trend. The worse the metric level, the more frequently I was seeing classes with names like “FooHelper”, “BarManager”, “BazHandler” and “ReBarController” (not to be confused with Model-View-Controller controllers, these ‘controllers’ were not implementing a pattern of any kind). When I scrolled back to the 70+ range, there were exactly 0 classes with names like these. My gut take had been somewhat empirically reinforced, if not outright validated.

Blame the Name?

So, what gives? Does Microsoft look for keywords like those and dock them 20 maintainability points or something? I opened a few of them up for a look and beheld gigantic methods, lots of state flags, and all of the things you might expect in a class with a poor maintainability score. So, Microsoft does not, apparently, share my intrinsic bias against classes with names like these. It must be something else.

Now, obviously I think there’s a correlation between hard-to-maintain classes and these names, or I wouldn’t have blogged about it. But still, you’d think that in a pretty large assembly, there would have been one that bucked the trend. The exception that proves the rule…? But no, not a single one.

I spent a bit of time pondering this, glancing back at my old post. When I wrote that, I was thinking that generally what you see is some class evolving to gargantuan proportions, at which time the developer plays Solomon and cleaves the class awkwardly in two with one half of the class being the “Manager” and the other the “Managee”. (I admittedly don’t know exactly how these things get released into the wild, since I don’t do it, myself). But, I don’t think that’s always the case. I think that maybe, sometimes, they start out innocently.

If they start out as the result of Solomon the Developer, they’re hosed from the beginning. But, if they start out simply, they must rot. I figure that some of the ones I was looking at must have started out innocently, as they seemed to have the air of a small, functional house with lots of haphazard additions built by someone’s pitied brother-in-law the contractor. So, what is it that dooms them all, regardless of origin or conception?

Yes! Blame the Name!

What they all have in common is their names. Generally, it’s common practice to give classes noun names and methods verb-noun names (with optional modifiers). So, you have a “Customer” class with a “string GetFullName()” method. Service-oriented classes tend to have noun names that are or contain deverbal nouns, such as CustomerValidityEvaluator or EventAggregator… or, CustomerHelper. We’ll get back to this in a moment.

Humans are habitual organizers, and programmers tend to be even more so. We like to categorize, arrange, place in buckets, and otherwise assign hierarchy. But, we’re also naturally lazy. If we have a framework in place for organization, we’re happy enough to behave incrementally, and the space stays neat and organized. If, on the other hand, no framework exists, the Broken Windows Theory goes into effect, and the mess simply accumulates.

Now, whether people pay lip service to Big Up Front Design (BUFD) or not, most people actually practice some kind of emergent design or another in their code. You don’t necessarily know exactly what every class is going to do when you start, so some exploratory coding is probably done on the side, even in the most Waterfall-committed shop. At some point, each class starts with a (relatively) blank slate, and there is some uncertainty as to what this class is going to do, exactly. TDD lets you drive this with correct functionality as a client. BUFD dictates to you everything that the class should do, but you still deviate, because BUFD is about as practical as an ice cream grill. No process at all renders the slate totally blank.

As the coding phase wears on, more functionality will be added to this class, and it might stay neat and organized, or it might start to look increasingly like the latter, frameworkless mess. And, I submit that a big determining factor in whether or not that happens is the name of the class. A class with a name like “CustomerNameTruncator” is likely to repel random functionality the way a Rain-X’ed windshield repels water, because people are hesitant to do willfully wrong things in code. If you’re weighing your options as to where to place a method that reads a customer table in the database, you’re probably going to avoid CustomerNameTruncator, because if anyone sees your name on that, they’ll think that you’re daft.

Well, you don’t want that, so you get clever. You ‘refactor’ the “CustomerNameTruncator” to be called “CustomerHelper” and throw your database method in there. Problem solved! Nobody can tell you that it doesn’t belong there because your method is certainly helpful. And now, we have a design pattern. Anything that might be helpful to customers goes in here. The class can truncate the customer’s name, and it can read the customer table, and, even send emails to customers informing them about new products that they might enjoy. I mean, why not? That’s helpful, and with the database method in there, it already knows their email addresses. Heck, it might be even more helpful if it read the new offerings from the offerings table, too. This class just keeps getting more and more helpful.

And, as it gets more and more helpful, its maintainability index plummets, and it rots. It does everything under the sun, and it gets coupled to more and more unrelated, unfocused things. The person that changes its name (or names it that from the get-go) has broken the first window in the building, and every other developer that stops by thinks it might be sinfully satisfying to toss a rock through another window.

That is the problem with this naming scheme. It’s so vague that whatever functionality you add to it pretty much always belongs there, at least according to its name. Good OOP does not allow this practice in your design, but the name begs to differ. It pours you a stiff drink and invites you to throw your dirty clothes on the floor and worry about it later.

As an exercise, if you see a class named in this fashion, open it up and try to describe its behavior in a simple sentence. I don’t even need to open “CustomerNameTruncator” to do that. But, how do I describe “CustomerHelper”? I bet it’s a run-on sentence with a lot of “ands”, and probably a number of “ifs” and “buts” as well, for good measure. When you’re describing a class’s function and you’re saying “and” a lot, there’s a good chance that class has too many responsibilities.

Well, The Windows are Already Broken, so Whatcha Gonna Do?

The fact that you’ve got Helpers and Managers liberally sprinkled throughout your code base is not an unsolvable problem. In the formulation of the Broken Windows Theory, the important realization was that it was time to fix all of the windows and keep them fixed. The same thing applies here.

The very first step is to change the names of those classes. Don’t put it off because you don’t have time for a refactoring, and don’t be intimidated. Just change the name. If the class has a bunch of responsibilities, pick one. Turn “CustomerHelper” back into “CustomerNameTruncator” and let the database and email methods look ridiculous. Your path to refactoring will be obvious.

Also, realize that it always looked ridiculous to the eye of a critical OO designer. Naming it Helper won’t fool anyone who actually looks at the class, or any objective calculator of maintainability. The fact that you can argue that the class’s methods don’t disagree with its name doesn’t alter another fact — the class is a hulking, 4000-line behemoth with no clear single purpose. You don’t get points for coming up with a name so vague as to be meaningless, any more than someone would for putting all code in a giant method called “DoWhatMustBeDone()”.

And then, take the experience with you and pass it on. It isn’t just “Helpers” and “Managers” that are the problem – it’s any class or method with a name vague enough that it could do anything imaginable and still be considered logically consistent. Keep an eye out for that in your code, and in that of others, and steer empty rooms toward being well organized, rather than messy, as they fill up.