DaedTech

Stories about Software

By

Developer Process Gerrymandering

Gaming the System

As projects get a little behind schedule or perhaps a little contentious in terms of scope, I’d say it’s common for people to start taking steps to indemnify themselves against blame. I’d also say that this is fairly natural. There’s nothing more likely to attract wagging management fingers than a project behind schedule and/or over budget, so it makes sense to do a little rehearsing as to what one is going to tell them. Furthermore, it’s somewhat natural to try to showcase that one’s own efforts were a general positive even against a negative backdrop.

This is reminiscent of the “plus/minus” tracking in sports such as basketball, where it’s possible for a team to outscore its opponent when player X is on the floor, but still lose. The most common reason for this is that player X is a star, but the rest of the team is so bad that even his or her stellar play cannot overcome the shortcomings of the rest of the team. Long story short is that every developer wants to be that player X on a winning team if possible. But, absent that, being player X on a losing team is fine, so long as management and peers notice.

In such a situation (potentially “losing effort”), I’ve recently observed the following developer behaviors:

  1. Obvious (to me, if not management) estimate sandbagging.
  2. Wheedling, begging, berating, manipulating and generally badgering defect reporters into retracting reported defects.
  3. Refusal to fix obvious and embarrassing problems with the software because said solutions weren’t “in the requirements statement”.

These behaviors aren’t particularly difficult to understand in the context of seeking to create the impression of a better “plus/minus” score for a developer. Consider the following explanations:

  1. Estimate sandbagging allows a developer to finish way ahead of schedule, creating a heroic sort of aura for a time
  2. Improving code quality is one way to reduce the number of defects reported against one’s code. Browbeating people into not reporting the defects is another way to do this and more expedient in the short term, in a sense.
  3. Refusing to address something on the grounds that “I never had a requirement for this” is a nice two-fer; it creates the impression that not addressing the shortcoming was a conscious decision rather than an oversight and it also identifies a different blame target (whoever is responsible for requirements).

This frank categorization may induce a chuckle or two, but I doubt I’m breaking any drastically new ground for people who work in programming shops where this kind of thing probably occurs with frequency somewhere between “here and there” and “constant”. I do think there is value in coining some terms around it though so that we can examine the issues it causes, its own root cause, and some potential solutions.

Elbridge Gerry, An Inspiration

In the early 1800’s Massachusetts had a governor by the name of Elbridge Gerry. A Democratic-Republican by party affiliation, Gerry saw trouble in the tea leaves for his own party in the state senate and decided to do something about it. You might think that he decided to take his case to the people, touting the benefits of his party and convincing them to vote for Democratic-Republicans. You’d be wrong though.

Convincing the public to like you is hard and time consuming and there’s no guarantee of success. Gerry had a more expedient idea. He simply signed a law that radically altered the voting districts in his state in such a way that a majority would result in the state senate without actually having the voters to support it. One of the newly created districts was so contorted and weirdly shaped that political opponents said it looked like salamander. Gerry’s Salamander became the term “Gerrymander” and the phrase stuck around. Today it retains its definition as a sleazy but legal practice in which politicians stack the deck in favor of their own party as they’re dealing out political cards to the voters.

Getting back to the topic of software development, the behaviors I mentioned in the first section and their attendant explanations are all examples of a behavior similar in intentions to gerrymandering. In our world, the ideal for a software group is obviously to write good code, provide good estimates, and deliver quality software on time for a good value. As an individual the ideal is to have a good “plus/minus” within the group toward that same end — to write better than average code, delivered more quickly than average for less money than average. That’s all noble, but it’s also hard. And so some developers prefer to emulate Gerry and change the rules of the game (distort time, change definitions, employ technicalities) rather than playing it well. In the “plus/minus” world, this is like lobbying to play against the third string of the other team to pad one’s statistics.

Eliminate the “Plus/Minus”, Eliminate the Gerrymandering

So, if you’re managing a development team, how do you eliminate the counter-productive gerrymandering behavior? Eliminate the individual “plus/minus” ratio mentality. Imagine that you have a software project and you tell everyone working on it that their performance and pay/bonus were going to be dictated by the ranking on a scale of 1-10 that users gave the software in a survey (encompassing usability, quantity/quality of features, etc). Let’s think about what happens to the three behaviors at the start of the post:

  1. Estimate sandbagging is pointless because coming in ahead of a schedule that you’ve defined has no bearing on your bonus and performance review.
  2. Badgering defect reporters makes no sense because the goal is fewer defects in the user’s hands rather than fewer defects reported against the individual.
  3. Refusal to fix problems (finger pointing) is also silly because whining that someone else is at fault is cold comfort in the face of a bad review and no bonus

Now, I don’t necessarily believe that there will be entirely smooth sailing in this or any other construct. But, eliminating the focus on individual incentives and performance has some nice side effects including the one on which we’re focused: removing the tendency to try to game the system without adding value. (Another nice side effect is that you tend to downplay the unfortunate “rockstar” designation that panders to megalomaniacal personality types who tend drastically to overestimate their own irreplaceability and scare others into the same).

But How to Measure Individual Performance

Still, the question of individual performance evaluation remains. It’s all well and good to reward or punish the whole team as a unit, but there are HR org charts, career maps and other individual concerns to think of. So how do you address this while still avoiding the gerrymandering?

It’s all about the code and source control. There are a lot of metrics out there for evaluating developer productivity from the obtuse (counting lines of code generated or altered) to the anecdotal (does this guy seem to do a good job), but I’m not really aware of one that captures the truth as well as looking at someone’s changesets/commits in the context of a code base. So, what if you had some impartial and extremely knowledgeable party read through a code base, looking at its architecture and tendencies, and then read through the change sets to give each developer some kind of rating or at least partial rating.

It seems almost as if you could create a cottage industry out of this and offer it as a consulting service. This way, the consultant is truly removed from any office politics and can focus entirely on the code. If one did this as a craft, I’d imagine that the addition of data points to the general corpus would cause the process to become quite refined over the course of time and probably generate some relatively objective, interesting and meaningful metrics.

Of course, this wouldn’t be perfect. It’s only as good as the evaluator is sharp, and it has the shortcoming of not capturing “intangibles” on a project — maybe someone wrote little code but spent a lot of time helping other people with source control and other technical issues, for instance. Code and changesets are mute witnesses to the story of the project, but they don’t witness everything.

Still, I think the notion is an interesting enough one that it bears exploring. Developer (or any team member) posturing and gerrymandering are counter-productive activities that put the developer’s interests above the projects but, more problematically, misalign those two sets of interests. The combination of team effort and metric individual evaluation could help prevent that sort of thing.

By

Abstractions Are Important 5 – Type Consistency

For the fifth post in this series, I’m going to start with a mini rant.

A Digression (Rant) About Enum

Over the last couple of years, enumerations/enums have been dying a slow death in the world of code that I write. I wasn’t every really an avid user of them, but they’ve definitely been declining even for me to the point of virtual-non existence. I’m not sure exactly what it is about them or about me that’s spurring this, but I’m not sorry about it at all. I don’t miss them.

I think perhaps the motivation has been an increasing desire to use polymorphism at all levels, even when the object I’m making doesn’t seem “classworthy”. That is, why would I create a “CardSuit” enum when I can just create a first class type for Suit? I think perhaps another factor in this approach of mine is that enums tend to go hand-in-hand with switches and I consider this pairing to be an anti-pattern. Even with an enum like “Suit” that is really just representing mutual exclusion, this is inevitable somewhere:

switch(mySuit)
{
    case Suit.Spades:
        return "Spades";
    case Suit.Diamonds: 
        return "Diamonds";
//etc
}

And, why do that when I could just have something like this:

public abstract class Suit
{
    public abstract string LongSuitName { get; set; }
//etc, as Suit needs additional behaviors that would otherwise go in switches
}

This way, I can later plop suit specific designations into inheritors to my heart’s content without hunting for switch statements littered throughout the code. (Also, for language agnostic readers, C# enums are a different beast than their Java counterparts — in C#, they’re really just glorified collections of constants).

Apparently, I’m not alone in this sentiment. Just to stir things up, I googled “C# enums are evil” and came up with this interesting link from stackoverflow guru Jon Skeet:

Since working on a Java project last year, I’ve been increasingly fed up with C#’s enums. They’re really not very object oriented: they’re not type-safe (you can cast from one enum to another via a cast to their common underlying type), they don’t allow any behaviour to be specified, etc. They’re just named constant integral values. Until I played with Java 1.5’s enum support, that wouldn’t have struck me as being a problem, but (at least in some cases) enums can give you so much more.

It’s good to see that the man who defines “10” when a recruiter or someone asks you to rate yourself 1-10 on strength in a language feels the same way as me. And, I think he really nails it with the non object-oriented comment. Enums make me feel like I’m writing kernel code in C or something when I use them.

If You’re Going to Use Them…

All that said, it’s not as if they’re going to be excised from the language tomorrow — we might as well make sure they’re used in a way that makes sense if and when they are used. In a code base that I’m in from time to time (and I’m obfuscating the problem domain a bit, but leaving the intent and meaning intact), the concept of “side” exists in the sense that anything in the domain must be left or right. Think of it as though we’re shopping for a pair of shoes. Here is how “side” is represented:

public enum Side
{
    None,
    Left,
    Right,
    Both
}

At first blush, this makes sense. We might have no shoes on, the left shoe, the right shoe, or both shoes. But, ask yourself this: “what is ‘side’ and what is it made of?” Left and right are somewhat mundane, but what about none and both? Is “none” a side? Is “both”? Do I put my “none” shoe on before the left shoe and right shoe, at which time I put on the “both” shoe? Can you infer the usage of this thing from looking at it? No, you really can’t…

And the reason you can’t infer the usage is that the enum consists of two sides and two expressions of quantity of sides. This enum is a chameleon — depending on where you’re standing and what part of it you’re looking at, it can be two different things. And, I submit that this is bad — if you’re going to use enums at all, use them to represent simple, mutually exclusive concepts (like the aforementioned “Suit” in the problem domain of playing cards).

What’s the Harm?

Let’s take a look at an example (stripped down for brevity) client of this enumeration:

public class Shoe
{

}

public class EnumClient
{
    private readonly Dictionary _shoes = new Dictionary();

    public void PutOnShoes(Side side)
    {
        if (side == Side.None)
            return;
        if (side == Side.Both)
        {
            PutOnShoes(Side.Left);
            PutOnShoes(Side.Right);
        }

        _shoes[side] = new Shoe();
    }
}

What we’ve got here is kind of cute and clever. If I want the client to put a shoe on, I pass in the side. But, if I want to tell it to put both on, then I can just pass in Side.Both, and the method knows how to handle this. Sweet! The only immediate drawback is the fact that we have to handle Side.None. Clearly that should be a no-op that clients should avoid, but it’s just as valid as any of the other things from the compiler’s perspective, so we manually no-op.

But, is this cute bit of recursion actually as sweet as it seems? What if we write another method like this called “TakeShoesOff”? What if we write a class called ShoeSalesman that retrieves pairs of shoes? We almost always want him retrieving both shoes for clients, so we’re probably going to want to perpetuate this pattern into his methods as well, probably by copy and paste to save time. How about a ShoeStoreCashier ringing up pairs of shoes? We can take care of that with our good buddy copy and paste too. There’s no method about shoes that we can’t handle that way!

But, wait a second? Isn’t this starting to be kind of a code smell? If this implementation is so sweet, why does it seem all wet in the face of DRY? If you have actually done this in hundreds or thousands of places, you should be getting a sinking feeling in your stomach right about now. That feeling is the feeling that your code is suddenly and subtly incredibly brittle. This cute little encoding over the enum is actually an algorithm that is everywhere in your code.

Let’s say that I want to get in on some cuteness myself. What I’d like to do is write a method and say, “I don’t care which side you pass me, so long as a side exists — I’ll just perform an operation on the first side that I have available, if any.”

public enum Side
{
    None,
    Left,
    Right,
    Both,
    Either
}

public class NewEnumClient
{
    private readonly Dictionary _shoes = new Dictionary();

    public void RemoveShoes(Side side)
    {
        if (side == Side.None)
            return;
        if (side == Side.Both)
        {
            RemoveShoes(Side.Left);
            RemoveShoes(Side.Right);
        }
        if (side == Side.Either)
            RemoveShoes(_shoes.First().Key);

        _shoes.Remove(side);
    }
}

I’ve now extended the cuteness to be more “flexible”, and I’m pretty pleased with myself, so I go ahead and deliver this code. No unit tests break, no problems emerge that I can immediately see, so life is good. But then, weird problems start to crop up after a while. People file bug reports saying that they add shoes and see nothing on the screen or that they remove shoes but nothing is deleted from the database. It’s kind of a mystery at first, but I’m forced to get to the bottom of this as the trickle of bug reports starts becoming an avalanche.

What’s going on?!? Well, what’s going on is that all of the Cute 1.0 code can’t handle the new enum I’ve defined for Cute 2.0. So, what does it do? Well, sometimes it skips it altogether and no ops. Sometimes it adds a new key to the dictionary — a key for which it never checks. Sometimes it throws some kind of exception where it falls into a “default” state in a switch statement someone has defined. Sometimes it throws up an error message box informing the user, “This should never happen — email Erik!” for the same reason, which is doubly bad since I’m probably going to be featured on the Daily WTF.

Wow, bummer. But, no big deal. I’ll just roll up my sleeves and upgrade Cute 1.0 to Cute 2.0 everywhere. How many can there be, right? Uh oh. Hope I’m not doing anything this weekend. For everything copy and paste programming lacks in being a good idea, it makes up for in its ability to spread through a code base like kudzu. It’s too late to revert Cute 2.0 since I now have clients that depend on it, so I’d better roll up my sleeves and get pastin’ because it’s going to be a long Saturday with some Mountain Dew and 7,400 methods that I need to change.

The Real Problem

It should be fairly obvious that any “pattern” requiring you to add 5 or 10 lines to the beginning of a bunch of methods is actually an anti-pattern, particularly if those lines are similar or identical. And some might stop here and say that getting cute in the first place here was the problem, but I contend that this is just a symptom. To tie this back in with the series, the real problem is one of abstractions.

As I asked earlier, what is “side”? Really, can you tell me? I mean if I have a “Customer” object in my domain, you’d probably say something like “that models a customer of the enterprise for which this application was written” or else maybe you’d just smack me upside the head for asking such an obtuse question. But for “Side” in Cute 1.0? Without using the word “side” in your definition? You might say “well, it represents where we can put a shoe” or “it represents the directions left and right”, and you’d be correct for two of the enums values. You might say “well, it represents a number of places that you can put a shoe” or “its a flag that you need to use to tell your methods how to behave” and you’d be right for the other two values. Hmmm….

I know! It’s a doo-dad you can use to index a hash! That’s probably the most accurate way to describe it because that is unfortunately true of all 4 values. Unfortunate because for two of the values, it makes no absolutely sense to do that — but at least it’s true, so we’re getting somewhere. At the end of the day, though, I think all you can really say of Side is “it’s an enum and it’s up to clients to figure out what to do with it.” And that, my friends, makes it a bad abstraction.

Here are some other enums that would be poor abstractions

public enum Stuff
{
    Grape,
    Twelve,
    Microwave,
    Restart,
}

public enum CardSuit
{
    Spades,
    Hearts,
    Diamonds,
    Clubs,
    Black,
    Red,
    NoCard
}

public enum Dance
{
    Foxtrot,
    Tango,
    Waltz,
    DoesntLikeToDance,
    Miscellaneous
}

If you’re going to use enums, they ought to be values that are mutually exclusive, constitute a complete set, and form a clear abstraction. The first one is obviously nonsense, but the second two are enums where the person writing/extending them is about to get cute. They’re about to take an enum that has a set of mutually exclusive values and reappropriate it to use as a flag to tell their client code how to behave. Think of the code that’s about to be written — things like “if the suit is black, then do it for spades and clubs” and “if the person passed in has favorite dance property set to doesn’t like to dance…”

If you find yourself doing these kinds of cute things ask yourself what you’re really trying to accomplish. In the suit case, wouldn’t it make more sense to parameterize a method so that you could pass in multiple suits for the client code to operate on, instead of hard-coding the iteration? In the dance case, maybe it’s time for dance to be a first class object so that a “FavoriteDance” property can be set to null or a null object.

In C#, enums are already pretty much screaming “hey, there’s something called polymorphism — use that instead of me!” Once you start adding cute, one-off flag encodings to them, you’re really dropping all pretense of enumeration being a suitable abstraction and going for the gusto with fake, procedural polymorphism forced on your clients. Please, for your sake and mine if we later work together, don’t do that. The world doesn’t have a “class reserve” ala oil that may be exhausted someday — make a class. You won’t regret it because you won’t be spending your weekends upgrading The Cute.

By

TDD Prevents Copy and Paste Programming

Copy, Paste, Fail, Repeat

Today, I was reviewing some code, and I saw a “before” that looked like this:

private void RefreshClassifierValues(Array NewValues)
{
    List ClassifiersList = Classifiers;
    int i = 0;
    foreach (Classifier classifier in ClassifiersList)
    {
        classifier.Value = NewValues.GetValue(i++);
    }
}

The first thing that stuck out to me was that I really wanted to stick “var” in there a couple of times because I read the word “classifier” so many time it lost all meaning. The second thing that struck me was the after:

private void RefreshClassifierValues(Array NewValues)
{
    List ClassifiersList = Classifiers;
    int i = 0;
    foreach (Classifier classifier in ClassifiersList)
    {
        classifier.Value = NewValues.GetValue(i++);
    }
}

private void RefreshClassifierEnabledStatus(bool value)
{
    List ClassifiersList = Classifiers;
    int i = 0;
    foreach (Classifier classifier in ClassifiersList)
    {
        classifier.Enabled = value;
    }
}

If you’d like to play a game of “find the compiler warning”, by all means, go ahead, but I warn you — the next sentence is a spoiler. In the new code, “i” is never read anywhere, which will generate a warning about unused variables. Annoying, but not the end of the world, and a fairly quick fix.

But, how did the code get like this? I’ve isolated the change in a way that should make this fairly obvious if you think about it for a minute or two. Basically, someone took the first method, copy and pasted it, and modified the payload of the foreach to suit his or her needs. Of course, the new need didn’t involve reading the local index variable, so oops.

I talked with the developer whose code change it was, doing my due diligence in pointing out that this is a classic example of one of the many problems with copy/paste programming. I tried to think back to a time when I’d been burned by this sort of thing recently, when it dawned on me that this simply wouldn’t happen to me. I say that not because I’m some kind of purist that disables ctrl-V and never pastes any code anywhere (though I do keep this to a pretty strict minimum), but rather because this is simply a non-starter for someone who practices TDD.

TDD Is Full of Win

If I were going to write the method RefreshClassifierStatus, the first test that I would write is that Classifier[0].Enabled was false when I passed in false. That would go red, and then I’d write some obtuse code that set Classifier[0].Enabled to false. Next, I’d write a test that said it was true when I passed in true and, after failing, I’d set that property to the passed in value. Finally, I’d write a test that Classifier[Classifier.Count – 1].Enabled was true when true was passed in and suddenly I’d have a foreach loop executing properly.

As an aside, at this point, I’d set the list to null and write a test for how the method should behave when that happens. This seems not to have been considered in this code, which is another problem with copy/paste programming — it’s a multiplier for existing mistakes.

So where in this, exactly, do I introduce an unused variable? That’s never going to help me get any test to go green. Where in this do I ever copy and paste the method wholesale? That’s not the simplest thing that I can do. So, with these two guiding heuristics, followed rather strictly, it’s simply impossible for me to introduce needless noise into the code. TDD forces an impressive level of efficiency most of the time.

TDD is Full of Fast

Had I written this method, I would have done it in probably 5 minutes or so (and that’s a fairly pessimistic estimate, I think), generating 4 unit tests for my trouble. The person that wrote this, I’m guessing, spent 30 seconds or less doing it. So, I’m behind by 4.5 minutes. But, let’s add to that the time that this developer will spend after seeing the code review firing back up the development environment, navigating to the code, removing the line of code, re-compiling, and re-running to make sure that it’s still okay. I’m now only maybe 2 minutes behind.

Now, let’s also factor in that I just realized that problem about the null check, which I’m going to email over and suggest be fixed in the new and old methods. Now there are two test cases instead of 1, and I’m suddenly ahead in terms of development time.

Okay, so the time here is a bit contrived and it’s not as though mistakes can’t be made with TDD, but I can say that a lot fewer of them are. The point here is that copy/paste programming is supposed to be super fast. It’s what you do when there’s no time for good software development practice because you need to hurry. And yet, it’s really not that fast in the grand scheme of things. It just makes you busier and wrong more efficiently than being wrong by hand.

So, save yourself time, effort, and the headache of getting a reputation for sloppy work. Slow down and do it right. TDD isn’t required for this, but it sure helps.

By

Abstractions are Important 4 – Personal Interactions

An Announcement

I’ve enlisted the help of freelance editor Amanda Muledy to copy edit this blog. The content will not be altered in terms of substance; rather, she’ll be correcting grammar and spelling mistakes and the like. She’ll even know whether or not the semi-colon I just used was some kind of faux pas.

She has started on my oldest posts and will be working her way toward the newest ones and we’ll eventually probably work to a state where I send them to her prior to publication. So, if you see mild edits to content here and there, that’s the reason for it. If you’re interested in her services or more information along these lines, she has contact info on her blog.

And now back to your regularly schedule post…

Beyond the Code

In the second post in this series, I described in detail an abstraction that had nothing to do with programming: the pizza parlor with delivery service. Today I’d like to revisit abstractions in the real world, but not focus on static institutions. Instead, I’d like to talk about the ad-hoc abstractions that we form in one another’s lives as we work collaboratively.

Huh?

When we work with other people, what we generally do is offer abstractions to one another for the purpose of getting things done more efficiently. Have you ever said something like “you prep the chicken, and I’ll start the grill”? This is an ad-hoc, process-based abstraction. What does prepping the chicken entail? Maybe a dry drub, chopping it into chunks, etc. But, the interesting thing is that the person starting the grill doesn’t care at all, as long as it’s taken care of. Likewise with the starting of the grill. Is it a charcoal grill? What kind of charcoal to use? None of that matters to the person prepping the chicken.

This is an example of a good process abstraction, and these types of abstractions are the basis for the concept of effective teamwork. They allow people to work in parallel and to be more efficient than they would be working on their own, assuming that the tasks are not temporally dependent. Sure, one person could prep the chicken and also start the grill, but two people doing it simultaneously will go faster. It would be a different story if the two tasks were interdependent or mutually exclusive. “Hey, you clean the bottom of the grill while I start it” wouldn’t work very well.

So efficient teamwork requires tasks that are relatively orthogonal and participants providing abstractions to one another. (It is certainly possible to have abstraction in sequential tasks, but that’s not nearly as interesting). Without orthogonality, the team members would step on one another’s toes while completing their tasks, and without abstractions there wouldn’t be any productivity boost.

For Example

Last week, I wanted to order a server and confirm that our MSDN licensing covered the server software that I wanted to install. Management paired me with someone whose job it was to worry about these things so that I could focus on my job of being a programmer. Things started to go a bit off the rail at this point, though. I got an email about computer wholesalers and the names of salespeople at these wholesalers, distrust of Microsoft and its licensing and any number of other things that weren’t of interest to me — things that should have been abstracted away from me. My goal was simple — get a computer and install server software on that computer.

I responded to this email by saying that I appreciated all of the information, but I didn’t really want to worry about all that stuff. I just wanted the computer. I did ask if it made sense for me to call one of these wholesaler employees directly, which I would have been happy to do, since this would have provided me with an actual abstraction. The response to this email was another torrent of information that was tangentially related at best (overseas latency issues, server hard drive configuration snags, etc) , but no servers ordered and no contact information provided. This went on for several emails more emails in the same fashion (the situation is now on hold for other reasons, so there is no resolution, either).

What I had was a consistent failure to provide an abstraction, and thus I had no actual help with anything. If I have to stop working at my assignment (being a programmer) to deal with an IT assignment (ordering a server) than the person whose responsibility is the latter is providing no abstraction and thus, functionally useless to me. In this example, the time I spend reading and responding to emails could better have been spent calling Microsoft and/or the wholesaler directly. Either way, I couldn’t focus on my job, so at least something productive could have come out of it.

Why Does This Happen?

As you read through this, you’re probably both asking yourself why anyone would do this and also thinking of people who have provided a bad abstraction to you in the past. That is to say, it’s often incomprehensible to a problem solver not really mired in office politics, but it’s also a common behavior. I posit that there are three reasons someone will provide a bad ad-hoc personal abstraction:

  1. Passive-aggressive desire to sabotage an initiative
  2. Earnest over-exuberance or over-inquisitiveness
  3. Simple incompetence

The last of these is the simplest and least interesting. If someone is simply incompetent at a task, they will serve poorly as an abstraction. If you were to ask me to ask someone for the time of day in Mandarin, I would respond with either “how do you ask someone the time of day in Mandarin?” or “will you do it for me?”. I am simply ill-suited for this task since I don’t speak that language.

The second of these is what results from people who are nervous about responsibility or perhaps overly excited about it, but are willing to put in a good faith effort. Perhaps you ask me to cook you an omelette, and I respond by saying, “how much pepper would you like in it; do you care if the eggs are whisked for 20 seconds or do you want 30; these are those brown eggs – is that alright; do you want me to use PAM or oil; etc?” I mean well, but I’m defeating the purpose of what you asked of me. By bogging you down in these details, I’m forcing you to spend more time helping me than was your intent and I might well be annoying you. This sort of behavior might result from me really wanting you to enjoy the omelette or me being afraid that you’ll yell at me if I mess something up, but there’s no ill intent.

The first item in the list is probably the most interesting from a realpolitik perspective in a collaborative setting, though it isn’t because it’s a sophisticated technique. Children learn to do this at an early age and employ it without any subtlety. Ask a child to clean his room and, after some initial whining and the gears start turning:

Child: Where does this go?
Parent: In the drawer over there.
Child: But I can’t reach!
Parent: Get a stool.
Child: Where are the stools?
Parent: There’s one on the other side of your bed.
Child: It’s heavy!

The goal here is to feign simple incompetence to get out of the task, which is a rather natural thing for a child to do since parent-child relationship often doubles as teacher-student. The child banks on the parent becoming so exasperated that he or she finally exclaims “Oh, for the love of… I’ll just do it!” This gets interesting in adult interaction dynamics since it necessarily involves a sacrifice of perceived status. Whoever does this must be willing to accept a role of subordinate status and often to move down the path toward learned helplessness. If you work with me, and I respond to all of your requests by saying that I don’t know how to do things, it’s only a matter of time before you and others begin to question my competence and value in my role.

As a result, the best way to get out of work (or sabotage an initiative in general if the motivation is something other than laziness) is not to offer oneself sacrificially as a buffoon, but instead to exaggerate the difficulty of the task with which one is presented:

Oh, sure, I’ll make you an omelette, but there’s really a lot to think about here. You have to consider what kind of pan you want, what size and color of egg, the quanity of pepper involved… I mean, there are a lot of variables here. Don’t worry, though — I’ll make some inquiries as to what the best kind of ingredients are and get back to you tomorrow.

Notice that it’s very similar to the over-exuberant explanation, but with a bit more of an air of authority. It also subtly implies that the earliest possible completion of the task renders it not worth doing — who wants to wait 24 hours for their breakfast? This shift from excited questioning to authoritative is often a giveaway, but sometimes these motivations may even come off as identical. They’ll certainly be similar and it would be a shame to accuse someone of some sort of malfeasance when the explanation is benign. It’s genuinely hard to distinguish items (1) and (2) in someone else in a vacuum without track records and understanding of possible ulterior motives.

Avoiding All Of It

And, most diplomatic people will avoid this sort of accusation because it is rude and assuming. They’ll act on the surface as though you’re simply excitable. But, here’s the catch — they’ll assume that you’re passive-aggressive if it happens too often and if it’s a regular source of impediment. So really, this type of behavior, even when benign has at best a short term benefit with a long term cost. People will probably assume that you deliberately make yourself an obstacle, but even if you’re fortunate enough to avoid that, they will assume that you are incompetent and/or annoying.

Do you really want any of those labels? I don’t, myself. So, I’d stress that you make sure to be a good abstraction for others when asked. Make sure that your work makes things simpler for other people and removes obstacles from their path, rather than adding them. And, if you really are ill-suited for a task, explain why and try to think of something you can offer instead to be helpful. It really does parallel code in a lot of ways — you should hide complexity from others and present them with simple but meaningful choices.

Good abstractions in code, good abstractions in life — a good way to live.

By

Beware the Bloated Constructor

What’s a Bloated Constructor?

Yesterday, I was going through the version history of a file in some code base (from earliest to most recent) and I saw the following:

public ActiveProduct(Product product, StringVersion driver, Side es)
{
    Side = es;

    try
    {
        if (product == null)
            throw new Exception("Can't create an active product from a null product");

        Logger.LogTimeStudyStart("ActiveProduct create");

        if (driver.IsEmpty())
            CurrentDriver = product.Driver;
        else
            CurrentDriver = driver;

        _Device = Session.Instance.DeviceManager.CreateHI(es, CurrentDriver.ToString());
        _Device.ConnectionStatus += Device_ConnectionStatus;
        _Device.BatteryStatus += Device_BatteryStatus;
    }
    catch (Exception ex)
    {
        ExceptionHandler.HandleException(ex);
        throw ex;
    }
    finally
    {
        Logger.LogTimeStudyEnd("ActiveProduct create", "ActiveProduct: Creating " + driver);
    }

    State = ActiveProductState.Simulated;
    Environments = new SortedList();
    this._Product = product;
    _Options = product.AvailableOptions;
    AvailablePrograms = new ProgramSlots(this) { HIStartIndex = 0 };
    AccessoryPrograms = new ProgramSlots(this) { CanRemoveOverride = true, StartIndex = 4, IsAutoShifting = false };
    VirtualPrograms = new ProgramSlots(this) { IsVirtual = true, CanRemoveOverride = true, SlotConfig = Limit.None };
    SlotCalculator = new ProgramCalculator(this, false);
    VirtualCalculator = new ProgramCalculator(this, true);
    DFS = new DFSCalibration(this);
    Accessories = new Accessories();
    PowerBands = PowerBands.Unknown;
}

This made me sad and tired but no worries, I figured, there were a lot more revisions I hadn’t yet seen. Surely somebody did something about this later. And then, happiness:

public ActiveProduct(Product product, StringVersion driver, Side es)
{
    Initialize(product, driver, es);
}

Sweet! That’s much easier on the eyes. Let’s see just see what Initialize does and call it a day:

private void Initialize(Product product, StringVersion driver, Side es)
{
    Side = es;

    try
    {
        if (product == null)
            throw new Exception("Can't create an active product from a null product");

        Logger.LogTimeStudyStart("ActiveProduct create");

        if (driver.IsEmpty())
            CurrentDriver = product.Driver;
        else
            CurrentDriver = driver;

        _Device = Session.Instance.DeviceManager.CreateDevice(es, CurrentDriver.ToString());
        _Device.ConnectionStatus += Device_ConnectionStatus;
        _Device.BatteryStatus += Device_BatteryStatus;
    }
    catch (Exception ex)
    {
        ExceptionHandler.HandleException(ex);
        throw ex;
    }
    finally
    {
        Logger.LogTimeStudyEnd("ActiveProduct create", "ActiveProduct: Creating " + driver);
    }

    State = ActiveProductState.Simulated;
    Environments = new SortedList();
    this._Product = product;
    _Options = product.AvailableOptions;
    AvailablePrograms = new ProgramSlots(this) { DeviceStartIndex = 0 };
    AccessoryPrograms = new ProgramSlots(this) { CanRemoveOverride = true, HIStartIndex = 4, IsAutoShifting = false };
    VirtualPrograms = new ProgramSlots(this) { IsVirtual = true, CanRemoveOverride = true, SlotConfig = Limit.None };
    SlotCalculator = new ProgramCalculator(this, false);
    VirtualCalculator = new ProgramCalculator(this, true);
    DFS = new DFSCalibration(this);
    Accessories = new Accessories();
    PowerBands = PowerBands.Unknown;
}

Oh, the humanity… looks like someone sprayed a bit of cologne here and nothing more.

So, what’s the problem?

I contend that this is a fundamental design failure. Forget throwing an exception from within a try block as some kind of goto. Forget all of the static stuff like the logging and exception handling and general global state. Forget the singletons. Forget all of the other ancillary problems. The design failure is how complicated (convoluted) this constructor is. I see this enough that I think some people would call it a development pattern (I don’t think any would claim it rises to the level of “Design Pattern”), but I firmly believe that it is an anti-pattern.

This style of code is the product of procedural, command and control thinking that I blogged about some time back. It makes no distinction between creation/initialization of an object and operations of the object. These distinct activities are blurred as the bloated constructor generates logger messages, talks to singletons, instantiates other objects, etc.

From the perspective of the procedural programmer, the constructor should take the role of making sure the object wants for nothing, having all of its creature comforts and having its every desire and whim satisfied. In the object oriented style, where dependencies tend to be inverted, the constructor has a different and more Spartan role. Its only job is to make sure that object initializes into a state where it satisfies its basic invariants (in other words, it makes sure that the object instance starts off in a valid state and nothing more).

Let’s consider why the latter, object-oriented approach tends to be a better fit in object oriented languages (and, I would argue, in any state-based paradigm, but I’ll focus on OOP here).

1. Violating Developer Expectations

Imagine that you’re newly assigned to a code base and you want to start understanding how it works. An excellent way to do this is to write a few unit tests (perhaps to keep or perhaps just for throw-away experimentation). You pick some class that you’d like to test and figure you’ll experiment with its API before digging into its code. So, you create a new instance of that class and…

Oops. Your continuous testing tool crashes. Huh. Guess you’ll have to look in the code now (this is never a promising sign — if you have to inspect a class’s code to understand why it’s crashing, it’s probably not a good abstraction, especially if this is true of the constructor.) So, you look in the code and see that there’s a reference to a lazy-loading singleton that lazy-loads a few other singletons and one of those singletons is trying to read files off of a network somewhere and….

Alright, forget it. Let’s move onto another class. You instantiate that class and your testing tool doesn’t crash, but it does turn red. Huh. Back into the code. Ah, this time the class under test doesn’t touch any static stuff or singletons, but it instantiates another class that does. The exception handling is a little better here, so you just turn the testing tool red rather than completely crashing it, but this class is hosed for testing as well.

As a new developer, you’re instantiating classes and expecting that they’ll be instantiated. What’s happening instead is weird, bad, unpredictable things that force you to open the source code to figure out what on Earth is going on. You’re badly betraying a trust other developers put in you to do what you advertise (i.e. putting together the minimum invariants of the class).

2. Testability Nightmare

And, looking back at the narrative from the previous section, another problem makes itself pretty obvious as well. Constructors that do real work instantiating other things (especially global state) make classes impossible to test in isolation. Even if you don’t use global state and instead have classes all instantiate their collaborators, it’s still tough to test things in isolation.

Imagine that your code works like this: main tells a car to build itself, the car tells its engine and chassis to build themselves, the engine tells the alternator and transmission to build themselves, etc. Now, let’s say that you want to write a test of the car to see what happens when its engine starts out invalid. Just how will that be accomplished? You have no control over its engine — it controls that in its constructor. Even if it exposes engine via an accessor, too little, too late.

3. Casts Dependencies And Assumptions in Stone

If the constructor is busy, it’s likely that it’s talking to global state/and or instantiating things. If it takes arguments, it may be mutating those or paring them or coupling itself to them. But, whatever it’s doing, you have no control over it as a client. So, if you want this object at all, you’re getting it with whatever the constructor decides to foist on you.

Contrast this with any non-constructor method. Any non-constructor method you have the option to call or not, so you have the control. If you want a Car that isn’t started, you can simply not call Car.Start(). However, if the author of Car decides to start it in the constructor, you’re hosed. Not having a started car is not an option for you, buddy. Likewise, if Car instantiates a new SixCylinder() engine and you want a more fuel economical four cylinder, you’re SOL.

4. Performance Problems

Let’s say that I have some class, Critical, that I can’t live without in the application. I need to instantiate a Critical and access some property on it in response to a user request. Let’s say that Critical has a busy constructor that chugs and hums and does all manner of things, taking 5 or 10 seconds at times. Not all of this stuff has anything to do with the property that I want to access in response to the user request.

What are my choices here? Well, I don’t have any. I can like it or lump it. I need new Critical() and the user therefore needs to wait 5 or 10 seconds even though his request should take that number of microseconds. Sure, I can cache my Critical for next time or I can implement a flyweight, but that first time, I’m still needlessly forcing a wait on the user.

So, What Should a Constructor Look Like?

Given all of these arguments against bloated constructors, it should be fairly obvious what I think a constructor ought to look like. It ought to contain very little code. If the class has no dependencies, then it ought to contain no code (or not exist). After all, what are you going to do in the constructor that you can’t do in a field initializer? You’re not reaching into global state somewhere, are you?! Tsk tsk.

If a constructor has parameters, it ought to do nothing but set local fields according to those parameters, and perhaps perform some sort of validation on the parameters to ensure that its invariants can be satisfied with these values. That’s really it. The constructor’s only responsibility is to put the object into a valid state as far as its invariants are concerned, while doing the minimum amount of configuration possible. After all, you want objects where clients can explicitly tell them what to do. Even simple things like hooking to events should be triggered by a client method and not the constructor, ideally. It may seem a bit more cumbersome, but it’s also expressive and clear. Constructors hide side effects where well-named methods promote them from side effects to deliberate effects.

So What About that Constructor We Looked At?

How about this:

public ActiveProduct(Product product, StringVersion driver, Side es)
{
    if (product == null)
        throw new Exception("Can't create an active product from a null product");

    Side = es;
    CurrentDriver = driver;
    this._Product = product;
}

Wow, isn’t that like a ray of sunshine on an otherwise cloudy day? No Initialize() method hiding untold horrors. No nested, contradictory exception handling logic. No logging, no singletons, no static state — simple, minimal and easy to reason about. So, what to be done with all of these things that formerly resided in this constructor?

Well, a number of them were simply initializing properties or backing stores. This can be done with class initializers for sensible defaults and should otherwise be left to clients. After all, we’re here to serve them — not force settings on them. The logging makes no sense — do that somewhere else. If you’re writing a class and you want to know how long the constructor takes (which is a huge warning flag, by the way), write that code where you instantiate it — don’t force this on every conceivable client of your class.

The event wireup can be moved into some kind of “Hookup()” method to allow clients specifically to request this. You could also make it more granular if you like, allowing them to hook up ala carte instead of all at once. As for the singeltons and static state, leave that up to the instantiating client if it wants that done. We might as well leverage the thing that makes static state a design nightmare — the fact that anyone can do anything from anywhere — to move the stink away from us with some “Not in my backyard” attitude. If everyone does this, eventually the global state will be bubbled to some common location where singletons can be slain and static classes converted to instances.

Are You a Lone Crackpot, Erik?

As it turns out, no. Here is what others have to say on the subject:

Microsoft agrees with this take, citing performance:

Do minimal work in the constructor. Constructors should not do much work other than to capture the constructor parameters. The cost of any other processing should be delayed until required.

Misko Hevery agrees, citing problems finding seams for testing:

When your constructor has to instantiate and initialize its collaborators, the result tends to be an inflexible and prematurely coupled design. Such constructors shut off the ability to inject test collaborators when testing.

John Wigger cautions against putting too much code in a constructor:

In dealing with a more complicated OO design, it can be a mistake to put too much initialization logic into constructors. This is especially important for an OO design that uses inheritance significantly.

Gilad Bracha even goes so far as to hypothesize that constructors are harmful:

Constructors come with a host of special rules and regulations: they cannot be overridden like instance methods; they need to call another constructor, or a superclass constructor etc. Try defining mixins in the presence of constructors – it’s tricky because the interface for creating instances gets bundled with the interface of the instances themselves.