DaedTech

Stories about Software

By

Constructor Overloads: Know When to Say When

Paralysis By Options

Do you ever find yourself in a situation where some API or another requires you to instantiate an object? (If you’re reading this blog, the answer is probably “yes”). What do you usually do at this point? Instantiate it, compile, and make sure you’re good before poking around to see what your new object has to offer, usually in the form of auto-complete/intellisense? I think that’s what most would do. Word DOC APIs and other such things are all well and good as a backup plan, but let’s get serious – you want to play with the object and read the instructions only if you can’t figure out what to do. And the last thing you want to do is go reading the code of that class or, worse still, hunt down the guy that wrote it.

But, what about those times that the instantiation gets a little sidetracked? You go to instantiate the object and it’s like wandering into a Baskin Robbins knowing only that you vaguely feel like ice cream. So many flavors to choose from, but which is the right one?

In the picture above, I’ve decided I want an Aquarium object, and Intellisense informs me that there are no less than 11 ways that I can make this happen. That’s right, 11. My immediate, gut reaction to this information is to go off to implement the “AdoptADog” method instead and put this nonsense off until later.

But Aren’t More Choices Better?

With constructors, no, not really. I’ve talked before about the problem with bloated constructors and my opinion that a constructor should do nothing but ensure that the object initializes with class level variants established. With that in mind, either some of these overloads are doing more than is necessary or else some of them fail to meet this basic criteria. The former is pointless speculative coding and the latter means that your objects can be instantiated in states that are not valid. Either one of these is a problem.

I believe there is a tendency, especially if you don’t practice TDD or even write unit tests at all, to go off on tangents about how developers may want to instantiate objects. Maybe developer X will want to instantiate an aquarium with all defaults whereas developer Y will want to specify how many gallons it holds and how many fish are in it. Maybe developer Z just wants to initialize with the kind of rocks that go in the bottom or the kind of light that shines on top. Maybe everyone wants to initialize specifying salt or fresh water. Let’s think of every combination of things anyone may want to do to this object and offer them all up as constructor overloads, right?

But you know what? That’s what the public API is for with accessors and mutators. Everyone can do it that way. Save the constructor for things without which the aquarium makes no sense (e.g. capacity) and let everyone call a property setter or a mutator for the rest. C# even has some syntactic sugar for just this occasion.

If you add in a bunch of overloads, you may think that you’re being helpful, but you’re really just muddying the waters and paralyzing your clients with options. I may want to instantiate an aquarium and use it to hold a bunch of dirt from my back yard — so why I am I being offered all of these options about fish and water and aquarium plants and plastic divers? I don’t care about any of that. But, I’ll hesitate to omit it because for all I know I should instantiate the object with those things. I mean, with all of those overloads, some are probably vestigial or at least less frequently used. I don’t want to use something that might be deprecated or untested and nobody wants to maintain a bunch of methods that may never even be used.

In the end, what I’ll wind up doing is digging out the word document that describes this thing or going to the developer who wrote it and asking which one to use. And that sucks. If you offer me only one option — the minimal constructor that establishes the invariants and forces any critical dependencies on the client — I’ll use that option and go on my merry way. There will be nothing to think about and certainly nothing to read word documents or send emails about. And that is the essence of providing usable code and good abstractions.

(And incidentally, since Visual Studio 2010, C# has really taken away any good excuse for a lot of overloads with optional/default parameters).

By

TDD Prevents Copy and Paste Programming

Copy, Paste, Fail, Repeat

Today, I was reviewing some code, and I saw a “before” that looked like this:

private void RefreshClassifierValues(Array NewValues)
{
    List ClassifiersList = Classifiers;
    int i = 0;
    foreach (Classifier classifier in ClassifiersList)
    {
        classifier.Value = NewValues.GetValue(i++);
    }
}

The first thing that stuck out to me was that I really wanted to stick “var” in there a couple of times because I read the word “classifier” so many time it lost all meaning. The second thing that struck me was the after:

private void RefreshClassifierValues(Array NewValues)
{
    List ClassifiersList = Classifiers;
    int i = 0;
    foreach (Classifier classifier in ClassifiersList)
    {
        classifier.Value = NewValues.GetValue(i++);
    }
}

private void RefreshClassifierEnabledStatus(bool value)
{
    List ClassifiersList = Classifiers;
    int i = 0;
    foreach (Classifier classifier in ClassifiersList)
    {
        classifier.Enabled = value;
    }
}

If you’d like to play a game of “find the compiler warning”, by all means, go ahead, but I warn you — the next sentence is a spoiler. In the new code, “i” is never read anywhere, which will generate a warning about unused variables. Annoying, but not the end of the world, and a fairly quick fix.

But, how did the code get like this? I’ve isolated the change in a way that should make this fairly obvious if you think about it for a minute or two. Basically, someone took the first method, copy and pasted it, and modified the payload of the foreach to suit his or her needs. Of course, the new need didn’t involve reading the local index variable, so oops.

I talked with the developer whose code change it was, doing my due diligence in pointing out that this is a classic example of one of the many problems with copy/paste programming. I tried to think back to a time when I’d been burned by this sort of thing recently, when it dawned on me that this simply wouldn’t happen to me. I say that not because I’m some kind of purist that disables ctrl-V and never pastes any code anywhere (though I do keep this to a pretty strict minimum), but rather because this is simply a non-starter for someone who practices TDD.

TDD Is Full of Win

If I were going to write the method RefreshClassifierStatus, the first test that I would write is that Classifier[0].Enabled was false when I passed in false. That would go red, and then I’d write some obtuse code that set Classifier[0].Enabled to false. Next, I’d write a test that said it was true when I passed in true and, after failing, I’d set that property to the passed in value. Finally, I’d write a test that Classifier[Classifier.Count – 1].Enabled was true when true was passed in and suddenly I’d have a foreach loop executing properly.

As an aside, at this point, I’d set the list to null and write a test for how the method should behave when that happens. This seems not to have been considered in this code, which is another problem with copy/paste programming — it’s a multiplier for existing mistakes.

So where in this, exactly, do I introduce an unused variable? That’s never going to help me get any test to go green. Where in this do I ever copy and paste the method wholesale? That’s not the simplest thing that I can do. So, with these two guiding heuristics, followed rather strictly, it’s simply impossible for me to introduce needless noise into the code. TDD forces an impressive level of efficiency most of the time.

TDD is Full of Fast

Had I written this method, I would have done it in probably 5 minutes or so (and that’s a fairly pessimistic estimate, I think), generating 4 unit tests for my trouble. The person that wrote this, I’m guessing, spent 30 seconds or less doing it. So, I’m behind by 4.5 minutes. But, let’s add to that the time that this developer will spend after seeing the code review firing back up the development environment, navigating to the code, removing the line of code, re-compiling, and re-running to make sure that it’s still okay. I’m now only maybe 2 minutes behind.

Now, let’s also factor in that I just realized that problem about the null check, which I’m going to email over and suggest be fixed in the new and old methods. Now there are two test cases instead of 1, and I’m suddenly ahead in terms of development time.

Okay, so the time here is a bit contrived and it’s not as though mistakes can’t be made with TDD, but I can say that a lot fewer of them are. The point here is that copy/paste programming is supposed to be super fast. It’s what you do when there’s no time for good software development practice because you need to hurry. And yet, it’s really not that fast in the grand scheme of things. It just makes you busier and wrong more efficiently than being wrong by hand.

So, save yourself time, effort, and the headache of getting a reputation for sloppy work. Slow down and do it right. TDD isn’t required for this, but it sure helps.

By

Beware the Bloated Constructor

What’s a Bloated Constructor?

Yesterday, I was going through the version history of a file in some code base (from earliest to most recent) and I saw the following:

public ActiveProduct(Product product, StringVersion driver, Side es)
{
    Side = es;

    try
    {
        if (product == null)
            throw new Exception("Can't create an active product from a null product");

        Logger.LogTimeStudyStart("ActiveProduct create");

        if (driver.IsEmpty())
            CurrentDriver = product.Driver;
        else
            CurrentDriver = driver;

        _Device = Session.Instance.DeviceManager.CreateHI(es, CurrentDriver.ToString());
        _Device.ConnectionStatus += Device_ConnectionStatus;
        _Device.BatteryStatus += Device_BatteryStatus;
    }
    catch (Exception ex)
    {
        ExceptionHandler.HandleException(ex);
        throw ex;
    }
    finally
    {
        Logger.LogTimeStudyEnd("ActiveProduct create", "ActiveProduct: Creating " + driver);
    }

    State = ActiveProductState.Simulated;
    Environments = new SortedList();
    this._Product = product;
    _Options = product.AvailableOptions;
    AvailablePrograms = new ProgramSlots(this) { HIStartIndex = 0 };
    AccessoryPrograms = new ProgramSlots(this) { CanRemoveOverride = true, StartIndex = 4, IsAutoShifting = false };
    VirtualPrograms = new ProgramSlots(this) { IsVirtual = true, CanRemoveOverride = true, SlotConfig = Limit.None };
    SlotCalculator = new ProgramCalculator(this, false);
    VirtualCalculator = new ProgramCalculator(this, true);
    DFS = new DFSCalibration(this);
    Accessories = new Accessories();
    PowerBands = PowerBands.Unknown;
}

This made me sad and tired but no worries, I figured, there were a lot more revisions I hadn’t yet seen. Surely somebody did something about this later. And then, happiness:

public ActiveProduct(Product product, StringVersion driver, Side es)
{
    Initialize(product, driver, es);
}

Sweet! That’s much easier on the eyes. Let’s see just see what Initialize does and call it a day:

private void Initialize(Product product, StringVersion driver, Side es)
{
    Side = es;

    try
    {
        if (product == null)
            throw new Exception("Can't create an active product from a null product");

        Logger.LogTimeStudyStart("ActiveProduct create");

        if (driver.IsEmpty())
            CurrentDriver = product.Driver;
        else
            CurrentDriver = driver;

        _Device = Session.Instance.DeviceManager.CreateDevice(es, CurrentDriver.ToString());
        _Device.ConnectionStatus += Device_ConnectionStatus;
        _Device.BatteryStatus += Device_BatteryStatus;
    }
    catch (Exception ex)
    {
        ExceptionHandler.HandleException(ex);
        throw ex;
    }
    finally
    {
        Logger.LogTimeStudyEnd("ActiveProduct create", "ActiveProduct: Creating " + driver);
    }

    State = ActiveProductState.Simulated;
    Environments = new SortedList();
    this._Product = product;
    _Options = product.AvailableOptions;
    AvailablePrograms = new ProgramSlots(this) { DeviceStartIndex = 0 };
    AccessoryPrograms = new ProgramSlots(this) { CanRemoveOverride = true, HIStartIndex = 4, IsAutoShifting = false };
    VirtualPrograms = new ProgramSlots(this) { IsVirtual = true, CanRemoveOverride = true, SlotConfig = Limit.None };
    SlotCalculator = new ProgramCalculator(this, false);
    VirtualCalculator = new ProgramCalculator(this, true);
    DFS = new DFSCalibration(this);
    Accessories = new Accessories();
    PowerBands = PowerBands.Unknown;
}

Oh, the humanity… looks like someone sprayed a bit of cologne here and nothing more.

So, what’s the problem?

I contend that this is a fundamental design failure. Forget throwing an exception from within a try block as some kind of goto. Forget all of the static stuff like the logging and exception handling and general global state. Forget the singletons. Forget all of the other ancillary problems. The design failure is how complicated (convoluted) this constructor is. I see this enough that I think some people would call it a development pattern (I don’t think any would claim it rises to the level of “Design Pattern”), but I firmly believe that it is an anti-pattern.

This style of code is the product of procedural, command and control thinking that I blogged about some time back. It makes no distinction between creation/initialization of an object and operations of the object. These distinct activities are blurred as the bloated constructor generates logger messages, talks to singletons, instantiates other objects, etc.

From the perspective of the procedural programmer, the constructor should take the role of making sure the object wants for nothing, having all of its creature comforts and having its every desire and whim satisfied. In the object oriented style, where dependencies tend to be inverted, the constructor has a different and more Spartan role. Its only job is to make sure that object initializes into a state where it satisfies its basic invariants (in other words, it makes sure that the object instance starts off in a valid state and nothing more).

Let’s consider why the latter, object-oriented approach tends to be a better fit in object oriented languages (and, I would argue, in any state-based paradigm, but I’ll focus on OOP here).

1. Violating Developer Expectations

Imagine that you’re newly assigned to a code base and you want to start understanding how it works. An excellent way to do this is to write a few unit tests (perhaps to keep or perhaps just for throw-away experimentation). You pick some class that you’d like to test and figure you’ll experiment with its API before digging into its code. So, you create a new instance of that class and…

Oops. Your continuous testing tool crashes. Huh. Guess you’ll have to look in the code now (this is never a promising sign — if you have to inspect a class’s code to understand why it’s crashing, it’s probably not a good abstraction, especially if this is true of the constructor.) So, you look in the code and see that there’s a reference to a lazy-loading singleton that lazy-loads a few other singletons and one of those singletons is trying to read files off of a network somewhere and….

Alright, forget it. Let’s move onto another class. You instantiate that class and your testing tool doesn’t crash, but it does turn red. Huh. Back into the code. Ah, this time the class under test doesn’t touch any static stuff or singletons, but it instantiates another class that does. The exception handling is a little better here, so you just turn the testing tool red rather than completely crashing it, but this class is hosed for testing as well.

As a new developer, you’re instantiating classes and expecting that they’ll be instantiated. What’s happening instead is weird, bad, unpredictable things that force you to open the source code to figure out what on Earth is going on. You’re badly betraying a trust other developers put in you to do what you advertise (i.e. putting together the minimum invariants of the class).

2. Testability Nightmare

And, looking back at the narrative from the previous section, another problem makes itself pretty obvious as well. Constructors that do real work instantiating other things (especially global state) make classes impossible to test in isolation. Even if you don’t use global state and instead have classes all instantiate their collaborators, it’s still tough to test things in isolation.

Imagine that your code works like this: main tells a car to build itself, the car tells its engine and chassis to build themselves, the engine tells the alternator and transmission to build themselves, etc. Now, let’s say that you want to write a test of the car to see what happens when its engine starts out invalid. Just how will that be accomplished? You have no control over its engine — it controls that in its constructor. Even if it exposes engine via an accessor, too little, too late.

3. Casts Dependencies And Assumptions in Stone

If the constructor is busy, it’s likely that it’s talking to global state/and or instantiating things. If it takes arguments, it may be mutating those or paring them or coupling itself to them. But, whatever it’s doing, you have no control over it as a client. So, if you want this object at all, you’re getting it with whatever the constructor decides to foist on you.

Contrast this with any non-constructor method. Any non-constructor method you have the option to call or not, so you have the control. If you want a Car that isn’t started, you can simply not call Car.Start(). However, if the author of Car decides to start it in the constructor, you’re hosed. Not having a started car is not an option for you, buddy. Likewise, if Car instantiates a new SixCylinder() engine and you want a more fuel economical four cylinder, you’re SOL.

4. Performance Problems

Let’s say that I have some class, Critical, that I can’t live without in the application. I need to instantiate a Critical and access some property on it in response to a user request. Let’s say that Critical has a busy constructor that chugs and hums and does all manner of things, taking 5 or 10 seconds at times. Not all of this stuff has anything to do with the property that I want to access in response to the user request.

What are my choices here? Well, I don’t have any. I can like it or lump it. I need new Critical() and the user therefore needs to wait 5 or 10 seconds even though his request should take that number of microseconds. Sure, I can cache my Critical for next time or I can implement a flyweight, but that first time, I’m still needlessly forcing a wait on the user.

So, What Should a Constructor Look Like?

Given all of these arguments against bloated constructors, it should be fairly obvious what I think a constructor ought to look like. It ought to contain very little code. If the class has no dependencies, then it ought to contain no code (or not exist). After all, what are you going to do in the constructor that you can’t do in a field initializer? You’re not reaching into global state somewhere, are you?! Tsk tsk.

If a constructor has parameters, it ought to do nothing but set local fields according to those parameters, and perhaps perform some sort of validation on the parameters to ensure that its invariants can be satisfied with these values. That’s really it. The constructor’s only responsibility is to put the object into a valid state as far as its invariants are concerned, while doing the minimum amount of configuration possible. After all, you want objects where clients can explicitly tell them what to do. Even simple things like hooking to events should be triggered by a client method and not the constructor, ideally. It may seem a bit more cumbersome, but it’s also expressive and clear. Constructors hide side effects where well-named methods promote them from side effects to deliberate effects.

So What About that Constructor We Looked At?

How about this:

public ActiveProduct(Product product, StringVersion driver, Side es)
{
    if (product == null)
        throw new Exception("Can't create an active product from a null product");

    Side = es;
    CurrentDriver = driver;
    this._Product = product;
}

Wow, isn’t that like a ray of sunshine on an otherwise cloudy day? No Initialize() method hiding untold horrors. No nested, contradictory exception handling logic. No logging, no singletons, no static state — simple, minimal and easy to reason about. So, what to be done with all of these things that formerly resided in this constructor?

Well, a number of them were simply initializing properties or backing stores. This can be done with class initializers for sensible defaults and should otherwise be left to clients. After all, we’re here to serve them — not force settings on them. The logging makes no sense — do that somewhere else. If you’re writing a class and you want to know how long the constructor takes (which is a huge warning flag, by the way), write that code where you instantiate it — don’t force this on every conceivable client of your class.

The event wireup can be moved into some kind of “Hookup()” method to allow clients specifically to request this. You could also make it more granular if you like, allowing them to hook up ala carte instead of all at once. As for the singeltons and static state, leave that up to the instantiating client if it wants that done. We might as well leverage the thing that makes static state a design nightmare — the fact that anyone can do anything from anywhere — to move the stink away from us with some “Not in my backyard” attitude. If everyone does this, eventually the global state will be bubbled to some common location where singletons can be slain and static classes converted to instances.

Are You a Lone Crackpot, Erik?

As it turns out, no. Here is what others have to say on the subject:

Microsoft agrees with this take, citing performance:

Do minimal work in the constructor. Constructors should not do much work other than to capture the constructor parameters. The cost of any other processing should be delayed until required.

Misko Hevery agrees, citing problems finding seams for testing:

When your constructor has to instantiate and initialize its collaborators, the result tends to be an inflexible and prematurely coupled design. Such constructors shut off the ability to inject test collaborators when testing.

John Wigger cautions against putting too much code in a constructor:

In dealing with a more complicated OO design, it can be a mistake to put too much initialization logic into constructors. This is especially important for an OO design that uses inheritance significantly.

Gilad Bracha even goes so far as to hypothesize that constructors are harmful:

Constructors come with a host of special rules and regulations: they cannot be overridden like instance methods; they need to call another constructor, or a superclass constructor etc. Try defining mixins in the presence of constructors – it’s tricky because the interface for creating instances gets bundled with the interface of the instances themselves.

By

I Love Debugger

Learning to Love Bugs

My apologies for the titular pun in reference to “I Love Big Brother” of iconic, Orwellian fame, but I couldn’t resist. The other day, I was chatting with some people about the idea of factoring large methods into smaller, more focused ones and one of the people chimed in with an objection that was genuinely new to me.

Specifically, the objection was that giant methods tended to be preferable because it kept the stack trace flat and made it easier to have everything “all in one place” when you were (inevitably) going through the code in the debugger. My first, fleeting thought was to wonder if people really found it that difficult to ctrl-tab between classes, but I quickly realized that this was hardly the important problem here (and really, to each his or her own). The bigger problem, as I explained a moment later, but have thought through in a bit more detail for a blog post now, is that you’re writing code more likely to generate defects so that when you’re tasked with fixing those defects, you feel more comfortable.

This is like a general housing contractor saying, “I prefer to use sand as a building material over wood or brick for houses I build because it’s much easier to work with in the morning after the tide destroys the house each night.”

Winston realized that two equals four and that the only way to prevent bugs is to cause them. Wilson happily declared, “I love Debugger!”.

More Bugs? Prove It!

So, if you’re a connoisseur of strict logic in debating, you’ll notice that I’ve begged the question here with my objection. That is, I ‘proved’ the reasoning fallacious by assuming that larger methods means more bugs and then used that ‘proof’ as evidence that larger methods should be avoided. Well, fear not. A group of researches from Standford did an empirical analysis of OS bugs, and found:

Figure 5 shows that as functions grow bigger, error rates increase for most checkers. For the Null checker, the largest quartile of functions had an average error rate almost twice as high as the smallest quartile, and for the Block checker the error rate wEis about six times higher for larger functions. Function size is often used as a measure of code complexity, so these results confirm our intuition that more complex code is more error-prone.

Some of our most memorable experiences examining error reports were in large, highly complex functions with contorted control flow. The higher error rate for large functions makes a case for decomposition into smaller, more understandable functions.

This finding is not unique, though it nicely captures the issue. During my time in graduate school in a class on advanced topics in software engineering, we did a unit on the relationship between various coding practices and likelihood of bugs. A consistent theme is that as function size grows, number of defects per line of code grows (in other words, the number of defects per function grows faster than the number of lines per function).

So, What Now?

In the end, my response is quite simply this: get used to a more factored and distributed paradigm. Don’t worry about being lost in files and stack traces in the debugger. Why not? Well, because if you follow Uncle Bob Martin’s advice about factoring methods to be 4 or 5 lines, you wind up with methods that descriptively tell you what they’re going to do and do it perfectly. In other words, you don’t need to step into them because they’re too simple and concise for things to go wrong.

In this fashion, your debugging becomes different. You don’t have a pen and paper, a spreadsheet, a stack trace window, and a row after row of “immediates” all to keep track of what on Earth is going on. You set a breakpoint somewhere, and any method calls are innocent until proven guilty. You step over everything until something fishy happens (or until you become a client of some lumbering beast of a method that someone else wrote, which is virtually assured of having defects). This approach is almost universally rejected at first but infectious with time. I know that, as a “no bigger than the screen” guy originally, my initial reaction to the idea of all methods being 4 or 5 lines was “that’s stupid”. But try it sometime and you won’t go back.

Bye, Bye Debugger!

If you combine small factored methods and unit tests (which tend to have a natural synergy), you will find that your debugger skills begin to atrophy. Rather than reasoning about the code at runtime, you reason about it at compile time. And, that’s a powerful and important concept.

Reasoning about code at run time is programming by coincidence, as made famous by one of my favorite programming books. I mean, think about it — if you need the debugger to understand what the state of the code is and what’s going on, what you’re really saying when you build and run is, “I have no idea what this code is going to do by inspecting it, so I need to run the entire application to understand it.” You don’t understand your own code while you’re writing it. That’s a problem!

When you write small, factored methods and generally tested and decoupled code, you don’t have this problem. Take this to its logical conclusion and imagine a method that takes two int parameters and returns an int representing their sum. Do you need to set breakpoints and watches, tag immediate variables and look at a stack trace to know what this method will do? Of course not! You can reason about this method at compile time and take for granted that it will do the right thing at run time. When you write code like this, you’re telling the application how to behave. When you find yourself immersed in the debugger for three quarters of your day, you’re not dictating how the application will behave. Instead, you’re begging it to work as a kind of prayer since it’s pretty much out of your hands what’s going to happen. Don’t buy it? How many times have you been at your desk with a deadline looming saying “please, just work — why won’t you just work!?!”

This isn’t to say that I never use the debugger. But, with a combination of TDD, a continuous testing tool, and small, factored methods, it’s fairly rare. Generally, I just use it when my stuff is integrated with code not done this way. For my own stuff, if I ever do use it, it’s from the entry point of a unit test and not the application.

The cleaner the code that I write, the more my debugger skills atrophy. I watch in amazement at peers that are incredible with the debugger — and I say that with no irony. Some of them can get it to do things I didn’t realize were possible and that I freely admit are very cool. I don’t know how to do these things because I’m out of practice. But, I consider that good. If you’re getting a lot of practice de-bug-ing your code, it means you’re getting a lot of practice writing code with bugs in it.

So, let’s keep those methods small and get out of the practice of generating bugs.

(By the way, I’m going to be traveling overseas for the next couple of weeks, so this may be my last post for a while).

By

Abstractions are Important

I was helping someone troubleshoot an issue today, digging through code, and I came across a double-take-inducing design decision. In the GUI, there was a concept of feature, and each feature was being bound to something called FeatureGroup which was a collection of features that, at run-time, only ever contained one feature. So, as a markup-writing client interested in displaying a single feature, I have to bind to the first feature in a group of features that has a size greater than zero and less than or equal to one. This is as opposed to binding to, well, a feature. I’m sure there is some explanation for this, but I don’t want to know what it is. Seriously. I’m not interested.

The reason that I’m not interested is neither frustration, nor is it purism of any kind. I’m not interested because it doesn’t matter what the explanation is. No matter what it is, the reaction by anyone who stumbles across it later is going to be the same:

Everyone who encounters this code is going to have the same reaction I did: “what the…?!? Why?!?” At this point, people may react in various ways. More industrious people would write a new presentation layer abstraction and phase this one out. Others might seek out the original designer and ask an explanation, listening skeptically and resigning themselves to reluctant clienthood. Still others might blindly mimic what’s going on in the surrounding area, programming by coincidence in the hopes of getting things right. But what nobody is going to do is say “yep, this makes sense, and it’s a good, solid base for building further understanding of the application.” And, since that’s the case — since this abstraction won’t make any sense even with some helpful prodding — I don’t want to hear about the design struggles, technology limitations, or whatever else led to this point. It’s only going to desensitize me to a bad abstraction and encourage me to further it later.

Your code is only as good as the abstractions that define it. This is true whether your consumers are end-users, UI designers, or other developers. It doesn’t matter if you’ve come up with the most magical, awesome, efficient or slick internal algorithm if you have a bad outward-facing set of abstractions because people’s reactions will range from avoidance to annoyance, but not appreciation. I’ve touched on this before, tangentially. On the flip side, clients will tend to appreciate an intuitive API, regardless of what it does or doesn’t do under the hood.

My point here isn’t to encourage marketing or salesmanship of one’s own code to other developers, per se, but rather to talk about what makes code “good”. If you are a one-person development team or a hobbyist, this is all moot anyway, and you’re free to get your abstractions wrong until the cows come home, but if you’re not, good abstractions are important. As a developer, ask yourself which you’d rather use (these are not real code, I just made them up):

public interface GoodAbstractions
{
    public void Add(Customer customerToAdd);

    public void Delete(Customer customerToDelete);

    public void Update(Customer customerToUpdate);

    public IEnumerable Find(Predicate searchCriteria);
}

]

or

public interface BadAbstractions
{
    public void Add(int customerId, string customerName);

    public void Delete(int customerId);

    public void Delete(Customer customer, string customerId, bool shouldValidate = false);

    public void Update(Customer customer);

    public void OpenDatabase(string connection);

    public bool ShouldUseFileInsteadOfDatabase { get; set; }

    public List GetAllCustomers();

    public IEnumerable GetAllCustomersAsEnumerable();

    public bool Connect();

    public List GetAllDatabaseRecords(bool isSqlServer);

    public List GetSingleCustomer(int customerId);

    public void Close(int handle, bool shouldClose);

    public void Close(int handle, bool shouldClose, bool alt);
}

I don’t think there’s any question as to which you’d rather use. The second one is a mess — I can hear what you’re thinking:

  1. “Connect to what?”
  2. “What in the world is ‘alt’?!?”
  3. “Why do some mutators return nothing and others bool?”
  4. “Why does Close have a boolean to tell it whether or not you want to close — of course you do, or you wouldn’t call Close!”
  5. “Why are there two deletes that require substantially different information — is one better somehow?”
  6. “What does that thing about files do?”
  7. “Why does add want only some fields?”

Notice the core of the objections has to do with abstractions. Respectively:

  1. There is Open() and Close(), but no bookend for Connect(), so it’s a complete mystery what this does and if you should use it.
  2. The second overload of alt adds a mysterious parameter that seems to indicate this overload is some kind of consolation prize method or something, meaning a possible temporal dependency.
  3. There appears to be some ad-hoc mixture of exception and error code error handling.
  4. Close wants a state flag — you need to keep track of this thing’s internal state for it (inappropriate intimacy).
  5. Does this interface want ad-hoc primitives or first class objects? It can’t seem to make up its mind what defines a Customer.
  6. The file stuff makes it seem like this class is a database access class retrofitted awkwardly for a corner case involving files, which is a completely different ballgame.
  7. The rest of the operations have at least one overload that deals with Customer, but Add doesn’t, indicating Add is somehow different than the other CRUD operations

Also, in a broader sense, consider the mixture of layering concepts. This interface sometimes forces you to deal with the database (or file) directly and sometimes lets you deal with business objects. And, in some database operations, it maintains its own state and in some it asks for your help. If you use this API, there is no clear separation of your responsibilities from its responsibilities. You become codependent collaborators in a terrible relationships.

Contrast this with the first interface. The first interface is just basic CRUD operations, dealing only with a business object. There is no concept of database (or any persistence here). As a client of this, you know that you can request Customers and mutate them as you need. All other details (which primitives make up a customer, whether there is a file or a database, whether we’re connected to anything, whether anything is open, etc) are hidden from us. In this API, the separation of responsibilities is extremely clear.

If confronted with both of these API, all things being equal, the choice is obvious. But, I submit that even if the clean API is an abstraction for buggy code and the second API for functional code, you’re still better off with the first one. Why? Simply because the stuff under the hood that’s hidden from you can (and with a clean API like this, probably will) be fixed. What can’t be fixed is the blurring of responsibilities between your code and the confusion at maintenance time. The clean API draws a line in the sand and says “business logic is your deal and persistence is mine.” The second API says, “let’s work closely together about everything from the details of database connections all the way up to business logic, and let’s be so close that nobody knows where I begin and you end.” That may be (creepily) romantic, but it’s not the basis of a healthy relationship.

To wit, the developers using the second API are going to get it wrong because it’s hard to get it right. Fixing bugs in it will turn into whack-a-mole because developers will find weird quirks and work-arounds and start to depend on them. When responsibilities are blurred by mixed, weird, or flat-out-wrong abstractions, problems in the code proliferate like infectious viruses. In short, the clean abstraction API has a natural tendency to improve, and the bad abstraction API has a natural tendency to degenerate.

So please, I beg you, consider your abstractions. Apply a “golden rule” and force onto others only abstractions you’d want forced on yourself. Put a little polish on the parts of your code that others are going to be using. Everyone will benefit from it.