DaedTech

Stories about Software

Abstractions are Important

I was helping someone troubleshoot an issue today, digging through code, and I came across a double-take-inducing design decision. In the GUI, there was a concept of a feature, and each feature was being bound to something called FeatureGroup, which was a collection of features that, at run-time, only ever contained one feature. So, as a markup-writing client interested in displaying a single feature, I have to bind to the first feature in a group of features that has a size greater than zero and less than or equal to one. This is as opposed to binding to, well, a feature. I’m sure there is some explanation for this, but I don’t want to know what it is. Seriously. I’m not interested.

The reason that I’m not interested is neither frustration nor purism of any kind. I’m not interested because it doesn’t matter what the explanation is. No matter what it is, everyone who encounters this code is going to have the same reaction I did: “what the…?!? Why?!?”

At this point, people may react in various ways. More industrious people would write a new presentation layer abstraction and phase this one out. Others might seek out the original designer and ask for an explanation, listening skeptically and resigning themselves to reluctant clienthood. Still others might blindly mimic what’s going on in the surrounding area, programming by coincidence in the hopes of getting things right. But what nobody is going to do is say “yep, this makes sense, and it’s a good, solid base for building further understanding of the application.” And, since that’s the case (since this abstraction won’t make any sense even with some helpful prodding), I don’t want to hear about the design struggles, technology limitations, or whatever else led to this point. It’s only going to desensitize me to a bad abstraction and encourage me to further it later.
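For the industrious route, here is a rough sketch of what a replacement presentation layer abstraction might look like. Only the names Feature and FeatureGroup come from the code I was looking at; the Name and Features members and the FeatureViewModel class are hypothetical, made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the real feature type.
public class Feature
{
    public string Name { get; set; } = "";
}

// Hypothetical stand-in for the group that only ever holds one feature.
public class FeatureGroup
{
    public List<Feature> Features { get; } = new List<Feature>();
}

// The markup binds to this instead of to "the first feature in the group".
public class FeatureViewModel
{
    private readonly FeatureGroup _group;

    public FeatureViewModel(FeatureGroup group)
    {
        _group = group;
    }

    // Single() throws if the group ever holds anything other than exactly
    // one feature, so the odd invariant is checked in one place.
    public Feature Feature => _group.Features.Single();
}
```

With something like this in place, the markup binds to Feature, and the group becomes a detail that can be phased out later without touching the clients.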

Your code is only as good as the abstractions that define it. This is true whether your consumers are end-users, UI designers, or other developers. It doesn’t matter if you’ve come up with the most magical, awesome, efficient, or slick internal algorithm if you have a bad outward-facing set of abstractions, because people’s reactions will range from avoidance to annoyance, but not appreciation. I’ve touched on this before, tangentially. On the flip side, clients will tend to appreciate an intuitive API, regardless of what it does or doesn’t do under the hood.

My point here isn’t to encourage marketing or salesmanship of one’s own code to other developers, per se, but rather to talk about what makes code “good”. If you are a one-person development team or a hobbyist, this is all moot anyway, and you’re free to get your abstractions wrong until the cows come home, but if you’re not, good abstractions are important. As a developer, ask yourself which you’d rather use (this is not real code; I just made it up):

public interface GoodAbstractions
{
    void Add(Customer customerToAdd);

    void Delete(Customer customerToDelete);

    void Update(Customer customerToUpdate);

    IEnumerable<Customer> Find(Predicate<Customer> searchCriteria);
}

or

public interface BadAbstractions
{
    void Add(int customerId, string customerName);

    void Delete(int customerId);

    void Delete(Customer customer, string customerId, bool shouldValidate = false);

    void Update(Customer customer);

    void OpenDatabase(string connection);

    bool ShouldUseFileInsteadOfDatabase { get; set; }

    List<Customer> GetAllCustomers();

    IEnumerable<Customer> GetAllCustomersAsEnumerable();

    bool Connect();

    List GetAllDatabaseRecords(bool isSqlServer);

    List<Customer> GetSingleCustomer(int customerId);

    void Close(int handle, bool shouldClose);

    void Close(int handle, bool shouldClose, bool alt);
}

I don’t think there’s any question as to which you’d rather use. The second one is a mess — I can hear what you’re thinking:

  1. “Connect to what?”
  2. “What in the world is ‘alt’?!?”
  3. “Why do some mutators return nothing and others bool?”
  4. “Why does Close have a boolean to tell it whether or not you want to close — of course you do, or you wouldn’t call Close!”
  5. “Why are there two deletes that require substantially different information — is one better somehow?”
  6. “What does that thing about files do?”
  7. “Why does add want only some fields?”

Notice the core of the objections has to do with abstractions. Respectively:

  1. There is OpenDatabase() and Close(), but no bookend for Connect(), so it’s a complete mystery what this does and if you should use it.
  2. The second overload of Close adds a mysterious parameter, alt, that seems to indicate this overload is some kind of consolation prize method or something, meaning a possible temporal dependency.
  3. There appears to be some ad-hoc mixture of exception and error code error handling.
  4. Close wants a state flag — you need to keep track of this thing’s internal state for it (inappropriate intimacy).
  5. Does this interface want ad-hoc primitives or first class objects? It can’t seem to make up its mind what defines a Customer.
  6. The file stuff makes it seem like this class is a database access class retrofitted awkwardly for a corner case involving files, which is a completely different ballgame.
  7. The rest of the operations have at least one overload that deals with Customer, but Add doesn’t, indicating Add is somehow different from the other CRUD operations.

Also, in a broader sense, consider the mixture of layering concepts. This interface sometimes forces you to deal with the database (or file) directly and sometimes lets you deal with business objects. And, in some database operations, it maintains its own state and in some it asks for your help. If you use this API, there is no clear separation of your responsibilities from its responsibilities. You become codependent collaborators in a terrible relationship.

Contrast this with the first interface. The first interface is just basic CRUD operations, dealing only with a business object. There is no concept of a database (or any persistence) here. As a client of this, you know that you can request Customers and mutate them as you need. All other details (which primitives make up a customer, whether there is a file or a database, whether we’re connected to anything, whether anything is open, etc.) are hidden from you. In this API, the separation of responsibilities is extremely clear.
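To make that separation concrete, here is a minimal sketch of one way the clean interface could be implemented and consumed. Like the interfaces above, it is made up: the Customer fields, the InMemoryCustomers class, and the Billing.Rename helper are all hypothetical, and the generic type arguments on Find are my assumption about what a Customer-centric search would look like.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical business object; the fields are assumptions for illustration.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

// The clean interface: CRUD on Customers, nothing about persistence.
public interface GoodAbstractions
{
    void Add(Customer customerToAdd);
    void Delete(Customer customerToDelete);
    void Update(Customer customerToUpdate);
    IEnumerable<Customer> Find(Predicate<Customer> searchCriteria);
}

// One possible implementation, backed by a dictionary. A database- or
// file-backed class could take its place without any client changing.
public class InMemoryCustomers : GoodAbstractions
{
    private readonly Dictionary<int, Customer> _customers = new Dictionary<int, Customer>();

    public void Add(Customer customerToAdd) => _customers.Add(customerToAdd.Id, customerToAdd);

    public void Delete(Customer customerToDelete) => _customers.Remove(customerToDelete.Id);

    public void Update(Customer customerToUpdate) => _customers[customerToUpdate.Id] = customerToUpdate;

    // ToList() hands back a snapshot, so callers can update while iterating.
    public IEnumerable<Customer> Find(Predicate<Customer> searchCriteria) =>
        _customers.Values.Where(c => searchCriteria(c)).ToList();
}

// Client code: business logic only, no connections, handles, or flags.
public static class Billing
{
    public static void Rename(GoodAbstractions customers, int id, string newName)
    {
        foreach (var customer in customers.Find(c => c.Id == id))
        {
            customer.Name = newName;
            customers.Update(customer);
        }
    }
}
```

Notice that Billing.Rename never learns whether a database, a file, or a dictionary sits behind the interface. That is the whole point: whatever is wrong under the hood can be fixed or swapped without the client noticing.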

If confronted with both of these APIs, all things being equal, the choice is obvious. But, I submit that even if the clean API is an abstraction for buggy code and the second API for functional code, you’re still better off with the first one. Why? Simply because the stuff under the hood that’s hidden from you can (and with a clean API like this, probably will) be fixed. What can’t be fixed is the blurring of responsibilities between your code and the API’s, or the confusion that blurring causes at maintenance time. The clean API draws a line in the sand and says “business logic is your deal and persistence is mine.” The second API says, “let’s work closely together about everything from the details of database connections all the way up to business logic, and let’s be so close that nobody knows where I begin and you end.” That may be (creepily) romantic, but it’s not the basis of a healthy relationship.

To wit, the developers using the second API are going to get it wrong because it’s hard to get it right. Fixing bugs in it will turn into whack-a-mole because developers will find weird quirks and work-arounds and start to depend on them. When responsibilities are blurred by mixed, weird, or flat-out-wrong abstractions, problems in the code proliferate like infectious viruses. In short, the clean abstraction API has a natural tendency to improve, and the bad abstraction API has a natural tendency to degenerate.

So please, I beg you, consider your abstractions. Apply a “golden rule” and force onto others only abstractions you’d want forced on yourself. Put a little polish on the parts of your code that others are going to be using. Everyone will benefit from it.

A Metaphor for Software

I’ve recently been reading Code Complete by Steve McConnell (review of the book to follow when I’ve finished). In one of the early chapters of the book, McConnell discusses software metaphors and their importance. I won’t get into too much detail here, and I’m operating off the top of my head from what I read a couple of weeks ago, but he lists various metaphors from least useful/appropriate to most useful/appropriate. As I recall, the least useful was “writing” code, in the sense that one sits down and composes a letter, and the most useful was that of building construction.

That is to say, a good metaphor for the process of software engineering is the process of, well, building engineering. There is a good amount of up-front planning (since this was written in 2004, before agile processes gained the following that they have today, this seems like a reasonable thing to write), a sense of architecture and framing, scheduling issues representable by Gantt charts, a quality process, etc. I think that this is certainly a good metaphor. In this chapter, he also encourages the reader to invent his own metaphors as this ongoing collective exercise will be useful for describing the software engineering process in the future.

I’ve been letting that percolate in my head the last couple of weeks, thinking about the rise of the aforementioned agile processes between the writing of the book and now, as well as the nature of software itself. What I’ve arrived at for my own metaphor is a work in progress, so please bear with me and feel free to contribute your own thoughts, as I would enjoy working through this as an exercise to improve the way I think about development.

The metaphor that I’m considering is one in which the software is an amorphous blob of children in a school system, the software engineers are teachers, the software architect is, perhaps, a principal, the project managers are deans or the equivalent, and marketing people are the political school board types that set the curriculum. The customer stakeholders in this scenario are the parents of the children, let’s say, and to a lesser extent, society at large.

The basic idea here is that software is a student of sorts. When we develop, we start with software that does (knows, in the metaphor parlance) nothing. We then take a set of requirements set by some mixture of marketing people and customers (school administrators and parents, driving the administrators politically) and set about ‘teaching’ the software to do what we want it to. At first, the software (student) is lovably inept, but as we teach it how to satisfy more and more of the requirements, it begins to resemble the end product that we desire (a student or group of students ready to function in society as productive adults, able to perform the tasks that we have taught).

I like this metaphor for a few reasons. First, it seems to describe agile methodology better. The curriculum/requirements may constantly be refined and changed throughout the education/development process. Second, we really do teach our applications to do things, much more so than we build them to exist. With the construction metaphor, our code-house doesn’t do anything; it just exists, and it exists more as we build it. With the education metaphor, our code gets better at performing the tasks that it is slated to perform. Third, I think this naturally captures the politics of the stakeholders in the software engineering process much better than the construction metaphor, whose only politics exist between the primary agent of construction and the purchaser (the software company and its customer, respectively). I suppose you could laboriously describe a scenario in which the construction company has a sales office that promises things to the purchaser of the building and the sub-contractors balk, but that’s fairly limited. In the case of the education system, overbearing parents or administrators with agendas frequently clash with the teachers over the best way to educate the students, that is, over what the software should know/do in the end.

So, what about Agile vs Waterfall? Well, admittedly, that gets a bit more abstract, but the construction metaphor is always, almost by definition, waterfall. Not very many houses start as fully functional, small studio apartments and grow upwards, downwards and outwards from there. But children being educated start with a core, accurate set of lessons that expand as time goes by. Throughout the course of this process, they behave like increasingly sophisticated mini-adults.

As I write this, it occurs to me that perhaps the iconic construction metaphor is well-suited for waterfall shops and this one for agile ones. The education metaphor becomes a bit labored with waterfall, though certainly not untenable. I suppose a waterfall education would be like a school board saying, “all right, all children entering kindergarten now have their curriculum set through 2024, so nothing else will be added between now and then, though we will leave an hour per week so that they may learn about whatever gets invented between now and then.” (Hey, learning about things going on now is more of a college kind of thing anyway.) But then, perhaps the metaphor is appropriate, as I think it’s pretty well agreed upon that a pure waterfall approach has a natural tendency to result in software that is obsolete the day it leaves the shop. Either that, or the waterfall nature is not strictly practiced: the school system relents and decides that perhaps current events are reasonable for the curriculum, so long as they’re particularly important. The breakup of the USSR makes the cut, but the latest correctional struggles of Lindsay Lohan do not.

I will come back to expand on this post in the future as I refine the metaphor, and am happy to discuss the issue with anyone who may have comments.