DaedTech

Stories about Software


Abstractions are Important

I was helping someone troubleshoot an issue today, digging through code, and I came across a double-take-inducing design decision. In the GUI, there was the concept of a feature, and each feature was bound to something called FeatureGroup, which was a collection of features that, at run-time, only ever contained one feature. So, as a markup-writing client interested in displaying a single feature, I had to bind to the first feature in a group of features whose size was greater than zero and less than or equal to one. This is as opposed to binding to, well, a feature. I’m sure there is some explanation for this, but I don’t want to know what it is. Seriously. I’m not interested.
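
To make the awkwardness concrete, here is a minimal sketch (all names invented for illustration; the real code differed) of what the client is forced to do:

    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical reconstruction -- names invented for illustration.
    public class Feature
    {
        public string Name { get; set; }
    }

    public class FeatureGroup
    {
        // At run-time, this collection only ever holds one feature.
        public List<Feature> Features { get; } = new List<Feature>();
    }

    public class FeatureViewModel
    {
        public FeatureGroup Group { get; set; }

        // What the markup has to bind to: the first element of a
        // one-element collection...
        public Feature BoundFeature
        {
            get { return Group.Features.First(); }
        }

        // ...instead of the view model simply exposing a Feature directly.
    }
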

The reason that I’m not interested is neither frustration, nor is it purism of any kind. I’m not interested because it doesn’t matter what the explanation is. No matter what it is, the reaction by anyone who stumbles across it later is going to be the same:

Everyone who encounters this code is going to have the same reaction I did: “what the…?!? Why?!?” At this point, people may react in various ways. More industrious people would write a new presentation layer abstraction and phase this one out. Others might seek out the original designer and ask for an explanation, listening skeptically and resigning themselves to reluctant clienthood. Still others might blindly mimic what’s going on in the surrounding area, programming by coincidence in the hopes of getting things right. But what nobody is going to do is say “yep, this makes sense, and it’s a good, solid base for building further understanding of the application.” And, since that’s the case — since this abstraction won’t make any sense even with some helpful prodding — I don’t want to hear about the design struggles, technology limitations, or whatever else led to this point. It’s only going to desensitize me to a bad abstraction and encourage me to perpetuate it later.

Your code is only as good as the abstractions that define it. This is true whether your consumers are end-users, UI designers, or other developers. It doesn’t matter if you’ve come up with the most magical, awesome, efficient or slick internal algorithm if you have a bad outward-facing set of abstractions because people’s reactions will range from avoidance to annoyance, but not appreciation. I’ve touched on this before, tangentially. On the flip side, clients will tend to appreciate an intuitive API, regardless of what it does or doesn’t do under the hood.

My point here isn’t to encourage marketing or salesmanship of one’s own code to other developers, per se, but rather to talk about what makes code “good.” If you are a one-person development team or a hobbyist, this is all moot anyway, and you’re free to get your abstractions wrong until the cows come home. But if you’re not, good abstractions are important. As a developer, ask yourself which of the following you’d rather use (neither is real code; I made them both up):

public interface GoodAbstractions
{
    void Add(Customer customerToAdd);

    void Delete(Customer customerToDelete);

    void Update(Customer customerToUpdate);

    IEnumerable<Customer> Find(Predicate<Customer> searchCriteria);
}


or

public interface BadAbstractions
{
    void Add(int customerId, string customerName);

    void Delete(int customerId);

    void Delete(Customer customer, string customerId, bool shouldValidate = false);

    void Update(Customer customer);

    void OpenDatabase(string connection);

    bool ShouldUseFileInsteadOfDatabase { get; set; }

    List<Customer> GetAllCustomers();

    IEnumerable<Customer> GetAllCustomersAsEnumerable();

    bool Connect();

    List<object> GetAllDatabaseRecords(bool isSqlServer);

    List<Customer> GetSingleCustomer(int customerId);

    void Close(int handle, bool shouldClose);

    void Close(int handle, bool shouldClose, bool alt);
}

I don’t think there’s any question as to which you’d rather use. The second one is a mess — I can hear what you’re thinking:

  1. “Connect to what?”
  2. “What in the world is ‘alt’?!?”
  3. “Why do some mutators return nothing and others bool?”
  4. “Why does Close have a boolean to tell it whether or not you want to close — of course you do, or you wouldn’t call Close!”
  5. “Why are there two deletes that require substantially different information — is one better somehow?”
  6. “What does that thing about files do?”
  7. “Why does add want only some fields?”

Notice the core of the objections has to do with abstractions. Respectively:

  1. There is OpenDatabase() and Close(), but no bookend for Connect(), so it’s a complete mystery what it does and whether you should use it.
  2. The second overload of Close adds a mysterious “alt” parameter that seems to indicate this overload is some kind of consolation-prize method, suggesting a possible temporal dependency.
  3. There appears to be some ad-hoc mixture of exception-based and error-code-based error handling.
  4. Close wants a state flag — you need to keep track of this thing’s internal state for it (inappropriate intimacy).
  5. Does this interface want ad-hoc primitives or first class objects? It can’t seem to make up its mind what defines a Customer.
  6. The file stuff makes it seem like this class is a database access class retrofitted awkwardly for a corner case involving files, which is a completely different ballgame.
  7. The rest of the operations have at least one overload that deals with Customer, but Add doesn’t, indicating that Add is somehow different from the other CRUD operations.

Also, in a broader sense, consider the mixture of layering concepts. This interface sometimes forces you to deal with the database (or file) directly and sometimes lets you deal with business objects. And, in some database operations, it maintains its own state, and in some it asks for your help. If you use this API, there is no clear separation of your responsibilities from its responsibilities. You become codependent collaborators in a terrible relationship.

Contrast this with the first interface. The first interface is just basic CRUD operations, dealing only with a business object. There is no concept of a database (or any persistence) here. As a client of this, you know that you can request Customers and mutate them as you need. All other details (which primitives make up a customer, whether there is a file or a database, whether you’re connected to anything, whether anything is open, etc.) are hidden from you. In this API, the separation of responsibilities is extremely clear.
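
To see that separation in client code, here is a sketch with a hypothetical in-memory implementation (invented for illustration; any persistence mechanism at all could sit behind the interface without the client knowing):

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class Customer
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public interface GoodAbstractions
    {
        void Add(Customer customerToAdd);
        void Delete(Customer customerToDelete);
        void Update(Customer customerToUpdate);
        IEnumerable<Customer> Find(Predicate<Customer> searchCriteria);
    }

    // Hypothetical implementation -- the client neither knows nor cares
    // whether a database, a file, or an in-memory list sits behind this.
    public class InMemoryCustomerRepository : GoodAbstractions
    {
        private readonly List<Customer> _customers = new List<Customer>();

        public void Add(Customer customerToAdd)
        {
            _customers.Add(customerToAdd);
        }

        public void Delete(Customer customerToDelete)
        {
            _customers.Remove(customerToDelete);
        }

        public void Update(Customer customerToUpdate)
        {
            // Replace the stored customer that shares the update's Id.
            _customers.RemoveAll(c => c.Id == customerToUpdate.Id);
            _customers.Add(customerToUpdate);
        }

        public IEnumerable<Customer> Find(Predicate<Customer> searchCriteria)
        {
            return _customers.Where(c => searchCriteria(c));
        }
    }

Client code just adds, updates, and finds Customers; it never opens, connects to, or closes anything.
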

If confronted with both of these APIs, all things being equal, the choice is obvious. But I submit that even if the clean API sits atop buggy code and the second API atop functional code, you’re still better off with the first one. Why? Simply because the stuff under the hood that’s hidden from you can (and, with a clean API like this, probably will) be fixed. What can’t be fixed is the blurring of responsibilities between your code and its code, and the resulting confusion at maintenance time. The clean API draws a line in the sand and says, “business logic is your deal and persistence is mine.” The second API says, “let’s work closely together on everything from the details of database connections all the way up to business logic, and let’s be so close that nobody knows where I begin and you end.” That may be (creepily) romantic, but it’s not the basis of a healthy relationship.

To wit, the developers using the second API are going to get it wrong because it’s hard to get it right. Fixing bugs in it will turn into whack-a-mole because developers will find weird quirks and work-arounds and start to depend on them. When responsibilities are blurred by mixed, weird, or flat-out-wrong abstractions, problems in the code proliferate like infectious viruses. In short, the clean abstraction API has a natural tendency to improve, and the bad abstraction API has a natural tendency to degenerate.

So please, I beg you, consider your abstractions. Apply a “golden rule” and force onto others only abstractions you’d want forced on yourself. Put a little polish on the parts of your code that others are going to be using. Everyone will benefit from it.


The Slippery Scope

Roll Up Your Sleeves

Have you ever stared into a code mess? I don’t mean some little mess, but rather the kind of mess that, as Nietzsche put it, stares back into you if you stare at it for too long. You can almost feel it making you worse at programming. It’s the kind of thing that everyone agrees is a mess with no dissent. It’s the 10,000 line class or the class titled GlobalVariables that houses hundreds of them. It’s the kind of mess that leads to someone having you add a few more lines to that Godzilla class or a few more globals to the mix in order to get things done, and hating yourself as you do it. I think you’ve probably seen it. It’s an experience you don’t quickly forget, and one you may even commemorate with a submission to The Daily WTF.

There are a variety of ways that people respond to this state of affairs. Some will shrug and adopt a “when in Rome” attitude as they blithely break another window in this dilapidated building. Others will refuse to make things worse (at least inasmuch as the group political dynamic allows) without making things better either. Still others will apply the so-called “Boy Scout Rule” toward small, incremental improvements, fixing the application a few squashed globals and factored out methods at a time. But, there is another category of response to this, and it is the theme of my post today. That category is the rolling up of sleeves in an effort to “do this right, once and for all.”

If Some Is Good, More Is Better?

The people who take this latter approach, sticking with the Nietzsche theme, probably consider themselves uber-boy-scouts. Boy scouts aim to leave the code a little better than they find it, but uber-boy-scouts aim to leave the code a lot better. Curiously, however, boy scouts tend to get a lot done over time toward improvement, whereas uber-boy-scouts generally accomplish nothing. What gives? If some is good, isn’t more better?


Are Unit Tests Worth It?

The Unit Test Value Proposition

I gave a presentation yesterday on integrating unit tests into a build. (If anyone is interested in seeing it, feel free to leave a comment, and I’ll post the relevant slides to SlideShare or perhaps make the PowerPoint available for download.) This covered the nuts and bolts of how I had added test running to the build machine, as well as how to verify that a delivery wouldn’t cause unit test failures and thus break the build. For background, I presented some statistics about unit testing and the motivations for a test-guarded integration scheme.

One of the questions that came up during Q&A was from a bright woman in the audience who asked what percentage of development time was spent writing unit tests for an experienced test writer and for a novice writer. My response to this was that it would be somewhat slower going at first, but that an experienced TDD developer was just as fast doing both as a non-testing developer in the short term and faster in the long term (less debugging and defect fixing). From my own personal experience, this is the case.

She then asked a follow up question about what kind of reduction in defects it brought, and I saw exactly what she was driving at. This is why I mentioned that she is an intelligent woman. She was looking for a snap-calculation as to whether or not this was a good proposition and worth adopting. She wanted to know exactly how many defects would be avoided by x “extra” days of effort. If 5 days of testing saved 6 days of fixing defects, this would be worth her time. Otherwise, it wouldn’t.

An Understandable but Misguided Assessment

In the flow of my presentation (which wasn’t primarily about the merits of unit testing, but rather how not to break the build), I missed an opportunity to make a valuable point. I wasn’t pressing and trying to convince people to test if they didn’t want to. I was just trying to show people how to run the tests that did exist so as not to break the build.

Let’s consider what’s really being asked here. She’s driving at an underlying narrative roughly as follows (picking arbitrary percentages):

My normal process is to develop software that is 80% correct and 20% incorrect and declare it to be done. The 80% of satisfied requirements are my development, and the 20% of missed requirements/regressions/problems is part of a QA phase. Let’s say that I spend a month getting to this 80/20 split and then 2 weeks getting the rest up to snuff, for a total of 6 weeks of effort. If I can add unit testing and deliver a 100/0 split, but only after 7 weeks then the unit testing isn’t worthwhile, but if I can get the 100/0 split in under 6 weeks, then this is something that I should do.

Perfectly logical, right?

Well, almost. The part not factored in here is that declaring software to be done when it’s 80% right is not accurate. It isn’t done. It’s 80% done and 20% defective. But, it’s being represented as 100% done to external stakeholders, and then tossed over the fence to QA with the rider that “this is ‘done’, but it’s not done-done. And now, it’s your job to help me finish my work.”

So, there’s a hidden cost here. It isn’t the straightforward value proposition that can be so easily calculated. It isn’t just our time as developers — we’re now roping external stakeholders into helping us finish by telling them that we’ve completed our work and that they should use the product as if it were reliable when it isn’t. This isn’t like submitting a book to an editor and having them perform quality assurance on it. In that scenario, the editor’s job is to find typos, and your job is to nail down the content. In the development/QA world, your job is to ensure that your classes (units) do what you think they should, and it’s QA’s job to find integration problems, instances of misunderstood requirements, and other user-test-type things. It’s not QA’s job to discover an unhandled exception where you didn’t check a method parameter for null — that’s your job. And, if you have problems like that in 20% of your code, you’re wasting at least two people’s time for the price of one.
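
As a concrete sketch (hypothetical example), the kind of defect that should never reach QA is trivially caught by a unit test written alongside the code:

    using System;

    public class Greeter
    {
        // Guarding the parameter is the developer's job, not QA's.
        public string Greet(string name)
        {
            if (name == null)
            {
                throw new ArgumentNullException(nameof(name));
            }
            return "Hello, " + name.Trim() + "!";
        }
    }

A one-line test that passes null documents and enforces this contract at build time; discovering the same thing via a QA crash report wastes two people’s time instead of one.
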

Efficiency: Making More Mistakes in Less Time

Putting a number to this in terms of “if x is greater than y, then unit testing is a good idea” is murkier than it seems because of the waste of others’ time. It gets murkier still when concepts like technical debt and stakeholder trust of developers are factored in. Tested code tends to be a source of less technical debt, given that it’s usually more modular, maintainable, and flexible. Tested code also tends to inspire more confidence in collaborators: you may run a little behind schedule here and there, but when things are delivered, they work.

On the flipside of that, you get into the proverbial software death march, particularly in less agile shops. Some drop-dead date is imposed for feature complete, and you frantically write duct-tape software up until that date, and then chuck whatever code grenade you’re holding over the QA wall and hope the shrapnel doesn’t blow back too hard on you. The actual quality of the software is a complete mystery and it may not be remotely close to shippable. It almost certainly won’t be something you’re proud to be associated with.

One of my favorite lines in one of my favorite shows, The Simpsons, comes from the Homer character. In an episode, he randomly decides to change his name to Max Power and assume a more go-getter kind of identity. At one point, he tells his children, “there are three ways of doing things: the right way, the wrong way, and the Max Power way.” Bart responds by saying, “Isn’t that just the wrong way?” to which Homer (Max Power) replies, “yes, but faster!”

That’s a much better description of the “value” proposition here. It’s akin to being a student and saying “It’s much more efficient to get grades of C and D because I can put in 10 hours per week of effort to do that, versus 40 hours per week to get As.” In a narrow sense that’s true, but in the broader sense of efficiency at being a good student, it’s a very unfortunate perspective.  The same kind of nuanced perspective holds in software development.  Sacrificing an objective, early-feedback quality mechanism such as unit tests in the interests of being more “efficient” just means that you’re making mistakes more efficiently.  And, getting more things wrong in the same amount of time is a process bug — not a feature.

So, for my money, the idea of making a calculation as to whether or not verifying your work is worthwhile misses the point.  Getting the software right is going to take you some amount of time X.  You have two options here.  The first option is to spend some fraction of X working and then claim to be finished when you’re not, at which point you’ll spend the other portion of the fraction “fixing” the fact that you didn’t finish.  The second option is to spend the full time X getting it right.

If you set a standard for yourself that you’re only going to deliver correct software, the timelines work themselves out.  If you have a development iteration that will take you 6 weeks to get right, and the business tells you that you only get 4, you can either deliver them “all” of what they want in 4 weeks with the caveat that it’s 33% defective, or you can say “well, I can’t do that for you, but if you pick this subset of features, I’ll deliver them flawlessly.”  Any management that would rather have the “complete” software with defect landmines littering 33% of the codebase than 2/3rds of the features done right needs to do some serious soul-searching.  It’s easy to sell excellent software with the most important 2/3rds of the features and the remaining third two weeks out.  It’s hard to sell crap at any point in time.

So, the real value proposition here boils down only to “do I want to be adept at writing unreliable software or do I want to be adept at writing software that inspires trust?”


Forget Design Documents

Waterfalls Take Time

I sat in on a meeting the other day and heard some discussion about late-breaking requirements and whether or not these requirements should be addressed in a particular release of software. One of the discussion participants voiced an opinion in the negative, citing as rationale that there was not sufficient time to generate a document with all of these requirements, generate a design document to discuss how these would be implemented, and then write the code according to this document.

This filled me with a strange wistfulness. I’m actually not kidding – I felt sad in a detached way that I can’t precisely explain.

The closest that I can come is to translate this normal-seeming statement of process into real work terms. The problem and reason that a change couldn’t be absorbed was because there was no time to take the requirements, transpose them to a new document, write another document detailing exactly what to do, execute the instructions in the second document, and continually update the second document when it turned out not to be prescient.

Or, more concisely, a lot of time was required to restate the problem, take a guess at how one would solve it, and then continually and exhaustively refine that guess.

I think the source of my mild melancholy was the sense that, not only was this wasteful, but it also kind of sucks the marrow out of life (and not in the good, poetic way). If we applied this approach to cooking breakfast, it would be like taking a bunch of fresh ingredients out of the fridge, and then getting in the car with them and going to the grocery store to use them as a shopping list for buying duplicate ingredients. Then, with all the shopping prepared, it’s time to go home, get out the pots and pans, roll up your sleeves and… leave the kitchen to sit at your desk for an hour or two slaving over a written recipe.

Once the recipe was deemed satisfactory in principle by your house guests, cooking would commence. This would involve flour, herbs, spices, eggs, and, of course, lots more writing. If you accidentally used a pinch instead of a dash, it’d be time to update the recipe as you cooked.

Kinda sucks the life right out of cooking, and it’s hard to convince yourself that things wouldn’t have gone better had you just cooked breakfast and jotted down the recipe if you liked it.

Why Do We Do This?

If you’ve never read Samuel Taylor Coleridge’s “The Rime of the Ancient Mariner,” you’ve missed out on a rather depressing story and the origin of the idea of an “albatross around your neck” as something you get saddled with that bogs you down.

Long story short, during the course of this poem, the main character kills an albatross that he superstitiously believes to be holding him back, only to have the thing forced around his neck as both a branding and a punishment for angering spiritual entities with some clout. During the course of time wearing the albatross, all manner of misfortune befalls the mariner.

In the world of software development, (and ignoring the portion of the story where the mariner shoots the albatross), some of our documentation seems to become our albatross. These documents that once seemed our friends are very much our curse — particularly the design document.

No longer do they help us clarify our vision of the software as it is supposed to look, but rather they serve as obligatory bottlenecks, causing us frequent rework and cruelly distracting us by whispering in our ears echoing chants of “CMMI” or “ISO”. Getting rid of them by simply ceasing to do them seems simple to the outsider, but is thoroughly impossible for us software mariners. We soldier on in our doomed existence, listlessly writing and doing our best to keep these documents up to date.

Let’s Just Not Do It

Things need not be this bleak. Here’s a radical idea regarding the design document — just don’t make one. Now, I know what you’re thinking — how are you going to have any idea what to do if you don’t lay it all out in a blueprint? Well, simple. You’ll use a combination of up-front whiteboard sketching and test driven development. These two items will result in both better documentation, and far less wasted time.

President Eisenhower once said, “Plans are worthless, but planning is everything.” This philosophy applies incredibly well to software design.

Capturing your design intentions at every step of the way is truly meaningless. Source control captures the actual state of the code, so who cares what the intentions were at a given moment? These ephemeral notions of intent are best written and left on whiteboards in the offices of developers to help each other collaborate and understand the system.

Does the end-user care about this in the slightest? Of course not! Does anyone at all care about these once they become outdated? Nope. So, why capture and update them as you go?

Capture reality when you’re done, and only as much as-is required by a given stakeholder. The more you document superfluously, the more documents you have that can get out of date. And, when your docs get out of date, you have a situation worse than no documentation — lying documentation.

The other main component of this approach is test driven development. Test driven development forces the creation and continuous updating of arguably the best form of documentation there is for fellow developers: examples. As Uncle Bob points out in the three rules of TDD, the target audience is developers, and given the choice between paragraphs/diagrams and example code, developers will almost without exception choose example code. And, example code/API is exactly what the by-product of TDD is. It’s a set of examples guaranteed to be accurate and current.

And, what’s more, TDD doesn’t just provide unit tests and an accurate sample of how to use the software — it drives good design. A TDD code base tends to be modular, decoupled and flexible simply by the emergent design through the process. This means that TDD as you go is likely to provide you with every bit as good a design as you could hope to have by rubbing your crystal ball and pretending to know exactly what 200 classes are going to be necessary up front.
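
As a sketch of what that by-product looks like (names invented for illustration), a test written first doubles as a usage example that can never silently go stale:

    using System.Diagnostics;

    public class PriceCalculator
    {
        public decimal Total(decimal unitPrice, int quantity)
        {
            return unitPrice * quantity;
        }
    }

    public static class PriceCalculatorTests
    {
        // Written before the implementation, this test is executable
        // documentation: it shows exactly how the API is meant to be called
        // and what result a caller should expect.
        public static void Total_Multiplies_UnitPrice_By_Quantity()
        {
            var calculator = new PriceCalculator();
            Debug.Assert(calculator.Total(2.50m, 4) == 10.00m);
        }
    }

Unlike a design document, when the API changes, this “documentation” fails the build until someone brings it up to date.
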

Is it Really that Simple?

Well, maybe, maybe not. The solution can certainly be just that simple – ceasing to do something you’re in the habit of doing is excellent at forcing change. However, the whiteboard culture and the art of TDD certainly require some conscious practice.

So, you may want to take both for a test drive before simply scrapping your diligent design documentation. But, personally, I’d say “jump in – the water’s fine!” I don’t think I’ve ever regretted not bothering with an ‘official’ design document. I’m happy to draw up documents to communicate the function of the system at a given moment (instruction manuals, package diagrams, quick overviews, whatever), but creating them to be predictive of functionality and updating them from inception to the end of time… no thank you.

The documents that we create are designed to promote understanding and help communication — not to serve as time-sucking millstones around our necks that we cite as reasons for not providing functionality to users.



10 Ways to Improve Your Code Reviews

For a change of pace, I thought I’d channel a bit of Cosmo and offer a numbered article today. I’ve been asked to do a lot of code reviews lately, and while doing them, I’ve made some notes as to what works, what doesn’t, and how I can improve. Here are those notes, enumerated and distilled into a blog post.

  1. Divide and distribute. Have one person look for duplicate code chunks, one look for anti-patterns, one check for the presence of best practices, etc. It is far easier to look through code for one specific consideration than it is to read through each method or line of code looking for anything that could conceivably be wrong. This also allows some people to focus on coarser-granularity concerns (modules and libraries) with others focused on finer ones (classes and methods). Reading method by method sometimes obscures the bigger picture, and concentrating on the bigger picture glosses over details.
  2. Don’t check for capitalization, naming conventions and other minutiae. I say this not because I don’t care about coding conventions (which I kind of don’t), but because this is a waste of time. Static analysis tools can do this. Your build can be configured not to accept checkins/deliveries that violate the rules. This is a perfect candidate for automation, so automate it. You wouldn’t waste time combing a document for spelling mistakes when you could turn on spell-check, and the same principle applies here.
  3. Offer positive feedback. If the code review process is one where a developer submits code and then defends it while others try to rip it to pieces, things become decidedly adversarial, potentially causing resentment. But, even more insidiously, unmitigated negativity will tend to foster learned helplessness and/or get people to avoid code reviews as much as possible.
  4. Pair. If you don’t do it, start doing it from time to time. If you do it, do it more. Pairing is the ultimate in code review. If developers spend more time pairing and catching each other’s mistakes early, code reviews after the fact become quicker and less involved.
  5. Ask why, don’t tell what. Let’s say that someone gets a reference parameter in a method and doesn’t check it for null before dereferencing it. Clearly, this is bad practice. But, instead of saying “put a null check there”, ask, “how come you decided not to check for null — is it safe to assume that you callers never pass you null?” Obviously, the answer to that is no. And, the thing is, almost any programmer will realize this at that point and probably say “oh, no, that was a mistake.” The key difference here is that the reviewee is figuring things out on his or her own, which is clearly preferable to being given rote instruction.
  6. Limit the time spent in a single code review. Given that this activity requires collaboration and sometimes passive attention, attention spans will tend to wane. This, in turn, produces diminishing marginal returns in terms of effectiveness of the review. This isn’t rocket science, but it’s important to keep in mind. Short, focused code reviews will produce effective results. Long code reviews will tend to result in glossing over material and spacing out, at which point you might as well adjourn and do something actually worthwhile.
  7. Have someone review the code simply for first-glance readability/understanding. There is valuable information that can be mined from the reaction of an experienced programmer to new code, sight unseen. Specifically, if the reaction to some piece of the code is “what the…”, that’s a good sign that there are readability/clarity issues. The “initial impression” litmus test is lost once everyone has studied the code, so having someone capture that at some point is helpful.
  8. Don’t guess and don’t assume — instead, prove. Rather than saying things like “I think this could be null here” or “This seems like a bad idea”, prove those things. Unit tests are great. If you see a flaw in someone’s code, expose it with a failing unit test and task them with making it pass. If you think there’s a performance problem or design issue, support your contention with a sample program, blog post, or whitepaper. Opinions are cheap, but support is priceless. This also has the added benefit of removing any feelings of being subject to someone else’s whims or misconceptions.
  9. Be prepared. If this is a larger, meeting-oriented code review, the people conducting the review should have read at least some of the code beforehand and should be familiar with the general design (assuming that someone has already performed and recorded the results from suggestion 7). This allows the meeting to run more efficiently and the feedback to be more meaningful than in a situation where everyone is reading the code for the first time. When that happens, things get missed, since people start to feel uncomfortable as everyone waits for them to understand.
  10. Be polite and respectful.    You would think that this goes without saying, but sadly, that seems not to be the case.  In my career, I have encountered many upbeat, intelligent and helpful people, but I’ve also encountered plenty of people who seem to react to everything with scorn, derision, or anger.  If you know more than other people, help them.  If they’re making basic mistakes, help them understand.  If they’re making the same mistakes multiple times, help them find a way to remember.  Sighing audibly, rolling your eyes, belittling people, etc, are not helpful.  It’s bad for them, and it’s bad for you.  So please, don’t do that.
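
As a sketch of tip 8 in action (hypothetical code), a failing test turns a review opinion into a demonstrable fact:

    using System;

    public class Divider
    {
        // Code under review: nothing prevents division by zero.
        public int Divide(int numerator, int denominator)
        {
            return numerator / denominator;
        }
    }

    public static class DividerReviewTests
    {
        // Instead of commenting "I think this can blow up," the reviewer
        // hands over a test that proves it and tasks the author with making
        // it pass (e.g., by validating the argument explicitly).
        public static bool DivideByZero_IsHandledGracefully()
        {
            try
            {
                new Divider().Divide(10, 0);
                return false;
            }
            catch (DivideByZeroException)
            {
                return false; // unhandled runtime fault: the flaw is real
            }
            catch (ArgumentException)
            {
                return true; // the fix: reject the bad argument up front
            }
        }
    }

Run against the code as written, the test fails, and nobody has to argue about whether the problem is real.
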

Feel free to chime in with additional tips, agreements, disagreements, etc.