Why You Should Do Periodic Reviews of Legacy Code
Editorial Note: I originally wrote this post for the SmartBear blog. Go take a look at the original here, at their site. If you like posts about collaboration, code review, and other topics, take a look around while you’re there.
Legacy code is sort of like your house’s storage crawlspace. It tends to be a repository for things that mattered to you in days past or on special occasions. The code sits there, largely unnoticed, until such time as an odd change or a production bug causes you to dig it up, dust it off, and revisit it. Barring extraordinary circumstances, it tends to sit, largely forgotten, and possibly rotting or getting riddled with moth holes.
By and large, this considered an acceptable and even desirable state of affairs in our industry. A lot of folks that manage software development efforts and hold the purse strings think of software construction the way they think of building construction. Once you’ve built a house, it’s done. Why would you go back and revisit it unless there was some kind of problem that had cropped up? The problem with this well-intentioned, bottom-line thinking is that building software isn’t much like building houses.
When you build software, you stack the new atop the old and everything comes along for the ride. Sure, there may be the occasional new module that stands all by itself or plugs in with the rest, but that’s the exception. The new code interacts with the old stuff, calling into it, relying on it, and running beside it in production. If housing construction worked this way, a short circuit in the house across the street might cause your shower to stop working.
The result is that, however well-intentioned someone encouraging you not to focus on legacy code might be, the edict is often misguided. If the short circuit across the street meant you couldn’t shower, would you listen to someone who told you their wiring was none of your business? Clearly, you wouldn’t go storming over there out of nowhere, remodeling that house, but you wouldn’t just ignore it either.
This is the approach required with legacy code in your code base. The fact that someone typed it out years ago doesn’t mean that it doesn’t have a very real, current impact on your team every time you deploy your code. Because of this, it behooves you to review it occasionally, when time permits.
Let’s examine some specifics as to why it makes sense to audit your legacy code from time to time. Having your finger on the pulse of everything going into production is a compelling but abstract argument. So, let’s get more concrete.
Performance
Have you ever written a while loop that you knew wasn’t very efficient? Of course, it’s not that you were being lazy. Rather, you knew it wasn’t worth optimizing because you’d only ever loop 10 times or so, even in the worst case scenario. Optimizing it simply wasn’t worth while.
Now, imagine that you made this decision back in 2009. Or imagine that someone else did. Is it possible that you might revisit this code only to discover that your “only ever 10 times” assumption is woefully obsolete, and that bit of processing is now happening 10 thousand times?
Over the long haul, even sub-optimal situations like this simply become part of your application’s performance profile, and nobody is necessarily screaming about them. But a quick review of this code might allow you to realize some pretty serious performance gains, especially if there’s some big, fat comment at the top that says, “we should really revisit this if we ever loop through here more than 10 times.”
Consistency
I look at a lot of code bases — enough to notice a rather amusing pattern. Many old code bases in more relaxed shops take on the feel of archaeological digs, where the epochs of code are peeled back in discrete layers. “Oh, this part of the code base was from 2007, when someone discovered the model-view-presenter pattern.” “Oh, and take a look here — this is the 2010 section of the code base when they got really into using aspect oriented programming.”
If you have a code base like this, it’s really, really hard for new team members to infer expected practice from their surroundings. Long tenured team members know the old layers of code by familiarity, but for anyone else, it’s like walking through a house where each room is a different color and trying to figure out what color the owner wants the kitchen painted.
Reviewing the code periodically will mute this phenomenon somewhat through pure observer effect. But, it can also help you identify the most confusing and disparate examples and take tactical opportunities to get consistent.
Defect Prevention
As a code base sprawls over the years and developers come and go, there’s a natural tendency for wheels to be reinvented. This is pretty understandable; sooner or later there’s just so much code that you can’t possibly have a handle on what all of it does. But the result is well known enough that it is mentioned on a list of code smells: “oddball solution.”
Having poorly duplicated code strewn throughout your code base is a recipe for unpredictable defects. A change in requirements might result in a deliberate fix in one version of the solution, with others limping along, quietly being wrong for months. Periodic reviews of old code will tend to mitigate this, and also provide understanding of risk spots.
Learning
When talking about legacy code, there’s always an implied vibe that everyone laboring away in the past was doing sloppy, wrong things. It’s just a natural offshoot of the human tendency to blame all problems in the group on people that have departed. But maybe, just maybe, it’s possible that someone writing code a while ago did something right or even worth emulating.
The reasons for reviewing legacy code aren’t limited to negatives. Taking a look back through old code from time to time has the potential to reveal good techniques, effective patterns, and general practices worth emulating. You might actually learn something from the people who have been at this a while.
Every Bit Helps
I’d like to be clear here. I am not recommending that you bring the business surrounding your team to its knees while you spend weeks pouring through every nook and cranny of your team’s source control history. I’m not a lunatic.
All I’m advocating for is that you take a bit of time every now and then to take stock of your team’s older, less frequently visited, code. You might find opportunities to improve your application’s behavior or to standardize your code. You might help catch defects before anyone notices, and you might even learn a thing or two that you can use in your own code. But even if none of that happens on any given review, you’ll understand your group’s code better than you did. You’ll understand what you have in storage.
It’s also surprising sometimes when I review my own old code and ask, “wow, did I really write that?!”. It’s usually a mix of good and bad code though, so definitely learning in some way or other.
Heh, I definitely know what you mean. It can be quite surreal to look at your own code from years ago.
Generally, I like your idea. But who in hell has time to do something like this?
While I’ve encountered no shortage of shops where 60 hours per week still wasn’t enough, there are a good number out there where developers have a lot of idle time. It’s the Goldilocks zone of busy-ness that seems to be most elusive.
In advising orgs to do this, I’d suggest making it an incremental activity on top of existing review, timeboxed or else done only periodically.
If it’s not broken, fix it until it is.
I like the turn of phrase in general, but I’d argue that legacy code is always broken to some degree or another. If the status quo is fine with users, it’s because they’re used to their workarounds for its warts and bugs.
Probably most code is broken to some degree or another. To throw in another aphorism, “There is no tool perfect for many tasks.”
A lot of truth in both statements.
>a natural offshoot of the human tendency to blame all problems in the group on people that have departed
Not all problems… just specific ones like the one in the old code I am fixing right now. If I could just get my hands around the neck of the brainless hack that –**?? wait… that is my old code. Never mind. Legacy code is for learning for sure. 😉
🙂 I’ve had that experience more times than I care to recount.