Refactoring is a Development Technique, Not a Project
Editorial note: I originally wrote this post for the NDepend blog. Go check it out over there and stay for a while to look around. If you like posts about code analysis, metrics, and the business of software development, you’ll like the collection of posts there.
One of the more puzzling misconceptions that I hear pertains to the topic of refactoring. I consult on a lot of legacy rescue efforts that will need to involve refactoring, and people in and around those efforts tend to think of “refactor” as “massive cleanup effort.” I suspect this is one of those conflations that happens subconsciously. If you actually asked some of these folks whether “refactor” and “massive cleanup effort” were synonyms, they would say no, but they never conceive of the terms in any other way during their day-to-day activities.
Let’s be clear. Here is the actual definition of refactoring, per Wikipedia.
Code refactoring is the process of restructuring existing computer code – changing the factoring – without changing its external behavior.
Significantly, this definition mentions nothing about the scope of the effort. Refactoring is changing the code without changing the application’s behavior. This means the following would be examples of refactoring, provided they changed nothing about the way the system interacted with external forces.
- Renaming variables in a single method.
- Adding whitespace to a class for readability.
- Eliminating dead code.
- Deleting code that has been commented out.
- Breaking a large method apart into a few smaller ones.
I deliberately picked the examples above because they should be semi-understandable, even by non-technical folks, and because they all scale down to the tiny. A developer could do some of these activities in under a minute. These are simple, low-effort refactorings.
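To make a few of these concrete, here is a minimal before-and-after sketch in Python. The invoice scenario, the function names, and the 20% tax rate are all invented for illustration. The only point is that the two versions return the same results for the same inputs.

```python
# Hypothetical example: the invoice domain, names, and tax rate are assumptions.


def calc(l, t):
    # Before: terse names, commented-out code, one function doing everything.
    s = 0
    for i in l:
        s = s + i[0] * i[1]
    # s = s * 0.9  # old discount, no longer used
    if t:
        s = s + s * 0.2
    return round(s, 2)


# After: variables renamed, commented-out code deleted,
# and the large function broken into smaller ones.
TAX_RATE = 0.2


def subtotal(line_items):
    """Sum price * quantity over all (price, quantity) pairs."""
    return sum(price * quantity for price, quantity in line_items)


def add_tax(amount):
    """Apply the flat tax rate to an amount."""
    return amount + amount * TAX_RATE


def invoice_total(line_items, taxable):
    """Same inputs and outputs as calc, built from smaller named pieces."""
    amount = subtotal(line_items)
    if taxable:
        amount = add_tax(amount)
    return round(amount, 2)


if __name__ == "__main__":
    items = [(10.0, 2), (2.5, 4)]
    # External behavior is unchanged, which is what makes this a refactoring.
    print(calc(items, True) == invoice_total(items, True))
    print(calc(items, False) == invoice_total(items, False))
```

Each of the three changes could have been made on its own, in a separate commit, in a minute or two.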
Let’s now consider another definition of refactoring that can be found at Martin Fowler’s website.
Refactoring is a controlled technique for improving the design of an existing code base. Its essence is applying a series of small behavior-preserving transformations, each of which “too small to be worth doing”. However the cumulative effect of each of these transformations is quite significant.
I took the Wikipedia definition and used it to suggest that refactorings could be small and low-effort. Fowler takes it a step further and suggests that they should be small and low-effort. In fact, he suggests that they should be “too small to be worth doing.” That’s fascinating.
The All or Nothing Fallacy
Why do you think Martin Fowler would say this? It’s a paradoxical statement — encouraging people to do things that are not worth doing. And yet, there’s a powerful message being relayed here: “refactor, even when you think it’s not worthwhile, because refactoring toward clean code is a war of attrition.”
What Martin Fowler is rebelling against with this phrasing is what I’m calling the all or nothing fallacy here. And that fallacy is what causes the conflation between “refactoring” and “massive cleanup.” It’s the idea that cleaning up a code base happens in massive batches or it doesn’t happen at all.
Refactoring as House Cleaning
Think of the approach most take to having a clean house. The house naturally tends toward dirtiness unless counter-measures are taken, so there needs to be a “clean house plan,” if you will. Some people maintain a strict and rigid set of policies to keep the house clean: no shoes in the house, food only in the kitchen, etc. If you stick to this discipline, the house will remain clean. There’s never any massive effort occurring, but the house stays clean through a constant series of “micro-efforts.”
On the opposite end of the spectrum, consider a house that’s utterly disgusting. The inhabitants eat food anywhere and throw the garbage on the floor when they’re done. No one ever dusts, vacuums, scrubs, or cleans in any way. People walk into the house with muddy shoes without a second thought.
Now imagine that you’re brought in to help turn the disgusting house into the spotless house. You can’t simply declare the disgusting house condemned and move the inhabitants — you have to stay with the same house and the same people. Naturally, you call for some pretty extensive cleanup. But that cleanup never happens.
It never happens because the slobs, for all of their poor house maintenance, have lives and jobs. They have things to do. They can’t simply take weeks off work to declutter the house, prepare for carpet steam cleanings, re-grout the bathroom tile, etc. You call for massive cleaning and they look at you, saying, “sure, that’d be nice, but we live in the real world.” If you suggest that they do little things to help, they fatalistically ask, “what’s the point?”
Expectations for Cleanup
Martin Fowler is saying to these people, “spend 10 minutes a day doing cleanups that are too small to matter. Do it every day, like clockwork, and, after enough time, you’ll start to see results that actually matter.”
He’s right about the house, and he’s right about code. Refactoring is not a 3-month break that you take from delivering anything of value, so that you can “clean up.” Refactoring is a series of tiny decisions that you make in your code base that add up to big value over the long haul. It’s the equivalent of spending 40 years putting your spare change into a piggy bank that goes into a retirement account.
So when I’m asked by managers and CIOs whether they should stop work for a while and spend a bunch of money on a cleanup, my answer often surprises them: “of course not.” “Your developers,” I tell them, “need to learn to keep the house clean on a daily basis before a massive cleanup can be sustainable.”
This is a really great analogy that you’ve made here.
Thanks!
I think of refactoring more as restructuring, i.e. changing how the code works; the design of the code (as per the Wikipedia definition). I do not consider it just tidying up, or just changing the code as you say. When tidying up, yes little bits can be done here and there, like changing the spacing, changing variable names, but to my mind that is not refactoring. Refactoring requires planning and thought, tidying does not.
and thought it requires …
For tidying and small refactoring I think this is an excellent analogy you’ve made.
But when you run into the limitations of the current architecture or design of the code you’re dealing with, you don’t get away with a 10-minute cleanup job. Sometimes you find yourself not cleaning up the mess in the living room but moving the kitchen out of the living room to a separate room.
Without a concrete codebase for us to look at, it’s hard to get too specific, but I tend to find that the sorts of large efforts you mention can be broken down into small chunks and undertaken in the same fashion. For instance, if you have a lot of singletons and global state, you can win a war of attrition by gradually making client classes take an interfaced wrapper in their constructor. Eventually, when this has been done everywhere, you can take the next step of factoring away from global state. This is a simple example, but I think it…
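The first step described in that reply (client classes taking an interfaced wrapper in their constructor) might look something like the following minimal Python sketch. Every name in it (AppConfig, Settings, SingletonSettings, PriceFormatter, FakeSettings) is hypothetical, since the thread does not refer to any specific code base.

```python
from typing import Protocol


class AppConfig:
    """Stand-in for an existing singleton holding global state (hypothetical)."""

    _instance = None

    def __init__(self):
        self.values = {"currency": "USD"}

    @classmethod
    def instance(cls):
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance


class Settings(Protocol):
    """The interface that client classes depend on instead of the singleton."""

    def get(self, key: str) -> str: ...


class SingletonSettings:
    """Wrapper that, for now, still delegates to the singleton."""

    def get(self, key: str) -> str:
        return AppConfig.instance().values[key]


class PriceFormatter:
    """A migrated client class: its dependency arrives through the constructor."""

    def __init__(self, settings: Settings):
        self._settings = settings

    def format(self, amount: float) -> str:
        return f"{amount:.2f} {self._settings.get('currency')}"


class FakeSettings:
    """Test double, usable because the client no longer reaches for global state."""

    def __init__(self, values):
        self._values = values

    def get(self, key: str) -> str:
        return self._values[key]


if __name__ == "__main__":
    # Production wiring still goes through the singleton.
    print(PriceFormatter(SingletonSettings()).format(3))
    # A test can now swap in its own settings.
    print(PriceFormatter(FakeSettings({"currency": "EUR"})).format(1.5))
```

Client classes can be migrated one at a time, alongside feature work. Once none of them refers to AppConfig directly, the wrapper is the only place left to change when the global state itself is factored away.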
If you have global state and singletons (singleton means 1), you have bad design. This is brought about by bad programming, mainly due to bad tools like Java, somewhat like a badly designed house: always fixing problems or creating a mess and adding to it.
There is no such thing as design patterns, refactoring, etc. in languages like Smalltalk, Lisp, Erlang, etc., because the design either works or it does not. If it is a bad design, throw it out and start again, not refactor.
I don’t encounter a lot of clients asking me for advice on legacy systems that would heed the advice, “throw out the last 10 years of work because you did it in Java.”
Most of my later career has been to clean up these sort of messes left behind, mostly large enterprise applications and services. Read any job ads and you will see stuff like. Job Description: This position requires developer with in depth technical expertise WebSphere technologies Job responsibilities · Provides technical expertise, documentation and direction in support of Web Services Systems Level One Support, Standard Operating and Troubleshooting activities: JVM Restarts ……………….etc or Needed expert in java threading to clean up the mess left by last developers. or Needed NoSql expert to make our mess of a database work. Refactoring will…
Peter, your comparison reminds me of this: https://twitter.com/akiva/status/476469927998537728
Also this article ignores the overhead of testing. No matter how small the change, the codebase should go through testing, both automated and manual. This is sometimes a large effort in itself – and therefore making “a small change” in development isn’t allowed (or worth the testing effort).
Whether large or small, a ‘refactoring’ without automated test coverage (or some intense manual scheme) is just a change with possible regressions. I’m not ‘ignoring’ testing — I’m considering it table stakes for this discussion.
I think the testing cost is actually a key reason why doing small bits of refactoring along the way of normal development is a better approach. If you are doing a feature for which testing is already allocated/expected, then the bit-by-bit refactoring changes get worked through that existing cost/test cycle. Whereas if you try to carve out a large chunk of refactoring change, then you must carve out specific cost/test time for it as a standalone item, which is very costly. Sure, some changes are bigger than others, but that is not at the heart of what this post aims to…
Hmmm… no comments on “breaking” things? I suppose if all the work was done by “perfect” programmers that never, ever make any mistakes.
I have been programming since 1977 and still make mistakes, even when I try hard.
In those 38 years I think I have made every mistake possible 3 times over!
Dumb me, I guess!
I’m sorry, but I’m not sure I understand… is this in reply to another comment here?
So a plumber comes to the house and is asked to “re-factor” a little as he does his normal plumbing stuff. The plumber says “I am on the plumbing team– I do not know much about the rest of your house and any ‘re-factor’ that I do could potentially increase the scope of my plumbing (read ‘screw stuff up’).” He then explains exactly your points in your article (great btw) but then adds this: The goal of the plumber is not to re-factor when he makes a call — it is to quickly fix the leak and keep the house…
This is a good point. While I think, philosophically, that anyone touching the code should be empowered (and even obligated) to perform “boy scout” refactorings, not all stakeholders in the code base are created equal. There are certainly situations where the best outcome for the business is achieved by a developer holding his nose, adding one more case to the switch statement, and touching absolutely nothing else.
I’m not sure the plumber analogy works here. Is there a leak (broken code) or is there a need for cleanup (cleanup bad soldering, tighten clamps etc)?
To beat a dead horse with a plumber… (20 minutes and 3 paragraphs later he deletes it all). I am just trying to say that when Erik makes the point “… cleaning up a code base happens in massive batches or it doesn’t happen at all” it is because the goals of the various stakeholders are sometimes different. Some have a short range goal of getting a feature out. Some have an URGENT goal of fixing a production ticket. And still others have a goal of keeping production environments and apps stable. I think “refactoring as a development technique” can…
Another good use of the plumber analogy might be if a plumber were to repeatedly ask for your permission to add some more weld to a pipe joint of some sort. The client’s reply might be, “No, just make the water work, and apply your expertise to ensure it continues working!”
So if all developers do the daily cleanup, no major rework is needed. But what if that is not the case, or a mistake was made, resulting in a poor design choice that needs to be addressed? Are you saying that one is not allowed to call the effort to fix the faulty design a refactoring? Or that one is supposed to ignore the problem by saying “of course not”? Most software houses I know are neither tidy nor messy. They are in between, and your metaphor doesn’t describe that well.
I’m not advocating that teams not undertake sizable changes to the structure of the code. I’m advocating that teams not put the brakes on delivering value in order to do so (e.g. “we’re not doing any more features until we spend the next 2 months ‘cleaning up’ this code.”)
But you did limit such changes to “small behavior-preserving transformations”. That doesn’t address the occasional need to rework critical infrastructure parts (what you call ‘cleaning up’), which may, in extreme cases, require multiple months of work. While the rest of the team should be delivering features, whoever is doing these sizable changes (to whom you refer as ‘we’) won’t be delivering features all that time. Nothing says that an agile process, like scrum, should not be flexible (which implies that sprint masters don’t overuse the word ‘No’). My problem is not with your definition of refactoring, which is a good one, but with…
This is a very good article, thanks for sharing. I would like to ask whether I should refactor just my own code or also others’ code.
My personal opinion is that teams should be comfortable refactoring both their code and the code of others. This is in line with the Extreme Programming (XP) principle of “collective code ownership.” The idea is that the team, and not individuals, owns the code.
Thank you for the second time.
Refactoring is reworking bad design and coding.
If you get to the refactoring stage, it is the same as slobs who have not kept their house tidy and ended up with a mess.
Successful software development needs good design and coding, and refactoring should never enter the equation.
Design is a hard problem. You typically don’t know the best design until it’s practically staring you in the face. If you think all the code you write is as optimal as it could be the first time you write it, you are either correct, and a veritable genius, or incorrect, and extremely delusional.
Glad to hear sane voice in a *simple* world.
Design is not a hard problem, it is an art: either you are good at it or not. There is plenty of software out there that is well designed and has been running for years with no problems (take the Apple II designed by Wozniak, with no known bugs in hardware or software). There is far more that is bad. Most developers I have worked with are pretty bad at their jobs; to them it is a career, and they are just assembly-line developers. Design patterns and best practice come to mind, meaning we don’t know what we are doing so just use best practice. In short the…
Mmmmm… No. Programming is not art, it’s engineering. There are right and wrong answers. Arriving at these answers requires information, facts, data. Often times, not all the data is available at the outset of a project. This is precisely why people develop prototypes.
Also, I agree with the rest of what you said, though it wasn’t particularly relevant to the conversation.
Design is definitely an art.
Programming is more like craftsmanship, based on experience, skills and talent, as with any craftsmanship, everyone is not necessarily gifted with the same skills nor do they possess the ability to do the same things.
You need to know how to do things, and in some cases you need to do things [like reviewing requirements, developing and testing] in a sequence. You learn the best way to develop as you go.
_Graphic_ design is art. _Software_ design is engineering.
If the customer pays for a house, he won’t care about whether it’s tidy inside or not. I’ve seen more projects failing due to this academic approach than due to code that was less than ideal.
All code is less than ideal, and maintenance is expensive, almost independent of the implementation of an adaptive design (which might be considered bad in just a few years).
In a dream world, yes. In the real world, “successful software development” is practiced by development teams following good practices, teams that are self-correcting in reasonable time after ‘bad’ design choices are inevitably made. The problem with any *modern* software ideas, like OOD and re-factoring, is that people who begin to understand their strengths ignore their limitations, then make things worse by pretending that their view is the only one that works, then sell it as such to managers with deep pockets. Add to that the fact that managers with deep pockets are busy people, who prefer simple reasoning more than…
Firstly, I like the main sentiment of this article, tidy as you go, but I think the responses illustrate that “refactoring” is yet another trendy pigeonhole term. Nobody can agree on the box labels, and not every situation fits in a box, so just forget adhering to an imperfect concept and do whatever needs to be done to make the code maintainable and easy to change within the cost/effectiveness curve. We developers are all immensely talented and intelligent after all, so it shouldn’t be hard, no?
If you leave code unattended, it does not get dirtier as a house does. But if you write dirty code in the first place, yes, you should repeat the work until you do it right. So you had better start writing good code from the first attempt instead of planning to write code twice. That way, unless your language makes some improvements you need to address, your code will work as intended from the very beginning.