How to Keep Method Size Under Control
Do you ever open a source code file and see a method that starts at the top of your screen and kind of oozes its way to the bottom with no end in sight? When you find yourself in that situation, imagine that you’re reading a ticker tape and try to guess at where the method actually ends. Is it a foot below the monitor? Three feet? Does it plummet through the floor and into the basement, perhaps down past the water table and into the earth’s mantle?
Visualized like this, I think everyone might agree that there’s some point at which the drop is too far, though there’s likely some disagreement on where exactly this is. Personally, I used to subscribe to the “fits on a screen” heuristic and would only start looking to pull out methods if it got beyond that. But in more recent years, I think even smaller. How small? I dunno–five or six lines, max. Small enough that you’ll only ever see one try-catch or control flow statement in there. Yeah, seriously, that small. If you’re thinking it sounds kind of crazy, I get that, but give it a try for a while. I can almost guarantee that you’ll lose your patience for looking at methods that cause you to think, “wait, where was loopCounter declared again–before the second or third while loop?”
If you accept the premise that this is a good way to do things or that it might at least be worth a try, the first thing you’ll probably wonder is how to go about doing this from a practical standpoint. I’ve definitely encountered people and even whole groups who considered method sizes like this to be impractical. The first thing you have to do is let go of the notion that classes are in some kind of limited supply and you have to be careful not to use too many. Same with modules, if your project gets big enough. The reason I say this is that having small methods means that you’re going to have a lot of them. This in turn means that they’re going to need to be spread to multiple classes, and those classes will occupy more namespaces and modules. But that’s okay. If you encounter a large application that’s well designed and factored, it’s that way because the application is actually a series of small, focused components working together. Monolithic doesn’t scale well.
Getting Down to Business
If you’ve prepared yourself for the reality of needing more classes organized into more namespaces and modules, you’ve really overcome the biggest obstacle to being a small-method coder. Now it’s just a question of mechanics and practice. And this is actually important–it’s not sufficient to just say, “I’m going to write a lot of methods by stopping at the fifth line, no matter what.” I guarantee you that this is going to create a lot of weird cross-coupling, unnecessary state, and ugly things like out parameters. Nobody wants that. So it’s time to look to the art of creating abstractions.
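To make that warning a little more concrete, here is a minimal sketch of what the "stop at the fifth line" approach tends to produce, next to a split made along abstraction lines. The order-total scenario and every name in it (MechanicallySplitCalculator, AbstractedCalculator, Total, taxRate) are assumptions invented purely for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical example: one long method chopped in half wherever the line count ran out.
public class MechanicallySplitCalculator
{
    // Unnecessary state: this field exists only so the two halves can share data.
    private decimal _runningTotal;

    public decimal Total(IEnumerable<decimal> prices, decimal taxRate)
    {
        _runningTotal = 0m;
        TotalPartOne(prices, out int itemCount);
        TotalPartTwo(itemCount, taxRate);
        return _runningTotal;
    }

    // The out parameter leaks a local variable across the arbitrary cut.
    private void TotalPartOne(IEnumerable<decimal> prices, out int itemCount)
    {
        itemCount = 0;
        foreach (decimal price in prices)
        {
            _runningTotal += price;
            itemCount++;
        }
    }

    // Only works if TotalPartOne ran first: a temporal coupling.
    private void TotalPartTwo(int itemCount, decimal taxRate)
    {
        if (itemCount > 0)
            _runningTotal += _runningTotal * taxRate;
    }
}

// The same behavior, split by what each piece means rather than by line count.
public class AbstractedCalculator
{
    public decimal Total(IEnumerable<decimal> prices, decimal taxRate)
    {
        decimal subtotal = Subtotal(prices);
        return subtotal + Tax(subtotal, taxRate);
    }

    private static decimal Subtotal(IEnumerable<decimal> prices)
    {
        decimal subtotal = 0m;
        foreach (decimal price in prices)
            subtotal += price;
        return subtotal;
    }

    private static decimal Tax(decimal subtotal, decimal taxRate)
    {
        return subtotal * taxRate;
    }
}
```

Both classes compute the same total, but in the second one each method can be read and tested without knowing what ran before it.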
As a brief digression, I’ve recently picked up a copy of Uncle Bob Martin’s Clean Code: A Handbook of Agile Software Craftsmanship and been tearing my way through it pretty quickly. I’d already seen most of the Clean Coder video series, which covers some similar ground, but the book is both a good review and a source of new and different information. To be blunt, if you’re ever going to invest thirty or forty bucks in getting better at your craft, this is the thing to buy. It’s opinionated, sometimes controversial, incredibly specific, and absolutely mandatory reading. It will change your outlook on writing code and make you better at what you do, even if you don’t agree with every single point in it (though I don’t find much with which to take issue, personally).
The reason I mention this book and series is that there is an entire section in the book about functions/methods, and two of its fundamental points are that (1) functions should do one thing and one thing only, and (2) that functions should have one level of abstraction. To keep those methods under control, this is a great place to start. I’d like to dive a little deeper, however, because “do one thing” and “one level of abstraction per function” are general instructions. It may seem a bit like hand-waving without examples and more concrete heuristics.
Extract Finer-Grained Details
What Uncle Bob is saying about mixed abstractions can be demonstrated in this code snippet:
public void OpenTheDoor()
{
    GrabTheDoorKnob();
    TwistTheDoorKnob();
    TightenYourBiceps();
    BendYourElbow();
    KeepYourForearmStraight();
}
Do you see what the issue is? We have a method here that describes (via sub-methods that are not pictured) how to open a door. The first two calls talk in terms of actions between you and the door, but the next three calls suddenly dive into the specifics of how to pull the door open in terms of actions taken by your muscles, joints, tendons, etc. These are two different layers of abstraction: one about a person interacting with his or her surroundings and the other detailing the mechanics of body movement. To make it consistent, we could get more detailed in the first two actions in terms of extending arms and tightening fingers. But we’re trying to keep methods small and focused, so what we really want is to do this:
public void OpenTheDoor()
{
    GrabTheDoorKnob();
    TwistTheDoorKnob();
    PullOpenTheDoor();
}

private static void PullOpenTheDoor()
{
    TightenYourBiceps();
    BendYourElbow();
    KeepYourForearmStraight();
}
Create Coarser-Grained Categories
What about a different problem? Let’s say that you have a method that’s long, but it isn’t because you are mixing abstraction levels:
public void CookQuesadilla()
{
    ChopOnions();
    ShredCheese();
    GetOutThePan();
    AddOilToPan();
    TurnOnTheStove();
    SprinkleOnionsAndCheeseOnTortilla();
    PutTortillaInPan();
    CookUntilFirm();
    FoldTortillaAndCookUntilBrown();
    FlipTortillaAndCookUntilBrown();
    RemoveCookedQuesadilla();
    RemovePanFromStove();
    ScrubPanWithBrush();
    ServeQuesadillas();
}
These items are all at the same level of abstraction, but there are an awful lot of them. In the previous example, we were able to tighten up the method by making the abstraction levels consistent, but here we’re going to actually need to add a layer of abstraction. This winds up looking a little better:
public void CookQuesadilla()
{
    PrepareIngredients();
    PrepareEquipment();
    PerformActualCooking();
    FinishUp();
}

private static void PrepareIngredients()
{
    ChopOnions();
    ShredCheese();
}

private static void PrepareEquipment()
{
    GetOutThePan();
    AddOilToPan();
    TurnOnTheStove();
}

private static void PerformActualCooking()
{
    SprinkleOnionsAndCheeseOnTortilla();
    PutTortillaInPan();
    CookUntilFirm();
    FoldTortillaAndCookUntilBrown();
    FlipTortillaAndCookUntilBrown();
    RemoveCookedQuesadilla();
}

private static void FinishUp()
{
    RemovePanFromStove();
    ScrubPanWithBrush();
    ServeQuesadillas();
}
In essence, we’ve created categories and put the actions from the long method into them. What we’ve really done here is create (or add to) a tree-like structure of methods. The public method is the root, and it had fourteen children. We gave it four children instead, and each of those children has between two and six children of its own. To tighten up methods, it’s perfectly viable to add “nodes” to the “tree” of your call stack. While “do one thing” is still a little elusive, this seems to be carrying us in that direction. There’s no individual method that you look at and think, “boy, that’s a lot of stuff going on.” Certainly it’s a matter of some art and taste, but this is probably a good way to think of it: organize stuff into hierarchical method categories until you look at each method and think, “I could probably memorize what that does if I needed to.”
Recognize that Control Flow Uses Up an Abstraction
So far we’ve been conceptually figuring out how to organize families of methods into well-balanced tree structures, and that’s taken us through some pretty mundane code. This code has involved none of the usual stuff that sends apps careening off the rails into bug land, such as conditionals, loops, assignment, etc. Let’s correct that. Looking at the code above, think of how you’d modify this to allow for the preparation of an arbitrary number of quesadillas. Would it be this?
public void CookQuesadillas(int numberOfQuesadillas)
{
    PrepareIngredients();
    PrepareEquipment();
    for (int i = 0; i < numberOfQuesadillas; i++)
        PerformActualCooking();
    FinishUp();
}
Well, that makes sense, right? Just like the last version, this is something you could read conversationally while in the kitchen just as easily as you do in the code. Prep your ingredients, then prep your equipment, then for some integer index equal to zero and less than the number of quesadillas you want to cook, increment the integer by one. Each time you do that, cook the quesadilla. Oh, wait. I think we just went careening into the nerdiest kitchen narrative ever. If Gordon Ramsay were in charge, he'd have strangled you with your apron for that. Hmm... how 'bout this?
public void CookQuesadillas(int numberOfQuesadillas)
{
    PrepareIngredients();
    PrepareEquipment();
    PerformActualCooking(numberOfQuesadillas);
    FinishUp();
}

private static void PerformActualCooking(int numberOfQuesadillas)
{
    for (int index = 0; index < numberOfQuesadillas; index++)
    {
        SprinkleOnionsAndCheeseOnTortilla();
        PutTortillaInPan();
        CookUntilFirm();
        FoldTortillaAndCookUntilBrown();
        FlipTortillaAndCookUntilBrown();
        RemoveCookedQuesadilla();
    }
}
Well, I'd say that the CookQuesadillas method looks a lot better, but do we like "PerformActualCooking?" The whole situation is an improvement, but I'm not a huge fan, personally. I'm still mixing control flow with a series of domain concepts. PerformActualCooking is still both a story about for-loops and about cooking. Let's try something else:
public void CookQuesadillas(int numberOfQuesadillas)
{
    PrepareIngredients();
    PrepareEquipment();
    PerformActualCooking(numberOfQuesadillas);
    FinishUp();
}

private static void PerformActualCooking(int numberOfQuesadillas)
{
    for (int index = 0; index < numberOfQuesadillas; index++)
        CookAQuesadilla();
}

private static void CookAQuesadilla()
{
    SprinkleOnionsAndCheeseOnTortilla();
    PutTortillaInPan();
    CookUntilFirm();
    FoldTortillaAndCookUntilBrown();
    FlipTortillaAndCookUntilBrown();
    RemoveCookedQuesadilla();
}
We've added a node to the tree that some might say is one too many, but I disagree. What I like is the fact that we have two methods that contain nothing but abstractions about the domain knowledge of cooking and we have a bridging method that brings in the detailed realities of the programming language. We're isolating things like looping, counting, conditionals, etc. from the actual problem solving and story telling that we want to do here. So when you have a method that does a few things and you think about adding some kind of control flow to it, remember that you're introducing a detail to the method that is at a lower level of abstraction and should probably have its own node in the tree.
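The same idea applies to conditionals, not just loops. As a rough sketch, suppose the pan only needs scrubbing some of the time. The panIsDirty parameter, the CleanPanIfNeeded method, and the Steps list that records calls are all hypothetical additions for this example; only the step names are borrowed from the snippets above.

```csharp
using System.Collections.Generic;

// Hypothetical sketch: the conditional gets its own small node in the tree.
public class QuesadillaKitchen
{
    // Records which steps ran, only so the example has observable behavior.
    private readonly List<string> _steps = new List<string>();

    public IReadOnlyList<string> Steps => _steps;

    // Reads as a pure domain story: no if-statements in sight.
    public void PrepareEquipment(bool panIsDirty)
    {
        GetOutThePan();
        CleanPanIfNeeded(panIsDirty);
        AddOilToPan();
        TurnOnTheStove();
    }

    // Bridging method: the only place that knows about the branch.
    private void CleanPanIfNeeded(bool panIsDirty)
    {
        if (panIsDirty)
            ScrubPanWithBrush();
    }

    private void GetOutThePan() => _steps.Add("GetOutThePan");
    private void ScrubPanWithBrush() => _steps.Add("ScrubPanWithBrush");
    private void AddOilToPan() => _steps.Add("AddOilToPan");
    private void TurnOnTheStove() => _steps.Add("TurnOnTheStove");
}
```

PrepareEquipment stays a list of cooking steps, and the question of whether scrubbing happens is answered one level down.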
Adrift in a Sea of Tiny Methods
If you're looking at this cooking example, it probably strikes you that there are no fewer than twenty methods in this class, not counting any additional sub-methods or elided properties (which are really just methods in C# anyway). That's a lot for a class, and you may think that I'm encouraging you to write classes with dozens of methods. That isn't the case. So far what we've done is started to create trees of many small methods with a public method and then a ton of private methods, which is a code smell called "Iceberg Class." What's the cure for iceberg classes? Extracting classes from them. Maybe you turn the first two methods that prepare ingredients and equipment into a "Preparer" class with two public methods, "PrepareIngredients" and "PrepareEquipment." Or maybe you extract a quesadilla cooking class.
It's really going to vary based on your situation, but the point is that you take this opportunity to pick nodes in your growing tree of methods and sub-methods and convert them into roots by turning them into classes. And if doing this leads you to having what seems to be too many classes in your namespace? Create more namespaces. Too many of those in a module? Create more modules. Too many modules/projects in a solution? More solutions.
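Here is one way that extraction might look, as a sketch rather than a prescription. The Preparer class and its two public methods come from the suggestion above; QuesadillaCooker, QuesadillaChef, the constructor wiring, and the shared log are assumptions made up for this example. The leaf steps are elided and FinishUp is left out to keep it short.

```csharp
using System.Collections.Generic;

// Former private nodes of the iceberg class, promoted to the root of their own class.
public class Preparer
{
    private readonly IList<string> _log; // hypothetical: records calls for the example

    public Preparer(IList<string> log)
    {
        _log = log;
    }

    // Leaf steps (chopping, shredding, and so on) would be private methods here.
    public void PrepareIngredients() => _log.Add("PrepareIngredients");

    public void PrepareEquipment() => _log.Add("PrepareEquipment");
}

public class QuesadillaCooker
{
    private readonly IList<string> _log;

    public QuesadillaCooker(IList<string> log)
    {
        _log = log;
    }

    // The looping detail lives here, away from the top-level story.
    public void Cook(int numberOfQuesadillas)
    {
        for (int index = 0; index < numberOfQuesadillas; index++)
            CookAQuesadilla();
    }

    private void CookAQuesadilla() => _log.Add("CookAQuesadilla");
}

// What is left of the original class: a short story told through collaborators.
public class QuesadillaChef
{
    private readonly Preparer _preparer;
    private readonly QuesadillaCooker _cooker;

    public QuesadillaChef(Preparer preparer, QuesadillaCooker cooker)
    {
        _preparer = preparer;
        _cooker = cooker;
    }

    public void CookQuesadillas(int numberOfQuesadillas)
    {
        _preparer.PrepareIngredients();
        _preparer.PrepareEquipment();
        _cooker.Cook(numberOfQuesadillas);
    }
}
```

Each class now has a small public surface and only a handful of private helpers beneath it, so none of them is an iceberg on its own.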
Here's the thing: the complexity exists no matter how many or few methods/classes/namespaces/modules/solutions you have. Slamming them all into monolithic constructs together doesn't eliminate or even hide that complexity, though many seem to take the ostrich approach and pretend that it does. Your code isn't somehow 'simpler' because you have one solution with one project that has ten classes, each with 300 methods of 7,000 lines. Sure, things look simple when you fire up the IDE, but they sure won't be simple when you try to debug. In fact, they'll be much more complicated because your functionality will be hopelessly interwoven with weird temporal couplings, ad-hoc references, and hidden dependencies.
If you create large trees of functionality, you have the luxury of making the structure of the tree the representative of the application's complexity, with each node an island of simplicity. It is in these node-methods that the business logic takes place and that getting things right is most important. And by managing your abstractions, you keep these nodes easy to reason about. If you structure the tree correctly and follow good OOP design and practice, you'll find that even the structure of the tree is not especially complicated since each node provides a good representative abstraction for its sub-tree.
Having small, readable, self-documenting methods is no pipe dream. Really, with a bit of practice, it's not even very hard. It just requires you to see code a little bit differently. See it as a series of hierarchical stories and abstractions rather than as a bunch of loops, counters, pointers, and control flow statements, and the people that maintain what you write, including yourself, will thank you for it.
Great post, Erik. As you’ve noted, organizing code this way aids understanding, testing, maintaining, and refactoring. I’ve even used it as a low-level design technique – outlining the flow via method stubs, then filling in the blanks.
I’ve done the same from time to time. I usually find that this sort of skeleton-ing is good for helping me decide what goes in which class.
Good write-up. In my experience, the problem with methods that are too long is pretty common in practice. One of the benefits of “many small methods” is that for each method you create you get to pick a name that describes what it does. If it is hard to pick a good name, maybe the method does too much. The name also works as a form of documentation. When reading the code, once I’ve read through a method like PrepareEquipment(), I trust it and know what it does. Then the method name becomes a handy short-hand so I don’t have…
The naming aspect is a great point. Hand in hand with gravitating toward smaller methods, I’ve also taken to the practice of pulling out a method anytime I feel like something needs a comment or an explanation. Especially with conditionals. As soon as I’m doing something like if(isReady && x == 5 && y == 9) I now immediately think, “why don’t I pull that into a method so that a maintenance programmer has a chance of understanding what that means?”
I finally got around to blogging about my take on the use of short methods, “7 Ways More Methods Can Improve Your Program” http://henrikwarne.com/2013/08/31/7-ways-more-methods-can-improve-your-program/
I like that post — especially the bullet about grouping like functionality together. I think that one of the real problems with sprawling methods tends to be that functionality gets spread around haphazardly with the components of a given action being strewn all over the method.
While I agree that you should break down code into methods and that you get closer to the goal of maintainability and readability, the approach is still not the best. All of the above is still highly procedural programming; you might not have one huge method, but refactoring the above is still complicated… For example, what if, all of a sudden, the tortilla shouldn’t be warmed up until firm, and a flour tortilla only needs to be warm while a corn tortilla should be firm? In short, it should be less procedural programming and more OO using…
Heh, yeah, that code is definitely “command and control” and procedural as it gets. The reason for this is that I wanted to create a very focused and contrived example to focus on the subject of abstractions, and the easiest thing to reason about is a method with no parameters and no return values (especially when you can’t see what it’s actually doing). I think in this narrow context focusing on inverting control and polymorphism would confuse the issue. If I were writing this out into a book or a series of posts, what I’d probably do is go through…
I’m actually going to argue that this style of coding is appropriate for the complexity of the example (I’d hate to see it go much bigger, though). I see a lot of people who are highly motivated to make everything object oriented, and I consider that a fault. This code is great the way it is at the time it was written. It works on one type of tortilla very well, and it’s easy to read. Implementing a strategy pattern at this stage in the game would only add unnecessary places where the code could break, and would be a…
I found your blog through reddit and I love the way you present ideas. Please keep writing such awesome posts. You’re a godsend for a still-learning programmer like me.
Thanks for the kind words! I’m glad you enjoy the posts, and I’m happy if they help. I’m pretty much always happy to go on about code, so they’ll definitely keep coming 🙂
If your function/method body is too long, I find it a good idea to compress to base64. The pre-processor then must uncompress it before compilation (which takes a little bit of time). You get the best of all worlds – long methods in a compact form.
Oh my… :0