I would like to preface this post by saying that I don’t like the Singleton design pattern at all. There are some who love it, others who view it as a sometimes-useful, often-abused tool in the toolbox, and then there are people like me who believe that it’s extremely rare to see a use of it that isn’t an abuse — so rare that I consider it better to avoid the pattern altogether. I won’t go into much more detail on my thoughts on the Singleton (stay tuned for that when I get to this pattern in my design patterns series), but here is some further reading that expounds a lot on reasons not to like this pattern:
With all of those arguments against Singleton in mind, and considering the damage that abuse (and I would argue use) of the pattern causes in code bases, I found myself thinking of code bases I’ve heard of or encountered where the code was littered with Singletons. In these code bases, refactoring becomes daunting for a variety of reasons: the hidden dependencies, the complex temporal dependencies, the sheer volume of references to the Singleton instances, the almost obligatory Law of Demeter violations, etc. Singletons cause/encourage a rich variety of problems in code bases. But I think that something could be done about two such problems: the redundant Singleton implementation logic and the Single Responsibility Principle (SRP) violation of which Singleton classes are prima facie guilty.
public class Singleton
{
private static Singleton instance;
private Singleton() {}
public static Singleton Instance
{
get
{
if (instance == null)
instance = new Singleton();
return instance;
}
}
}
More specific even to C# is James Michael Hare’s implementation using the Lazy<T> class to eliminate the redundant if-instance-null-instantiate logic needed in each Singleton class. But that still leaves the redundant Instance property, the redundant _instance field, and the awkwardness (SRP violation) of an object managing its own cardinality. What if we got rid of both of those issues?
public static class Singleton where T : new()
{
private static readonly Lazy _instance = new Lazy();
public static T Instance { get { return _instance.Value; } }
}
public static class SingletonClient
{
public void Demonstrate()
{
Singleton.Instance.Write("Look!! Logger isn't a Singleton -- it's a real class!!");
}
}
Using generics, this implementation allows an instance of every Singleton to be expressed in the code of a single class. In this fashion, any type can be implemented as a Singleton without the type being coupled to an internal Singleton implementation. It also standardizes the Singleton implementation by making sure it exists only once in the code base. With this pattern, there is a similar feel to IoC containers — classes can be implemented without concern for who will instantiate them and how.
Here is my take on the disadvantages and potential disadvantages of this approach:
You still have Singletons in your code base.
This might actually encourage more Singleton usage (though fewer actual Singleton implementations) by removing responsibility for implementation.
Since constructor of object in question must have public default constructor, removes the Singleton gimmick of making the constructor private and thus the safeguard against additional instances being created is also removed.
A natural extension of the previous item is that it remains a matter of documentation or convention to use Singleton<T> and not instantiate rather than it being impossible to instantiate.
But the advantages that I see:
Only one class implements cardinality management, meaning refactoring away from Singleton is nominally easier.
No Singleton SRP violations
Singleton implementation (initialization, lazy load, thread-safety) is standardized to prevent inconsistent approach.
I actually consider preventing the private constructor gimmick to be a plus — particularly where potential refactoring may be needed (and it pretty much always is when Singletons exist).
With this pattern in your code base developers are less likely to be come comfortable/familiar with implementing the Singleton pattern.
No need to introduce interfaces for Singletons to inject instance into classes using dependency injection — just inject a non-Singleton managed instance.
A code base with nothing but normal instances is very easy to reason about and test, so the closer you get to that, the better.
I would like to reiterate that I’m not suggesting that you run out and start doing this everywhere. It mitigates some of the natural problems with the Singleton design pattern but “mitigates some” is a long way from “fixes all”. But, if you have a code base littered with Singletons this could potentially be a nice intermediate refactoring step to at least standardize the singleton implementation and provide slightly more friendly seams to testing legacy code. Or if you’re faced with and outvoted by a group of other developers that love their global variables (er, excuse me, Singletons), this might be a decent compromise to limit the damage they cause to the code’s maintainability. On balance, I’d say that this is an interesting tool to add to your arsenal, but with the caveat that something is probably wrong if you find that you have a use for it.
I’d be curious to hear others’ takes on this as well. I just kind of dreamed this up off the top of my head without taking a lot of time to explore all potential ramifications of it use. If you’ve implemented something like this before or if you have opinions on the advantages/disadvantages that are different than mine or if you see ways it could be improved, I’d be interested to hear about it in the comments.
When I was a freshman in college, I took a class called “Great Theoretical Ideas In Computer Science”, taught by Professor Steven Rudich. It was in this class that I began to understand how fundamentally interwoven math and computer science are and to think of programming as a logical extension of math. We learned about concepts related to Game Theory, Cryptography, Logic Gates and P, NP, NP-Hard and NP-Complete problems. It is this last set that inspires today’s post.
This is not one of my “Practical Math” posts, so I will leave a more detailed description of these problems for another time, but I was introduced to a fascinating concept here: that you can make powerful statements about solutions to problems without actually having a solution to the problem. The mechanism for this was an incredibly cool-sounding concept called “The Problem Solving Oracle”. In the world of P and NP, we could make statements like “If I had a problem solving oracle that could solve problem X, then I know I could solve problem Y in polynomial (order n squared, n cubed, etc) time.” Don’t worry if you don’t understand the timing and particulars of the oracle — that doesn’t matter here. The important concept is “I don’t know how to solve X, but if I had a machine that could solve it, then I know some things about the solutions to these other problems.”
It was a powerful concept that intrigued me at the time, but more with grand visions of fast factoring and solving other famous math problems and making some kind of name for myself in the field of computer science related mathematics. Obviously, you’re not reading the blog of Fields Medal winning mathematician, Erik Dietrich, so my reach might have exceeded my grasp there. However, concepts like this don’t really fall out of my head so much as kind of marinate and lie dormant, to be re-appropriated for some other use.
Magic Boxes: Oracles For Programmers
One of the most important things that we do as developers and especially as designers/architects is to manage and isolate complexity. This is done by a variety of means, many of which have impressive-sounding terms like “loosely coupled”, “inverted control”, and “layered architecture”. But at their core, all of these approaches and architectural concepts have a basic underlying purpose: breaking problems apart into isolated and tackle-able smaller problems. To think of this another way, good architecture is about saying “assume that everything else in the world does its job perfectly — what’s necessary for your little corner of the world to do the same?”
That is why the Single Responsibility Principle is one of the most oft-cited principles in software design. This principle, in a way, says “divide up your problems until they’re as small as they can be without being non-trivial”. But it also implies “and just assume that everyone else is handling their business.”
Consider this recent post by John Sonmez, where he discusses deconstructing code into algorithms and coordinators (things responsible for getting the algorithms their inputs, more or less). As an example, he takes a “Calculator” class where the algorithms of calculation are mixed up with the particulars of storage and separates these concerns. This separates the very independent calculation algorithm from the coordinating storage details (which are of no consequence to calculations anyway) in support of his point, but it also provides the more general service of dividing up the problem at hand into smaller segments that take one another granted.
Another way to think of this is that his “Calculator_Mockless” class has two (easily testable) dependencies that it can trust to do their jobs. Going back to my undergrad days, Calculator_Mockless has two Oracles: one that performs calculations and the other that stores stuff. How do these things do their work? Calculator_Mockless doesn’t know or care about that; it just provides useful progress and feedback under the assumption that they do. This is certainly reminiscent of the criteria for “Oracle”, an assumption of functionality that allows further reasoning. However, “Oracle” has sort of a theoretical connotation in the math sense that I don’t intend to convey, so I’m going to adopt a new term for this concept in programming: “Magic Box”. John’s Calculator_Mockless says “I have two magic boxes — one for performing calculations and one for storing histories — and given these boxes that do those things, here’s how I’m going to proceed.”
How Spaghetti Code Is Born
It’s one thing to recognize the construction of Magic Boxes in the code, but how do you go about building and using them? Or, better yet, go about thinking in terms of them from the get-go? It’s a fairly sophisticated and not always intuitive method for deconstructing programming problems.
To see what I mean, think of being assigned a hypothetical task to read an XML file full of names (right), remove any entries missing information, alphabetize the list by last name, and print out “First Last” with “President” pre-pended onto the string. So, for the picture here, the first line of the output should be “President Grover Cleveland”. You’ve got your assignment, now, quick, go – start picturing the code in your head!
What did you picture? What did you say to yourself? Was it something like “Well, I’d read the file in using the XDoc API and I’d probably use an IList<> implementer instead of IEnumerable<> to store these things since that makes sorting easier, and I’d probably do a foreach loop for the person in people in the document and while I was doing that write to the list I’ve created, and then it’d probably be better to check for the name attributes in advance than taking exceptions because that’d be more efficient and speaking of efficiency, it’d probably be best to append the president as I was reading them in rather than iterating through the whole loop again afterward, but then again we have to iterate through again to write them to the console since we don’t know where in the list it will fall in the first pass, but that’s fine since it’s still linear time in the file size, and…”
And slow down there, Sparky! You’re making my head hurt. Let’s back up a minute and count how many magic boxes we’ve built so far. I’m thinking zero — what do you think? Zero too? Good. So, what did we do there instead? Well, we embarked on kind of a willy-nilly, scattershot, and most importantly, procedural approach to this problem. We thought only in terms of the basic sequence of runtime operations and thus the concepts kind of all got jumbled together. We were thinking about exception handling while thinking about console output. We were thinking about file reading while thinking about sorting strings. We were thinking about runtime optimization while we were thinking about the XDocument API. We were thinking about the problem as a monolithic mass of all of its smaller components and thus about to get started laying the bedrock for some seriously weirdly coupled spaghetti code.
Cut that Spaghetti and Hide It in a Magic Box
Instead of doing that, let’s take a deep breath and consider what mini-problems exist in this little assignment. There’s the issue of reading files to find people. There’s the issue of sorting people. There’s the issue of pre-pending text onto the names of people. There’s the issue of writing people to console output. There’s the issue of modeling the people (a series of string tuples, a series of dynamic types, a series of anonymous types, a series of structs, etc?). Notice that we’re not solving anything — just stating problems. Also notice that we’re not talking at all about exception handling, O-notation or runtime optimization. We already have some problems to solve that are actually in the direct path of solving our main problem without inventing solutions to problems that no one yet has.
So what to tackle first? Well, since every problem we’ve mentioned has the word “people” in it and the “people” problem makes no mention of anything else, we probably ought to consider defining that concept first (reasoning this way will tell you what the most abstract concepts in your code base are — the ones that other things depend on while depending on nothing themselves). Let’s do that (TDD process and artifacts that I would normally use elided):
public struct Person
{
public string FirstName { get; set; }
public string LastName { get; set; }
public override string ToString()
{
return string.Format("{0} {1}", FirstName, LastName);
}
}
Well, that was pretty easy. So, what’s next? Well, the context of our mini-application involves getting people objects, doing things to them, and then outputting them. Chronologically, it would make the most sense to figure out how to do the file reads at this point. But “chronologically” is another word for “procedurally” and we want to build abstractions that we assemble like building blocks rather than steps in a recipe. Another, perhaps more advantageous way would be to tackle the next simplest task or to let the business decide which functionality should come next (please note, I’m not saying that there would be anything wrong with implementing the file I/O at this point — only that the rationale “it’s what happens first sequentially in the running code” is not a good rationale).
Let’s go with the more “agile” approach and assume that the users want a stripped down, minimal set of functionality as quickly as possible. This means that you’re going to create a full skeleton of the application and do your best to avoid throw-away code. The thing of paramount importance is having something to deliver to your users, so you’re going to want to focus mainly on displaying something to the user’s medium of choice: the console. So what to do? Well, imagine that you have a magic box that generates people for you and another one that somehow gets you those people. What would you do with them? How ’bout this:
public class PersonConsoleWriter
{
public void WritePeople(IEnumerable people)
{
foreach (var person in people)
Console.Write(person);
}
}
Where does that enumeration come from? Well, that’s a problem to deal with once we’ve handled the console writing problem, which is our top priority. If we had a demo in 30 seconds, we could simply write a main that instantiated a “PersonConsoleWriter” and fed it some dummy data. Here’s a dead simple application that prints, well, nothing, but it’s functional from an output perspective:
public class PersonProcessor
{
public static void Main(string[] args)
{
var writer = new PersonConsoleWriter();
writer.WritePeople(Enumerable.Empty());
}
}
What to do next? Well, we probably want to give this thing some actual people to shove through the pipeline or our customers won’t be very impressed. We could new up some people inline, but that’s not very impressive or helpful. Instead, let’s assume that we have a magic box that will fetch people out of the ether for us. Where do they come from? Who knows, and who cares — not main’s problem. Take a look:
public class PersonStorageReader
{
public IList GetPeople()
{
throw new NotImplementedException();
}
}
Alright — that’s pretty magic box-ish. The only way it could be more so is if we just defined an interface, but that’s overkill for the moment and I’m following YAGNI. We can add an interface if thing later needs to pull customers out of a web service or something. At this point, if we had to do a demo, we could simply return the empty enumeration instead of throwing an exception or we could dummy up some people for the demo here. And the important thing to note is that now the thing that’s supposed to be generating people is the thing that’s generating the people — we just have to sort out the correct internal implementation later. Let’s take a look at main:
public static void Main(string[] args)
{
var reader = new PersonStorageReader();
var people = reader.GetPeople();
var writer = new PersonConsoleWriter();
writer.WritePeople(people);
}
Well, that’s starting to look pretty good. We get people from the people reader and feed them to the people console writer. At this point, it becomes pretty trivial to add sorting:
public static void Main(string[] args)
{
var reader = new PersonStorageReader();
var people = reader.GetPeople().OrderBy(p => p.LastName);
var writer = new PersonConsoleWriter();
writer.WritePeople(people);
}
But, if we were so inclined, we could easily look at main and say “I want a magic box that I hand a person collection to and it gives me back a sorted person collection, and that could be stubbed out as follows:
public static void Main(string[] args)
{
var reader = new PersonStorageReader();
var people = reader.GetPeople();
var sorter = new PersonSorter();
var sortedPeople = sorter.SortList(people);
var writer = new PersonConsoleWriter();
writer.WritePeople(people);
}
The same kind of logic could also be applied for the “pre-pend the word ‘President'” requirement. That could pretty trivially go into the console writer or you could abstract it out. So, what about the file logic? I’m not going to bother with it in this post, and do you know why? You’ve created enough magic boxes here — decoupled the program enough — that it practically writes itself. You use an XDocument to pop all Person nodes and read their attributes into First and Last name, skipping any nodes that don’t have both. With Linq to XML, how many lines of code do you think you need? Even without shortcuts, the jump from our stubbed implementation to the real one is small. And that’s the power of this approach — deconstruct problems using magic boxes until they’re all pretty simple.
Also notice another interesting benefit which is that the problems of runtime optimization and exception handling now become easier to sort out. The exception handling and expensive file read operations can be isolated to a single class and console writes, sorting, and other business processing need not be affected. You’re able to isolate difficult problems to have a smaller surface area in the code and to be handled as requirements and situation dictate rather than polluting the entire codebase with them.
Pulling Back to the General Case
I have studiously avoided discussion of how tests or TDD would factor into all of this, but if you’re at all familiar with testable code, it should be quite apparent that this methodology will result in testable components (or endpoints of the system, like file readers and console writers). There is a deep parallel between building magic boxes and writing testsable code — so much so that “build magic boxes” is great advice for how to make your code testable. The only leap from simply building boxes to writing testable classes is to remember to demand your dependencies in constructors, methods and setters, rather than instantiating them or going out and finding them. So if PersonStorageReader uses an XDocument to do its thing, pass that XDocument into its constructor.
But this concept of magic boxes runs deeper than maintainable code, TDD, or any of the other specific clean coding/design principles. It’s really about chunking problems into manageable bites. If you’re even writing code in a method and you find yourself thinking “ok, first I’ll do X, and then-” STOP! Don’t do anything yet! Instead, first ask yourself “if I had a magic box that did X so I didn’t have to worry about it here and now, what would that look like?” You don’t necessarily need to abstract every possible detail out of every possible place, but the exercise of simply considering it is beneficial. I promise. It will help you start viewing code elements as solvers of different problems and collaborators in a broader solution, instead of methodical, plodding steps in a gigantic recipe. It will also help you practice writing tighter, more discoverable and usable APIs since your first conception of them would be “what would I most like to be a client of right here?”
So do the lazy and idealistic thing – imagine that you have a magic box that will solve all of your problems for you. The leap from “magic box” to “collaborating interface” to “implemented functionality” is much simpler and faster than it seems when you’re isolating your problems. The alternative is to have a system that is one gigantic, procedural “magic box” of spaghetti and the “magic” part is that it works at all.
Reaching back into my C++ days, a concept exists called “inline”. “Suggesting inline” is a concept where you tell the C++ compiler that you want to dispense with function call overhead and slam the code in question right into the code stream of the caller. (It’s a matter of suggesting because the compiler might ignore this request during its optimization such as if you decide to rip a hole in space-time by suggesting inline on a recursive method). So, for instance:
inline int multiply(int x, int y)
{
return x*y;
}
int main(int argc, char** argv)
{
int product = multiply(2,5);
}
is effectively transformed into:
int main(int argc, char** argv)
{
int product = 2*5;
}
This is conceptually similar to the concept of macros in C/C++. The idea is simple — you may have some block of code that you want to abstract out for reuse or readability, but you’d prefer the performance to mimick what would happen if you just wrote the code right in the method in question. You want to have your cake and to eat it too. Understandable — I mean, who doesn’t and what’s the point of having cake if you can’t eat it?
In the .NET world of managed languages, we don’t have this concept. It wouldn’t make any sense anyway as our builds generate byte code, which is processor-agnostic. The actual object code is going to be a matter between the runtime and the OS. So we don’t see the option for inline in the sense of runtime performance.
Metrics in OOP
One of the metrics in OOP that I think is a generally fair one is lines of code as a liability. If you have a 20-100 line class then life is good, but as you start creeping toward 300 lines it starts to smell. If it’s over 500 lines, kill it with fire (or, break it up into reasonable classes). The reason for this is a nod toward the Single Responsibility Principle and the concept of code smell. There’s nothing inherently wrong with large classes, per se, but they’re almost always an indication that a bunch of responsibilities are all knotted together in one tightly coupled mess. The same goes for methods on a smaller scale and with smaller scope of responsibility.
So, your personal preferences (and mine) about class/method size notwithstanding, it bears mentioning that smaller and more focused is generally considered better. The result of this is that it’s fairly common to evaluate class and method sizes with fellow developers and see people getting antsy as sizes get larger. On the flip side, it’s common to see people feel good when they keep class sizes small and to feel particularly good when they slay some hulking 5000 line beast and see it divided up into 20 more reasonably sized classes.
Another fairly common metric that I like is the number of parameters passed to a method. Like lines of code and large method/class, parameters trend toward code smell. No parameters is great, one is fine, two is a little noisy, and three is pushing it. More than that and you’ve written a method I want nothing to do with. I think a lot of people feel this way in terms of code they write and almost everyone feels this way when they’re a client (if a method has a bunch of parameters, good luck figuring out how to use it). This is true not only because it’s hard to keep all of the parameters straight but also because a method that needs a whole bunch of things to do its job is likely doing too large a job or too many jobs.
We like our methods/classes small and our parameter lists short.
Static and New: Gaming the System
One way to accomplish these goals is to ensure that your scopes are well defined, factored, and focused. But, if you don’t feel like doing that, you can cheat. If, for instance, you come across a constructor that takes 5 dependency parameters, the best thing to do would be to rework the object graph to have a more cohesive class that needed fewer dependencies. But, if time is short, you can just kick the pile of debris under the bed by having the constructor reach out into the static universe and pull its dependencies from the ether. Now your class has the cologne of a blissfully simple constructor hiding its dependnecy smell, thanks to some static or singleton access. The same thing can be applied with the “new” keyword. If you don’t want to rip holes in the fabric of your object graph with static state and functionality, you can always instantiate your own dependencies to keep that constructor looking slim (this would be identical in concept to having a stateless static factory method).
This trick also applies to cutting down lines of code. If you have a gigantic class, you could always port out some of its state and behavior out into a static class with static state or to an instance class spun up by the beast in question. In either case, lines of code goes down and arguments to public APIs stays the same.
Inline Revisited
Consider the following code:
public class Foo
{
public int GetTotal(int n)
{
int total = 0;
for (int index = 0; index < n; index++)
{
total += index;
}
return total;
}
}
Let’s say that we thought that GetTotal method looked way too long and we wanted to shorten it by kicking parts of it under the bed when company came by to look at the class. What about this:
public class Foo
{
public int GetTotal(int n)
{
return Utils.RunningSum(n);
}
}
This is fewer lines of code, to be sure. We’ve created a static Utils class that handles the dirty work of getting the running sum of all numbers up to and including a given number and we’ve delegated the work to this class. Not bad — we’ve eliminated 5 lines of code from both Foo and GetTotal(). And RunningSum is stateless, so there’s no worry about the kind of weird behavior and dependencies that you get from static state. This is a good thing, right?
Well, no, not really. I’d argue that this is fool’s gold in terms of our metrics. In a very real conceptual sense, Foo has no fewer lines of code than it initially did. Sure, from the perspective of organizing code it does — I’ve separated the concern of RunningSum from Foo’s GetTotal and we might make an argument that this factoring is a good thing (it’d be more interesting in a less trivial example). But from the perspective of coupling in the object graph, I’ve done exactly nothing.
When you call GetTotal(n), all of the same code is going to be executed in either case. All of the same branching will occur, all of the same logic will guide that branching, and all of the same local variables will be declared. From a dependency perspective as expressed in source code, you might as well inline Utils.RunningSum() into GetTotal(). To put this another way, you might as well conclude that Foo and GetTotal() are just as many lines of code as they ever were.
And that’s my larger point here. When your code invokes a static method or instantiates an object, client code calling your stuff has no choice in the matter. If I’m calling Foo’s GetTotal() method, it doesn’t make any difference to me if you call Utils.RunningSum() or just do the work yourself. It’s not as though I have any say in the matter. It’s not as though I can specify a strategy for computing the sum myself.
Now, by contrast, consider this:
public interface IComputeTotals
{
int RunningSum(int n);
}
public class Foo
{
public int GetTotal(int n, IComputeTotals computer)
{
return computer.RunningSum(n);
}
}
Here I have a method that allows me to specify a number and a strategy for totals computing and it returns the computed total. Is this like inline? No, of course not — as the client, I have a lot of control here. Foo isn’t inlining anything because it needs me to specify the strategy.
But what about encapsulation? With the code in the method or abstracted to a hidden collaborator (static or new) the details of computation are hidden from me as the client whereas here they may not be (depending on where/how I got ahold of an instance that implements that interface). To that I say that it’s admittedly a tradeoff. This latter implementation is providing a seam and giving more options to client code whereas the former implementation is hiding more but leaving client code with fewer options. Assuming that any static methods involved are stateless/functional, that’s really the main consideration here – how much to cover up/hide and how much to expose.
I certainly have a preference for inversion of control and an aversion to static implementations both because of my desire for decoupled flexibility and my preference for TDD. But your mileage may vary. All I’d ask of anyone is to make informed decisions with eyes wide open. Metrics/smells like those about parameters and class/method size don’t exist in a vacuum. “Fixing” your 5000 line class by creating a static class and delegating 4800 lines of work to it behind the scenes is not reducing the size of that class in any meaningful sense, and it’s not addressing the bad smell the class creates; you’re just spraying perfume on it and hoping no one notices. The real problem isn’t simply that 5000 lines of code exist in one source file but rather that 5000 lines exist with no way to pry the dependencies apart and swap in alternate functionality.
So when you’re coding and bearing in mind certain heuristics and metrics, think of using static method calls and instantiating collaborators as what they really are — cosmetic ways to move code out of a method. The fact that the code moves to another location doesn’t automatically make it any more flexible, easier to maintain or less smelly. It’s providing additional seams and flexibility that has the effect you’re looking for.
Have you ever started looking for something, let’s say your wallet, and been fairly certain that it wasn’t in your house? You started looking in all of the most likely places and progressed with increasing glumness to less and less likely landing spots. Perhaps you eventually wound up in the crawl space that you hadn’t entered since six months prior to losing your wallet. If you’ve had an experience like that, do you recall a point at which you thought to yourself, “this is completely pointless” and yet you kept doing it anyway?
Even if you can’t recall an experience like this, I imagine that you’ve seen similar stories unfold in countless other places. In politics and government for example, can you count the number of times some matter of policy has proven to be an obvious mistake and it keeps going anyway like an unstoppable juggernaut with resolute phrases like “stay the course” uttered as sober non sequitur? In the end it all seems silly; it seems to be a way to say “we know this is a mistake and we’re going to keep doing it anyway.”
I believe that this mystifying behavior has two main sources of persistence as a mainstay in the human condition. The first is that without careful consideration, we tend toward logical fallacy and this is an example of the fallacy “appeal to consequences” (aka “wishful thinking”). In other words, “I’m going to look for my wallet in the crawlspace because I believe it’s in there since it would be a real bummer if it weren’t.” We tend to double down on our mistakes because we really wish we hadn’t made them.
The second source of this is a truly fascinating study of our motivations as humans for our actions and interactions. A blog I really like called “You Are Not So Smart” defines this as “the Ben Franklin Effect“. I highly recommend a start-to-finish read of this post, and will warn you here that my next paragraph will be a spoiler, so go read it, please.
Oh, hi there. You’re back? Cool! As you now know, the premise of the post is that Ben Franklin figured out a way to turn someone who didn’t like him (a “hater”) into an ally. He asked that person to do him a favor… and it worked. As it turns out, we tend to like people because we do them favors, rather than doing them favors because we like them. The reason for this is that we construct narratives about our motivations and preferences that rationalize our past decisions; “I must like that guy since otherwise me doing him a favor would have been stupid and since I’m obviously not stupid…” (for you rhetoric buffs, this is another logical fallacy called “begging the question”)
David McRaney puts this nicely in his post:
If you live in the Deep South you might buy a high-rise pickup and a set of truck nuts. If you live in San Francisco you might buy a Prius and a bike rack. Whatever are the easiest to obtain, loudest forms of the ideals you aspire to portray become the things you own, like bumper stickers signaling to the world you are in one group and not another. Those things then influence you to become the sort of person who owns them.
What Does this Have to Do With Code Comments?
I’m glad you asked. The answer is that for me, and I suspect for a lot of you, putting comment headers above all methods, classes, fields, etc is a doubling down on a behavior that probably makes no sense, a staying of the course, and a thing that we like doing because we did it.
When I was in college, I don’t think I ever commented my code. At that time, as I recall, we were young and naive and wore code unreadability on our sleeves as a badge of honor; anyone who could execute an entire complex looping sequence in the condition and increment statements of a for loop was a total ninja, after all. When I left school and went to the professional world, I was confronted with people who wrote various kinds of code, but one was a Linux kernel hacker that I truly respected and I noticed that he dutifully put comment blocks above all of his methods. Included in these were descriptions of what the method did, what it returned, what the arguments were, what invariants it preserved, and a journal log of edits made to it.
This was an epiphany for me. All of those guys with their non-commented, zany for loops were amateurs. Real programmers did that kind of ninja stuff and then put a nice, clear comment on top that explained to mere mortals how awesome they were. I was hooked. I wanted to be that guy and so I became a guy that wrote those kind of comments above methods and everywhere else. I didn’t do this halfway — I was the kind of guy that went the extra mile to thoroughly document everything.
Years, jobs and programming languages later, this practice persisted, long outlasting my opinion that it was a badge of honor to cram multiple execution statements into loop iterators and furrow people’s brows with inscrutable pointer arithmetic (having gravitated toward code that was clear, readable and maintainable). A funny thing started happening to me, though. I started reading blog posts like this one, where people said things like:
What makes comments so smelly in general? In some ways, they represent a violation of the DRY (Don’t Repeat Yourself) principle, espoused by the Pragmatic Programmers. You’ve written the code, now you have to write about the code.
…
First, where are comments indeed useful (and less smelly)? If you are writing an API, you need some level of generated documentation so that people can use your API. JavaDoc style comments do this job well because they are generated from code and have a fighting chance of staying in sync with the actual code. However, tests make much better documentation than comments. Comments always lie (maybe not now, but on a long enough timeline, all comments will become outdated). Tests can’t lie or they fail. When I’m looking at work in progress on projects, I always go to the tests first. The comments may or may not be there, but the tests define what’s now done and working.
The first few times I read or heard arguments like this, I dismissed them as the work of people too lazy to write comments. As I started reading more such posts… I simply stopped thinking about them. It was a silly argument — after all, I had spent countless hours throughout my career writing comments and I wouldn’t spend all that time doing something that might not be a good idea. A friend of mine likes to satirize this type of reasoning by saying “if the facts don’t fit the dogma, the facts must be discarded,” and while the validity of comments in code is truly a matter of opinion rather than fact, the same logic applies to my reasoning. I discarded an argument because not discarding it would have made me feel bad about prior decisions that I had made.
Nagging Doubts and Real Conclusions
In spite of my best efforts simply to ignore arguments that I didn’t like, I was plagued by cognitive dissonance on the matter. Why would otherwise rational and obviously quite intelligent people say such ridiculous things about my beloved exhaustive commenting? Determined to reconcile these divergent beliefs, I decided one day while documenting code smells on a wiki for a former employer that they didn’t really mean that comments were a code smell in general – they must only be talking about stupid comments like putting “//this is a for loop” above a for loop. Yeah, of course. That’s what they meant.
But, it wasn’t. In spite of myself, I had read and processed the arguments. I knew that comments weren’t DRY and that they represented duplication of the logic of the code. I knew that heavily commented code tended to mask poor naming choices and unintuitive abstractions. I knew that comments, even XML/Javadoc comments for public-facing APIs tended to start out helpful and accurate and then to rot over time as different people took over maintenance and didn’t think to change them along with the code. Heck, I’d experienced that one firsthand with inaccuracies and inconsistencies piling up as surely as they do in a non-normalized database.
So, eventually I was forced to see the light and admit that it was possible and even probable that my efforts had been a complete waste of time over the years. Sound harsh? Well, here’s the thing. How many of those comments that I wrote are still floating around in some code base somewhere, not deleted? And of that percentage, how many are accurate? I’m guessing the percentage is tiny — the comments are dust in the wind. And before they blew away, what are the odds that anyone cared enough to read them? What are the odds that anyone cares that “ebd changed a char* to a int* on 4/15/04”?
So, I’ve stopped viewing comments as obligatory. I write them now with a sort of “when in Rome” sentiment if others do it on the code I’m working on. After all, I have plenty of practice with it. But my real preference, in light of my focus on abstractions, is now to have the attitude that I am forbidden from writing comments. If I want to make my code usable and understandable, I have to do it with tight abstractions, self-documenting code and making bad decisions impossible. Is that a lofty goal? Sure. But I think it’s a good one. I’ve gone from viewing comments as obligatory to viewing them as indicative of small design failures, and I’m content with that. I think that all along part of me viewed commenting my methods as redundant and I’ve finally rid myself of the cognitive dissonance.
I was talking with someone about possible approaches to an API the other day. He asked me if I’d favor a method that took a parameter and had a return value or if I’d prefer a method that was void and took two parameters but with one as a ref parameter. My immediate, knee-jerk response was “I don’t like ref parameters one bit, so I’d prefer the former.” He looked at me with a bit of surprise and then kind of shrugged as if to say, “whatever, weirdo” and went with former. To him it was six or half a dozen.
This made me wonder whether I’m being dogmatic and rigid in the way I approach software, so I spent some time thinking and reading about ref parameters and their cousin, the out parameter. For those of you not acquainted with C#, both of these are ways of passing parameters into methods with the main difference being whether the called method must change the value passed in (out) or whether it can optionally do so (ref). Another way of thinking of this is that you would use out when the initial value of the parameter does not matter and ref when it does.
Here is what the syntax looks like:
int myValue;
//This wouldn't compile, so don't do it: AddOneToValue(ref myValue);
SetValueToTwelve(out myValue);
Console.WriteLine(myValue);
AddOneToValue(ref myValue);
Console.WriteLine(myValue);
In this case, you will see printed 12 and then 13 on the next line, assuming the methods do what they say they will. (This gets even screwier if you’re a Java programmer, in which case you need to create a wrapper class or use int[] or something since ints are immutable and parameters are always passed by value even if objects are accessed on the heap by reference).
What Does the Internet Think?
Usually when I react strongly and then want to justify my reaction, I go see what others have to say, preferably those I respect. The estimable Jon Skeet has this to say:
Basically out parameters are usually a sign that you want to effectively return two results from a method. That’s usually a code smell – but there are some cases (most notably with the TryXXX pattern) where you genuinely want to return two pieces of information for good reasons and it doesn’t make much sense to encapsulate them together.
In other words, avoid out/ref where you can do so easily, but don’t go massively out of your way to avoid them.
In the answer below his, Ryan Lanciaux raises an interesting point when he says that “[out/ref parameters] basically add side-effects to your code and could be a nightmare when it comes to debugging.”
So, two take-aways here are that having a method return two distinct results is a code smell and that method side effects tend to be a problem. On the flip-side of the argument mainly seems to be somewhat of a pragmatic, duct-tape programming kind of argument that sometimes the purist approach just isn’t worth the effort and potential awkwardness. The iconic example of using out parameters is the T.TryParse(string, out t) methods in C# (which I really don’t like, but I’m trying to be suspend my bias for the sake of investigation).
Next up, here’s what, well, MSDN has to say in explanation of a static analysis design warning that they raise entitled “Avoid out parameters”:
Passing types by reference (using out or ref) requires experience with pointers, understanding how value types and reference types differ, and handling methods with multiple return values. Also, the difference between out and ref parameters is not widely understood.
…
Although return values are commonplace and heavily used, the correct application of out and ref parameters requires intermediate design and coding skills. Library architects who design for a general audience should not expect users to master working with out or ref parameters.
There’s a certain irony to this, but I definitely understand the point. The irony is that the same outfit that put these features into the language raises a warning telling you not to use them. Don’t get me wrong — I understand that this is akin to making computers that can be taken apart while warning users not to do so, since many of them aren’t really qualified — but the irony amuses me nonetheless. It’s also interesting that MSDN seems to think that pointers and reference vs value are “intermediate” language features. Perhaps the fact that I cut my teeth on C and C++ as a wee programmer is showing, but… seriously? Intermediate?
At any rate, the consensus on the subject that I’ve seen at these and a variety of other blogs and stack overflow posts seems to be that out/ref parameters are generally to be avoided… except when they’re sort of unavoidable, either because of interop concerns or because you really want (need?) a function that returns two or more things.
Do One Thing
But isn’t a function that does two things a violation of the Single Responsibility Principle of SOLID fame, applied at the method level? And aren’t out/ref parameters, pretty much by definition, side effects that constitute violations of Command-Query Separation, a paradigm in which methods that mutate state (commands) are separated from methods that retrieve information (queries) and ne’er the twain shall meet? I mean, any method that ‘returns’ two values is, well, doing at least two things and any method that mutates object state and kicks back a ref/out parameter is serving as command and query.
But what about methods like the obtuse ones above, SetValueToTwelve() and AddOneToValue()? Those are void methods that mutate only the out/ref parameter. They could be made static and rewritten as int Return12() and int AddOneToValue(int value) without altering their purpose or effect. So, they’re not really violating SRP or CQS, right? They’re just slightly more awkward versions of familiar APIs.
But that really hits home with me. Why do I want something that’s either slightly more awkward or a violation of some very helpful clean coding practices? I mean, we’re not really shooting for the moon there are we, if something is at best somewhat awkward and at worst a code smell. Why do it at all? Methods should pick one thing and do it (or return it) and should do it as non-awkwardly as possible.
What If Light Switches Worked This Way?
I like to think of our task as programmers in terms of abstractions (in case you hadn’t caught my series on abstractions, feel free to click that tag and see me harp on it in a lot of posts). One easy abstraction for me to relate to the world of programming is turning a light switch on and off. This is a classic case of a class Light that exposes a command SetOnValue(bool) and exposes a readonly property, bool IsOn. So, as in the real world, I move the switch up or down and then separately observe the results. (Let’s ignore for argument sake that it might be better to model “Switch” and “Light” as separate entities).
This is a great example of Command-Query Separation. Toggling the light on or off is a command, and looking at the light to see whether or not it’s on is a query. But, let’s blur the lines for a moment and rewrite this so that there is no readonly “IsOn” property. Instead, SetOnValue(value) will return a boolean indicating whether the light is on or not. So now, we have a switch that also acts as the thing that tells us whether or not it’s on — our wall switch also just became a light. Now, when you toggle the switch, the switch itself glows to give you feedback. Weird.
But, it gets weirder. Instead of having our SetOnValue() function return a bool, let’s feed it a ref parameter. On the way in, we’ll indicate the value we want, and on the way out, it will indicate the value that we’re going to get. In terms of modeling the real world, ref parameters are kind of mind-blowing. You hand some external thing a piece of yourself and it alters that for you. So now, we have a light switch that I flip on with my hand, and indicates the success of that operation by modifying my hand – let’s say burning it. So, I flip the switch on, and if it works, the blazing hot bulb in the switch burns my hand (but gives off no light, since the burn is how I know it’s working). So, there you have it – a strange path from a light switch that turns on lights to one that simply injures me.
I realize that this metaphor is a touch strained, but here’s the thing: ref and out parameters are weird and counter-intuitive. It’s hard for them not to strain a metaphor that they’re involved in. Anything I’m handed in life could be represented conceptually as thing or a collection of things. Any action I take in life could be represented as a void method with 0 or more parameters. But what is a ref paramter? Where in life do I take something, set it to a way I want it, give it to something, and then have it given back to me differently? Maybe as part of an assembly line or in some weird, Rube-Goldberg kind of process, but this is hardly how normal interactions are modeled.
Ref and out are leaky abstractions in terms of code readability. They reek of code. Code involving these things doesn’t read like simple, well written prose or a nice story — it reads like a procedural construct such as the withholding worksheet for your paycheck deductions. So, like dealing with the IRS, why don’t we avoid that if we can? And, I think you’d be pretty hard-pressed to argue that you can’t.
I am Erik Dietrich, founder of DaedTech. I’m a former programmer, architect, and IT management consultant, and current founder and CEO of Hit Subscribe.