Why Are Public Fields No Good, Again?
Today, a developer on my team, relatively new to C# and some of the finer points of OO, asked me to take a look at his code. I saw this:
public class Address
{
public string Street1;
public string Street2;
public string City;
public string State;
public string Zip;
}
How would you react to this in my shoes? I bet like I did–with a quick “ooh, no good…try this:”
public class Address
{
public string Street1 { get; set; }
public string Street2 { get; set; }
public string City { get; set; }
public string State { get; set; }
public string Zip { get; set; }
}
Busy and pleased with myself for imparting this bit of wisdom, I started to move on. But then I realized that I was failing at mentoring. Nothing annoyed me more as a junior developer than hearing something like, “That’s just the way you do it.” And there I was saying it.
The creatures outside looked from pig to man, and from man to pig, and from pig to man again; but already it was impossible to say which was which.
I stopped at this point and forced myself to talk about this–to come up with a reason. Obviously, I hadn’t been thinking about the subject of why public fields are icky, so slick terms like “encapsulation” and “breaking change” weren’t on the tip of my tongue. Instead, I found myself talking sort of obliquely and clumsily explaining encapsulation in terms of an example. Strangely, the first thing that popped into my head was threading and locking the field to make it thread-safe. So I said something along the lines of this:
With a public field, you simply have a variable that can be retrieved and set. With a property, you’re really hiding that variable behind a little method, and you have the ability to centralize operations on that variable as needed rather than forcing them on everyone who uses your code. So imagine that you want to make these fields thread-safe. In your version, you force all client code that uses your class to deal with this problem. In the property version, you can handle that for your clients so that they needn’t bother.
Yeah, things are always more awkward off the cuff. So I ruminated on this a bit and thought I’d offer it up in blog post format. Why not have public fields (aside from random thoughts about threading)?
The Basic Consideration: Encapsulation
As amused by my off-the-cuff threading example as I am, it does drive at a fundamental tenet of OOP, which is class-level encapsulation. If I’m writing a class, I want to expose a set of public functionality (methods and properties) that describe what an instance of the class will do while hiding how it will do it. If the address class is in some other assembly and I don’t decompile or anything, as a client, I have no idea if the City property is auto-implemented, if it wraps a simple property, or if all kinds of magic happens behind the scenes.
I know (hopefully) that if I set the City property to some value and then read it back, I get that value. But beyond that, I dunno. Maybe it implements lazy loading from some persistence store. Maybe it tracks all of the things you’ve set City to throughout its entire lifetime. I don’t know, and I don’t want to know if it isn’t part of the public API.
If you use public fields, you can’t offer consumers of your code that blissful ignorance. They know, ipso facto. And, worse, if they want any of the cool stuff I mentioned, like lazy loading or tracking previous values, they have to implement it themselves. When you don’t encapsulate, you fail to provide any kind of useful abstraction to me. If all you’re giving me is a field-bag, what use do I have for an instance of your class in mine? I can just have my own string variables, thank you very much, and I will hide them from my clients.
The Intermediate Consideration: Breaking Changes
There’s a more subtle problem at play here as well. Specifically, public properties and public fields look and seem pretty similar, if not identical, but they’re really not. Under the hood, a property is a method. A field is, well, a field. If you have a property and you decide you want to change its internal representation (going from something complex to simple property or automatic property), no harm done. If you want to switch between a field and a property though, even the trivial switch I showed above, there be dragons–especially going from field to property.
It seems like a pretty obvious case of YAGNI to start out with a field and move to a property if you need it, but things start to go wrong. Weird and subtle things. And they go wrong on you when you least expect them. Maybe you’re using the public members of the Address class as part of some kind of reflection. Well, now things are broken because properties and fields are completely different here. Perhaps you have a field that you were using as an out parameter somewhere. Oops. And, perhaps most insidiously of all, just changing a field to a property will cause runtime exceptions in dependent assemblies unless recompiled.
Let me put that another, scarier way. Let’s say that I write something called Erik’s Awesome DLL and I distribute it to to the world, where it gains wide adoption. Let’s also say that I created a bunch of public fields that the world uses as part of my DLL’s API. And finally, let’s say that in vNext, I decide that I want some encapsulation after all. Maybe I want to implement thread-safety. I implement and publish vNext to Nuget, and you update accordingly. The next time you hit F5, your application will blow up, and it will continue to blow up until you recompile. It may not sound like a big deal, but you’re definitely going to annoy users of your code with stuff like this.
So beware that you’re casting the die here more quickly and strongly than you might think. In most cases, YAGNI applies, but in this case, I’d say that you should assume you are going to need it. (And anything that stops people from using out parameters is good in my book)
The Advanced Consideration: Tell, Don’t Ask
I’ll leave off with a more controversial and definitely the most philosophical point. Here’s some interesting reading from the Pragmatic Bookshelf, and the quote I’m most interested in here describes the concept of “Tell, Don’t Ask.”
[You] should endeavor to tell objects what you want them to do; do not ask them questions about their state, make a decision, and then tell them what to do.
I’m calling this controversial because the context in which I’m mentioning it here is a bit strained. Address, above, is essentially a data transfer object whose only purpose is to store data. It will probably be bubbled up from somewhere like a database and make its way onto something like a form, with perhaps no actual plumbing code that ever does ask it anything. But then again, maybe not, so I’m mentioning it here.
Public fields are the epitome of “ask.” There’s no telling but not asking–they’re all “ask and tell.” Auto-properties are hardly different, but they’re a slight and important step in the right direction. Because with a property, awkward as it may be, we could get rid of the getter and leave only the setter. This would be a step toward factoring to a (less awkward) implementation in which we had a method that was setting something on the object, and suddenly we’re telling only and asking nothing.
“Tell, don’t ask” is really not about setting data properties and never reading them (which is inherently useless), but about a shift in thinking. So in this example, we’d probably need one more step to get from telling Address what its various fields are only to telling some other object to do something with that information. This is getting a little indirect, so I won’t belabor the point. But suffice it to say that code riddled with public fields tends to be about as far from “tell, don’t ask” as you can get. It’s a system design in which one piece of code sets things so that another can read it rather than a system design in which components communicate via instructions and commands.
So I feel clean and refreshed now. I didn’t shove off with “I’m in charge and I say so” or “because that’s just a best practice.” I took time to construct a carefully reasoned argument, which served double duty of keeping me in the practice of justifying my designs and decisions and staying sharp with my understanding of language and design principles.
I always think of things in terms of language constructs.
If
someone walked up to you and said “Customer Address” there’s an implied
question, and that person wants the customers address.
In the
property version the person walks up and says “Customer Get Address.”
There is no implied question there. They’re giving a command.
“Tell, don’t ask” is like saying don’t be that insecure person that puts a question mark at the end of all your sentences.
Wow something murdered my formatting there…
I like that way of putting it a lot. I’m going to keep that in my head as I’m writing code and think about how frequently or infrequently what I’m doing sounds like it should end in a question mark.
(Dunno about the formatting — the vagaries of Disqus, I suppose)