You’re Doin’ It Wrong
I was chatting with someone the other day about some code that he had written, and he expressed that he wished he had done it another way. He said that there was a bunch of duplication and that, while he wasn’t sure what he should have done, he didn’t think it seemed quite right. I took a look at the code and was inclined to agree, though I could think of how I might have approached it.
This led me to think a little more broadly about writing code that you feel not quite right about. One might simply call this a “code smell”, but I think a distinction can be made. I personally associate code smells more with code from which you are somewhat removed, either because someone else wrote it or because you wrote it some time back. That is, you wouldn’t write code that “smelled” to you if it smelled as you were writing it (hopefully).
So, what I’m documenting here is not so much ways to look at code and “smell” that something is wrong, but rather a series of heuristics to keep in mind during development to know that you’re probably doing something wrong. Code smells are more macroscopic in nature with things like “parallel inheritance hierarchy” and “anemic domain model”. These are practices that, as I’ve learned from experience, seem to lead to bad code.
Typing the same thing again and again
The incident that made me think of this involved a piece of code that looked something like this:
public class SomeViewModel
{
int CustomerId1 { get; set; }
string CustomerName1 { get; set; }
int CustomerId2 { get; set; }
string CustomerName2 { get; set; }
int CustomerId3 { get; set; }
string CustomerName3 {get; set; }
....
public void GetCustomers()
{
CustomerId1 = _SomeService.GetCustomer(1).Id;
CustomerName1 = _SomeService.GetCustomer(1).Name;
...
}
....
}
Now, there are a few things going on here that could certainly be improved upon, but the first one that jumps out to me is that if you’re appending numbers to fields or properties in class, you’re probably doing something wrong. We could certainly take our IDs and names and turn them into arrays, or lists or collections and iterate over them. But, even doing that, you’d have a bunch of parallel loops:
public class SomeViewModel
{
List CustomerIds { get; set; }
List CustomerNames { get; set; }
public void GetCustomers()
{
CustomerIds = _SomeService.GetCustomers().Select(c => c.Id);
CustomerNames = _SomeService.GetCustomers()Select(c => c.Name);
}
}
Each place that you want to do something with customer IDs, you’ll also have to do it with names. This gets particularly ugly if you decide you want to sort the lists or something like that — now you have to make sure the lists are rearranged in order – you take on responsibility for guaranteeing that your multiple list implementation remains consistent. This is repetitive still. Better to create a customer model struct with properties name and ID or something, and expose a list (or observable collection or whatever) of that.
The heuristic I can offer here is that if you’re typing the same thing over and over again, you’re probably doing it wrong. This is an offshoot of the DRY (Don’t Repeat Yourself) principle, but ironically, I think it bears some repeating here. DRY doesn’t simply mean that you shouldn’t copy and paste code, which is what many people remember. It doesn’t even mean that you shouldn’t duplicate code. Dave Thomas calls it something “much grander”:
DRY says that every piece of system knowledge should have one authoritative, unambiguous representation. Every piece of knowledge in the development of something should have a single representation. A system’s knowledge is far broader than just its code. It refers to database schemas, test plans, the build system, even documentation.
There should be no duplication of knowledge in the project, period. As such, typing the same thing over and over again has the whiff of duplication, even if you aren’t copy/pasting. If you perform a series of operations on customer ID 1 and then customer name 1, and then customer ID 2, and then customer Name 2, etc, you’re repeating that operation over and over again, albeit with a nominally different target. But, what if in the database or lower layers of memory, I switch customers 1 and 2? The 1 operation is then done on 2 and vice versa. These operations are interchangeable – we have duplicated knowledge. And, duplicated knowledge eventually leads to things getting out of sync as the system evolves, which means maintenance headaches, code rot, and defects.
Making a private member public and/or an instance method static
Let’s say that we have the following classes:
public class Customer : INotifyPropertyChanged
{
private int _id;
public int Id { get { return _id; } set { _id = value; NotifyChange(() => Id); }
private string _name;
public string Name { get { return _name; } set { _name = value; NotifyChange(() => Name); } }
//PropertyChange Elided
}
public class CustomerProcessor
{
private int _processedCount;
public void ProcessCustomers(IEnumerble customers)
{
if(customers != null)
{
customers.ToList().ForEach(cust => this.DoSomethingToCustomer(c));
_processedCount += customers.Count();
}
}
public bool IsFinished()
{
return _processedCount > 12;
}
}
Now, let’s say that the GUI is bound directly to the customer object, and that a new requirement comes in – we want to display customer stuff in black, unless most of the customers have been processed (say, 3/4s of them), in which case we want to display them in red — for all customers. Let’s also assume that we have some kind of value converter that converts a boolean to red/black for binding. Here’s how someone might accomplish this:
public class Customer : INotifyPropertyChanged
{
private int _id;
public int Id { get { return _id; } set { _id = value; NotifyChange(() => Id); }
private string _name;
public string Name { get { return _name; } set { _name = value; NotifyChange(() => Name); }
public bool IsRed { get { return CustomerProcessor._ProcessedCount > 9; } }
}
//PropertyChange Elided
}
public class CustomerProcessor
{
public static int _processedCount; //This is now a public global variable
public void ProcessCustomers(IEnumerble customers)
{
if(customers != null)
{
customers.ToList().ForEach(cust => this.DoSomethingToCustomer(c));
_processedCount += customers.Count();
}
}
public bool IsFinished()
{
return _processedCount > 12;
}
}
Don’t do that! That might be the quickest and dirtiest way to accomplish the task, but it sure is ugly. We break encapsulation by making an internal detail of the processor public, we create a global variable to get around the issue that the customer has no reference to the processor, and we’ve only created a half-way solution because the customer still has no mechanism for knowing when the processor’s count passes the magic number 9 – we’d need to expose a global event or callback so that the customer would know when to raise property changed on its color. This is also a concrete and circular coupling – CustomerProcessor and Customer now know all about one another and might as well be one static, global mashup.
Private members are private for a reason — they’re implementation details hidden from the code base at large. By having the magic number 12 and the processed count hidden in the processor, I can later change the scheme by which the processor evaluates “finished” without breaking anything else. But, the way I’ve mangled the code here, the requirement that “red” be 3/4s of the finished count is now leaked all over the place. If I go to the processor and decide that finished count is now 400 (or better yet, configurable and read from persistence), our new requirement is no longer satisfied, and as the maintenance programmer for CustomerProcessor, I have no way of knowing that (except maybe for the hint that someone exposed my private variable).
Likewise, instance members are instance members for a reason. If I later decide that I want two separate customer processors working in two separate screens, I’m now hosed with these changes. However many processors I create, if any of them process 12 records, they’re all finished.
Far too often, the reason for making privates public or instance members static is convenience and not wanting to think through dependencies. Both of those make the information that you want easier to get at. But, I submit that if this is your solution to get your objects to talk to one another, you’re likely creating a short-sighed and misguided solution. In our example above, exposing an “IsAlmostFinished()” instance member on the processor and then having the GUI bind to something that knows about the customers and their processor is a better solution. It requires a bit more planning, but you’ll be much happier in the long run.
You can’t describe a method in a simple sentence using its parameters and return value
“Too many parameters” is considered a code smell. But, how many is too many? Recently, I stumbled across a method that looked something like this:
public void DoSomething(bool shouldPerform)
{
if(shouldPerform)
{
//A whole bunch of processing
}
}
This method doesn’t have too many parameters at all, I wouldn’t think. It has exactly one parameter, which is basically the programmatic equivalent of the ubiquitous “Are you sure you want to…” message box. But, I couldn’t for the life of me summarize it in a simple sentence, and perhaps not a paragraph. It was something like “If the user passes in true, then find the hardware processor and hardware presenter, and if each of those isn’t null, find the database manager, but if the hardware presenter is null, then create a file, and…”
If you’re writing a method, do a thought exercise. If someone stopped by and said, “whatcha doin?”, could you look up at them and quickly say “I’m writing a method that…”? Or would you say, “go away and don’t break my concentration, or I’ll never remember what this thing was supposed to do”? If it’s the latter, you’re doing it wrong.
We can only process so much information at once. If you’re reading through someone else’s code or your own code from a while ago, methods that are small and focused are easy to understand and process mentally. Methods that have dozens of parameters or do dozens of things are not. It doesn’t matter how smart you are or how good at programming you are – when you encounter something like that, you’ll put your pen to the monitor and start tracing things or take that same pen and start taking notes or drawing diagrams. If you have to do that to understand what a method does, that method sucks.
Consider the following interfaces:
public interface ICustomerDao
{
void DeleteCustomer(Customer customer);
void InsertCustomer(Customer customer);
void UpdateCustomer(Customer customer);
Customer GetCustomerById(int customerId);
IEnumerble GetAllCustomers();
}
public interface ICustomerService
{
bool LogCustomerDetails(Customer customer);
Customer MergeCustomers(Customer customer1, Customer customer2);
Customer FindCustomer(IEnumerable customerForSearch);
}
There are no method definitions at all – just signatures. But, I’d be willing to be that you can already tell me exactly what will happen when I call those methods, and I bet you can tell me in a simple, compact sentence. Wouldn’t it be nice if, when you were coding, you only encountered methods like that? Well, if you think so, do others and your future self a favor, and only define and code them like that.
Defining class behavior for one specific client
Remember that method that I showed above? The one with the “are you sure” boolean parameter? I’m pretty sure that was neither the author’s idea of a joke nor was it quite as silly as it looked. I’d wager dollars to donuts that this is what happened:
public class Unmaintainable
{
public bool IsSomethingTrue() { ... }
public void WayToBig()
{
//....
var myExtracted = new ExtractedFromTooBig();
myExtracted.DoSomething(IsSomethingTrue());
}
}
public class ExtractedFromTooBig
{
public void DoSomething(bool shouldPerform)
{
if(shouldPerform)
{
//A whole bunch of processing
}
}
}
When looked at in this light, it makes sense. There was some concept that a class was too large and that concerns should be separated. Taken as a gestalt, it makes sense, but simply looking at the extracted class, “DoSomething(bool)” looks pretty dumb.
While the effort to refactor to smaller units is laudable, creating an interface that only makes sense for one client is not so good. Your classes should make sense on their own, as units. If you extract a class that seems hopelessly coupled to another one, it’s probably time to reconsider the interface, or the “seam” between these two classes. In the case above, the boolean parameter need not exist. “IsSomethingTrue()” can exist in an if condition prior to invocation of the parameterless “DoSomething()” method. In general, if your class’s public interface doesn’t seem to make sense without another class, you’re probably doing it wrong. (This is frequently seen in “helper” classes, which I mentioned in this post about name smells).
[…] few posts ago, I mentioned a piece of code that contained an API with a quirky method […]