In Search of the Perfect Code Review
One thing that I tend to contemplate from time to time and have yet to post about is the idea of code reviews and what constitutes a good one. I’ve worked on projects where there was no code review process, a half-hearted review process, an annoying or counter-productive code review process, and a well organized and efficient code review process. It’s really run the gamut. So, I’d like to put pen to paper (finger to keyboard) and flesh out my ideas for what an optimal code review would entail. Before I can do that, however, I think I need to define some terms up front, and then identify some things about code reviews that I view as pitfalls, counter-productive activities or non-useful activities.
What is a Code Review?
According to the wikipedia entry for a code review, there are three basic types: the formal review, pair programming, and the lightweight review. I’ll add a fourth to this for the sake of pure definition, and call that the “automated review”. The automated review would be use of one or more static analysis tools like FX Cop or Style Cop (the wiki article includes someone using tools like this on the code of another developer, but I’m talking strictly automated steps like “you fail the build if you generate Style Cop warnings”). Pair programming is self explanatory for anyone familiar with the concept. Lightweight reviews are more relaxed and informal in the sense that they tend to be asynchronous and probably more of a courtesy. An example of this is where you email someone a source code file and ask for a sanity check.
The formal review is the one with which people are probably most familiar if code review is officially part of the SDLC on the project. This is a review where the developer sits in a room with one or more other people and presents written code. The reviewers go through in detail, looking for potential bugs, mistakes, failures to comply with convention, opportunities for improvement, design issues, etc. In a nutshell, copy-editing of code while the author is present.
What I’ll be discussing here mainly pertains to the formal review.
What I Don’t Like
Here are some things that I think ought to be avoided in a code review process.
1. Failing to have code reviews
This is important in the same way that QC is important. We’re all human and we could all use fresh eyes and sanity checks.
2. Procedural Code Review: The Nitpick
“You made that camel case instead of Pascal case.” “You could be dereferencing null there.” In general, anything that a static analysis tool could tell someone a lot faster and more accurately than a room full of people inspecting code manually. Suresh addresses this idea in a blog post:
I happen to see only “superficial” reviews happening. By superficial, I mean the types where you get review comments like, “You know the documentation for that method doesn’t have the version number”, or “this variable is unused”, etc.
“Superficial” is an excellent description as these things are generally trivial to identify and correct. It is not an effective use of the time of the developer or reviewers anymore than turning off spell check and doing it manually is an effective use of authors’ and editors’ time. There are tools for this — use them!
3. Paranoid Code Review
This is what happens when reviewer(s) go into the review with the notion that the only thing standing between a functioning application and disaster is their keen eye for the mistakes of others. This leads to a drawn out activity in which reviewers try to identify any possible, conceivable mistake that might exist in the code. It’s often easily identified by reviewers scrunching their noses, concentrating heavily, pointing, or tracing code through control flow statements with their fingers on the big screen while the developer twiddles his thumbs.
Again, there’s a tool for this. It’s called the unit test. It’s not flawless and it assumes a decent amount of coverage and competence from the unit tests, but if executed properly, the unit tests will express and prove corner case behavior far better than people on too little sleep staring at a screen and trying to step through the call stack mentally. This mental execution is probably the least reliable possible way of examining code.
4. Pure Gatekeeper Code Review
This is less of an individual property of code review, but more a property of the review process. It’s where you have a person or committee in the department that acts as the Caesar of code, giving thumbs up or thumbs down to anything that anyone submits. Don’t get me wrong — the aim of this makes sense. You want somebody that’s doing a sanity check on the code and someone who more or less has his or her finger on the pulse of everything that’s happening.
The issue that occurs here is a subtle one that results from having the same person or people reviewing all code. Specifically, those people become the gatekeepers and the submitters become less concerned about writing good code and innovating and more principally concerned with figuring out how to please the reviewer(s). Now, if the reviewers are sharp and savvy, this may not be a big deal, though discouraging new ideas and personal growth is always going to have some negative impact. However, if the reviewers are not as sophisticated, this is downright problematic.
This is relatively easily addressed by rotating who performs code reviews or else distinguishing between “suggestion” and “order”. I’ve participated in review processes where both of these mitigating actions were applied, and it’s a help.
5. Tyrant Gatekeeper
In the previous example, I mentioned a hypothetical gatekeeper reviewer or committee with ultimate authority, and this is a subset of that. In this case, the reviewer(s) have ultimate yes/no authority and are very heavy (and possibly even combative or derisive) with the “no” option. Where the previous example might stifle innovation or developer growth, this creates bottlenecks. Not only is it hard to get things approved, but developers will naturally start asking the reviewer(s) what to do at every turn rather than attempting to think for themselves.
In essence, this creates a state of learned helplessness. Developers are more concerned with avoiding negative feedback at the code reviews than learning, doing a good job, or becoming able to make good decisions based on their own experience. As a result, the developers don’t really make any decisions and ask the reviewer(s) what to do at every step, waiting until they have time, if necessary. The review(s) become a bottleneck in the process.
I have not personally witnessed or been subject to this form of code reviews, but I have heard of such a thing and it isn’t difficult to imagine this happening.
6. Discussion Drift
This occurs during a code review when a discussion of the specific implementation gets sidetracked by a more general discussion of the way things ought to be. Perhaps the code is instantiating an object in the constructor, and a reviewer recommends using dependency injection instead. From here, the participants in the review being to discuss how nice it would be if the architecture relied on a IOC framework instead of whatever it is at the moment.
That’s both a valid and an interesting discussion, but it has nothing to do with some developer checking in implementation code within the framework of the existing architecture. Discussions can have merit and still not be germane to the task at hand.
7. Religious Wars
This occurs during a code review in the following context:
Reviewer 1: You didn’t put curly brackets around that one liner following your if statement. Please add them.
Reviewer 2: No, don’t do that. I hate that.
Reviewer 1: No, it’s better. That way if someone changes the code and adds another statement…. etc.
Reviewer 2: Studies have shown….
Review-ee: Uh, guys….
And so on and so forth. Code reviews can easily devolve into this sort of thing over purely or nearly-purely subjective matters. People get very entrenched about their own subjective preferences and feel the need to defend them. We see this from political affiliation to rooting for sports teams. Subjective matters of preference in code are no different. Neither side is likely to convince the other during the scope of the review, but what is quite likely, and probably certain, is that time will be wasted.
If matters like that are part of the coding standards policy on the project, than it’s an open and shut case. If they’re not, they’re better left alone.
So, How Does a Good Code Review Go?
Having explained what I don’t like in a code review, I’ve provided some context for what I do find helpful. I’m going to outline a procedure that is simply an idea. This idea is subject to suggestions for improvement by others, and ongoing refinement from me. Also, this idea is geared toward the gatekeeper scenario. I may make another post on what I perceive to be the most effective method for voluntary(ish), peer-conducted code reviews where the reviewer(s) are not approval gate-keepers.
- Developer finishes the pre-defined unit of code and is ready to have it reviewed for promote/commit.
- Developer runs static analysis tools (e.g. StyleCop, FXCop, Code Contracts, NDepend, etc) with configured rule-sets, correcting any errors they uncover.
- Once no static-check violations are present, developer notifies reviewer(s) of desire for code review.
- Reviewers run static analysis tools asynchronously and reject the request if any rules are violated.
- Reviewers examine the code for obvious mistakes not caught by static analysis and/or developer unit tests, and write unit tests that expose the deficiency. (Alternatively, they can run something like Pex)
- Developer makes reviewer unit tests pass or else convinces reviewer(s) why they don’t need to.
- With all unit tests passing, and all parties familiar with the code, a review meeting is setup (meeting can be skipped for smaller/less crucial code deliveries).
- Meeting proceeds as follows:
- A mediator who is neither developer nor reviewer is asked to attend to keep the meeting focused and on track.
- Reviewers point out something praiseworthy about the submitted code (cheesy, perhaps, but important for starting off in the spirit of cooperation)
- Reviewers examine code for redundancy (is anything copy/pasted, defined in many places, etc)
- Reviewers examine the code for usable API, perhaps by implementing classes in a sandbox, to highlight shortcomings, unintuitive interactions, weird couplings, etc
- Reviewers check for architectural consistency — does the class implement a base class that it should, duplicate the function of some other class in the suite, seem to be in the wrong location, etc.
- Reviewers perform a dependency analysis — what new dependencies does this introduce? Cyclical, global, temporal, etc. Are these dependencies acceptable?
- Reviewers analyze for code smells and anti-patterns.
- Reviewers compile a list of suggested changes for the developer.
- Meeting adjourned.
- Developer makes suggested changes that he agrees with.
- Developer appeals changes with which he doesn’t agree. This could involve the reviewer(s) providing “proof”, for example if they say that the developer shouldn’t do something because of X negative consequence, they should demonstrate that consequence somehow. This appeal could be resolved via proof/demonstration or it could go to a third party where the developers and reviewers each state their cases.
- Any suggested changes and the results of any appeals are then promoted/committed.
This process naturally addresses (1) and (2) of the things that I don’t like in that you’re having a code review and getting the procedural, easy stuff out of the way offline, prior to meeting. (3) is made more difficult by the fact that the reviewer(s) are given the opportunity to write unit tests that expose the badness about which they might be paranoid. (4) and (5) are addressed by the appeal process and the general concept that changes are suggestions rather than decrees. (6) and (7) are addressed by the mediator who has no skin in the game and will probably have a natural tendency to want to keep things short and sweet.
One drawback I can see to what I’m proposing here is that you could potentially undercut the authority of the reviewer if the person doing the reviews is, say, the most senior or high ranking person. Perhaps people would want that role a bit less if they could be officially second guessed by anyone. However, I think that creates a nice environment where good ideas are valued above all else. If I were in that role myself, I’d welcome the challenge of having to demonstrate/prove ideas that I think are good ones before telling others to adopt them. In the long run, being steered toward that kind of rigor makes you better at your craft. I also think that it might be something of a non-issue, given that people who wind up in these types of leadership and senior roles are often there based on the merit of their work and on successfully “proving” things over and over throughout the course of a career.