Code Review Through the Years
What does a code review workflow look like, in your mind? Naturally, when I ask this question, you’re picturing life in 2015 with current tooling, work arrangements, and productivity tools. So when I say, “code review” you probably picture a whole lot more than peering at source code in some kind of file editor.
Maybe you’re imagining a multi-national, distributed, agile team working at different times of day. The team communicates using tools like Slack and various flavors of instant messenger (and, of course, the ubiquitous email). They no doubt use a sophisticated source control option, augmented with a variety of application management tools. GitHub comes to mind. And that’s really just the start. Piggybacking on these capabilities for version control and communication, the team probably employs relatively sophisticated heuristics for making sure all committed code gets a look from one or more other people, and they manage this complex communication with an equally sophisticated mechanism for allowing reviewers to zero in on exactly what has changed.
But perhaps the most beautiful thing about being a programmer in this day and age is that these impressive capabilities are pretty seamlessly managed. You can be sitting in London, where you make a series of simple, but cross-cutting, changes to a number of different source files. Then you can leave for the day, at which point a colleague in New York is notified that you have some changes to review. She can take a quick look, realize that the changes in which you’ve renamed a series of methods are extensive but low risk, and give you a thumbs up. She then kicks it over to another colleague in Los Angeles just for a quick second opinion, and delegates to him to make the final call. Once he takes a look and approves, there’s a mechanism to promote the code to a continuous integration environment, build it, run tests on it, and deploy it to production. Your changes are live the next morning when you come in.
We take this for granted, but, really, we’ve come a long way. Imagine what code reviews might have looked like in past eras of programming.
You’ve most likely heard in the vaguest of historical contexts that “programming was once done on punch cards.” No doubt this evokes a mental image of feeding some 3×5 index cards into some steampunk contraption.
I won’t offer a ton of detail (here’s an interesting account of learning how it worked), but I can describe it briefly. The machine into which the cards were fed had an optical sensor that was smart enough to associate sensing light in certain positions with 1s or 0s in some kind of sequence. So, put most simply, you could use a series of holes in a card to represent, say, the bits in a byte. The task for the programmer, then, was to punch holes in the right positions on the cards and then assemble the cards into the right order to constitute an executable program.
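To make the "holes as bits" idea concrete, here is a minimal sketch that follows the simplified description above rather than any real card format (actual card encodings were more elaborate). The names `row_to_byte` and `deck_to_bytes`, and the `O`/`.` notation for punched and unpunched positions, are assumptions made up for this illustration.

```python
def row_to_byte(holes: str) -> int:
    """Read one 8-position row, e.g. '.O..O...', as a single byte.

    'O' is a punched hole (light gets through, read as 1);
    '.' is an unpunched position (read as 0).
    This is a toy model, not a historical card encoding.
    """
    if len(holes) != 8 or set(holes) - {"O", "."}:
        raise ValueError("expected exactly 8 positions of 'O' or '.'")
    value = 0
    for position in holes:
        # Shift what we have so far and append the next bit.
        value = (value << 1) | (1 if position == "O" else 0)
    return value


def deck_to_bytes(cards: list[str]) -> bytes:
    """Read the cards in the order they were stacked."""
    return bytes(row_to_byte(card) for card in cards)


deck = [".O..O...", ".O..O..O"]
program = deck_to_bytes(deck)
```

With this toy deck, `program` is `b"HI"`. Feed the same cards in reverse order and you get `b"IH"` instead, which hints at why assembling the deck in the right order mattered so much.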
Thus your workflow would be to write (punch) the entirety of your application and then take it somewhere to be compiled and run. Of course, you’d probably have to wait in a queue for anywhere from an hour to a day before this could happen, but after this wait, you’d get to learn whether it compiled and, if so, whether it worked.
Code review in this context would be quite different. On the plus side, all of that downtime would leave you with little to do but pore over your deck of cards, triple and quadruple checking your logic, syntax, and even characters. So there was, doubtless, a lot of review. But the frustration level with that epic feedback loop must have been maddening. And you’re not checking code for maintainability — you’re checking it to make sure you don’t lose a workday to a misplaced semicolon.
Up until the 1970s, resource constraints made time sharing on mainframes and computing devices an absolute must. This was why schemes like the punched cards were necessary; computing resources were far too scarce to tie up a mainframe while waiting for someone to type “GOTO” into an editor. Humans could do all of that work up front and let the computer worry only about execution.
But that started to change during the 1970s with the advent of the microcomputer. As people started buying and building on top of these things, they would hand code in assembly language because of resource constraints and the relative dearth of out-of-the-box services. If you’ve never seen it, here’s what a bit of assembly code looks like:
    mov eax, $x
    cmp eax, 0
    jne end
    mov eax, 1
end:
    inc eax
    mov $x, eax
This code checks whether a variable equals zero and, if so, sets it to 1. Either way, it then increments the variable and stores the result back. It does this with a conceptual use of GOTO (“jne” means “jump if not equal”).
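For comparison, here is roughly how the same logic might read in a higher-level language. This is only a sketch: the function name `bump` is invented for illustration, and the comments map each line back to the assembly instruction it stands in for.

```python
def bump(x: int) -> int:
    """Hypothetical high-level equivalent of the assembly snippet."""
    eax = x          # mov eax, $x
    if eax == 0:     # cmp eax, 0 / jne end (jump past the next line if non-zero)
        eax = 1      # mov eax, 1
    eax += 1         # end: inc eax
    return eax       # mov $x, eax (the caller stores the result back)


x = bump(0)
```

Here `bump(0)` returns 2 (zero becomes 1, then gets incremented), while `bump(5)` returns 6. The jump has turned into an ordinary `if`, which is a good part of why this version is easier to review.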
How do you imagine that code review goes with something like this? Sure, the problem of day-long waits to compile and run is gone, but this code is pretty inscrutable. Staring at hundreds of lines of this, littered with gotos and terse syntax, people collaborating on it would probably need a whiteboard just to map out what they think it does. And worrying about readability or maintainability? Probably not so much. Easier just to run it and test to see what happens.
As the cost of resources came down, time sharing on mainframes became easier and microcomputers started to flourish, with operating systems that abstracted away this assembly code. The result was much more recognizable as modern programming to those of us coming of age in the 90s or later. Languages like C and Smalltalk became tools of the trade for programmers, and it was even possible to use programmatic editors to generate the code for compilation.
The abstraction allowed for more complex programs and ones that were easier to reason about. And yet, there were still comparable limitations. The early editors were things like QED, Emacs, and vi. Rather than list everything they’re missing compared to your IDE, I’ll just say that editing with these was like opening up a DOS window and typing source code. And if editing was cumbersome, you can imagine what performing code reviews was like — you’d have to simply read the source code verbatim and rely on the author’s recollection of what had changed.
Early Source Control
Weird as it seems, up until the 90s (and much, much later in some places), source control wasn’t really a thing. There weren’t as many people on teams and they weren’t generating as much code, so having some sort of ad-hoc system for keeping track of the code on disk was fine. And even as source control did emerge, the early tools included CVS and Visual SourceSafe, where branching and merging were horrific and impossible, respectively.
The result was that reasoning about code bases in terms of incremental changes and refactorings was simply not common. It was possible to audit the history of a particular file, but it was really not feasible to look at semantic differences in codebases across change sets. As a result, code review tended to be more of a group activity around the entire code work product prior to release than a fast-feedback peer check. Review was more akin to a component of the “testing phase” in a waterfall process.
You may think that it’s interesting to look back on a handful of snapshots of the history of our profession but wonder why I’m mentioning this. What’s the point? Well, it’s to point out what a unique opportunity you have. Our industry has been building toward the multi-national scenario I laid out for more than half a century. Don’t squander that. Avail yourself of the tools to maximize code review effectiveness.