Stories about Software


The Relationship Between Team Size and Code Quality

Editorial Note: I originally wrote this post for the NDepend blog.  You can check out the original here, at their site.  While you’re there, take a look at NDepend and see how your code stacks up.

Over the last few years, I’ve had occasion to observe lots of software teams.  These teams come in all shapes and sizes, as the saying goes.  And, not surprisingly, they produce output that covers the entire spectrum of software quality.

It would hardly make headline news to cite team members’ collective skill level and training as a prominent factor in determining quality level.  But what else affects it?  Does team size?  Recently, I found myself pondering this during a bit of downtime ahead of a meeting.

Does some team size optimize for quality?  If so, how many people belong on a team?  This line of thinking led me to consider my own experience for examples.

A Case Study in Large Team Dysfunction

Years and years ago, I spent some time with a large team.  For its methodology, this shop unambiguously chose waterfall, though I imagine that, like many such shops, they called it “SDLC” or something like that.  But any way you slice it, requirements and design documents preceded the implementation phase.

Beyond the big up-front planning, the codebase had little in the way of meaningful abstractions to partition it architecturally.  As a result, you had the perfect incubator for a massive team.  The big up-front planning ensured the illusion of appropriate resource allocation.  And then the tangled codebase ensured that “illusion” was the best way to describe the notion that people could be assigned tasks without severely impacting one another.

On top of all that, the entire software organization had a code review mandate for compliance purposes.  This meant that someone needed to review each line of code committed.  And, absent a better scheme, this generally meant that the longest tenured team members did the reviewing.  The same longest tenured team members that had created an architecture with no meaningful partitioning abstractions.

This cauldron of circumstances boiled up a mess.  Team members bickered over minutiae as code sprawled, rotted, and tangled.  Judging solely by this experience, I would conclude that less is more.

A Case Study in Large Team Harmony

But then I thought of something else I saw, years later.  At the time I had taken to cooling my heels in a really large enterprise, helping the various software departments and programs with a push for software craftsmanship.  This meant leveraging my experience to teach them things like test-driven development, continuous integration, constant refactoring, etc.

Usually, the delivery mechanism for this sort of thing had two components: group workshops and individual practice (I’d pair with them).  The group workshops often involved a presentation and then group practice using a technique called Randori.  This involved a pair of people participating in the coding while the rest of the group observed.  Every so often, one of the pair would head for the peanut gallery, and someone new would take the wheel.

We could generalize Randori to the idea of “mob programming,” wherein an entire team or group of people all work on the same problem.  And sometimes teams at this client site did exactly that, either while trying to grok a new technique or while working on a particularly difficult problem.

Teams taking this approach can seem comically over-allocated.  Eight or ten people spend the entire day working on a single class or even a single method.  Setting aside efficiency considerations, though, these teams produced code of excellent quality.

It’s All About the Interactions

So what gives?  We have two teams with high developer-to-code ratios.  One of them produces a defect factory while the other produces high quality code.

We could posit that waterfall versus agile makes the difference, but I don’t buy it.  Those methodologies deal more with adapting to change and feedback loops than with the actual makeup of the source code.  And, besides, the mobbing teams were transitioning to an agile approach, so it’s less cut and dried.  We could also posit different skill levels for the teams, but on paper, the first team averaged more experience by far than the second.

The answer comes from the nature of the interactions.  The members of the first team each went off and worked in a vacuum for weeks or months at a time.  Then, they circled back to bicker over details at code review time, with duration of tenure serving as the ultimate dispute arbiter.

The second team came together without a set concept of roles.  And from this egalitarian footing, they hashed out philosophical disagreements the moment they first cropped up.  This didn’t allow the team members weeks or months to become attached to and defensive of their ideas.

So at one end of the quality spectrum, we have individual contributor silos and politically charged forums for bringing them together.  At the opposite end, we have relatively minimized group politics and a forum for allowing the best ideas to bubble quickly to the top.

Code Quality as the Team Scales

Let’s now engage in a bit of inductive reasoning.  Consider a single person team, where that single person has years of experience writing high quality code.  One might argue for this as an optimal team size for quality, given that one person’s skill.

But let’s say that reality presents an unwelcome intrusion.  One person can’t deliver according to the necessary schedule, so the project requires more team members.  Inductively speaking, we can preserve the high quality of team output under the following two conditions.

  • We hire reasonably skilled team members.
  • We create productive collaboration models.

If those hold true, more folks means more quality, as demonstrated by the mob programming story.  More people collaborating means more eyes to catch mistakes and more minds with more chances of generating the best idea.

The Answer

After all of this analysis about team size, I can’t help but bring cost into the discussion to close out.  After all, whoever holds the purse strings will have a lot to say about “optimal team size.”

Your team’s code quality will only be as good as the skill levels of the team members and the productiveness of their interactions.  So, if cost didn’t matter, the answer for optimal team size would be, “define productive collaborations, and then the more the better.”  Get enough skilled developers to cover the needed functionality, then keep adding.  If you run out of “room,” have them pair.  If the pairs run out of room, have them “triple.”  And so on.

But then cost enters the discussion.  You can’t hire every developer on Earth (and, in reality, mobbing would eventually reach some point of diminishing returns), because you will ultimately have a limited budget.  Instead, establish good collaboration practices, and then hire as many skilled, compatible folks as budget allows.

To put it more succinctly, the optimal team size vis-à-vis code quality is “as many people who work well together as you can afford.”

Adrian Johnston
6 years ago

Beautifully summarised!

6 years ago

Interesting observations, but all the way through reading this article, I kept wondering ‘how do you measure quality’. This can be a very subjective thing. More illumination of this concept might help the meaning of this article.

Marc Clifton
6 years ago

There are a lot of articles out there about how code quality is subjective, hard, if not impossible, to measure, etc. So when you write “these teams produced code with excellent quality,” how was quality measured? I’m truly curious about this!

Dilton Dalton
6 years ago

I realize that most people don’t want to complicate computing by introducing mathematics, but do you have any stats to back up your diatribe against the waterfall approach (sans a well segmented architecture)? There are many ways to screw up any SDLC (Software Development Life Cycle is not just for waterfalls) and your comments are anecdotal at best. How good do you expect the quality of the software to be when one of the developers has a master’s degree in Eastern European Folk Dancing? Or when one of the developers is a Physicist?

Dave McCraw
6 years ago
Reply to  Dilton Dalton

This comment made me smile, because what is computing except mathematics?

Dave McCraw
6 years ago

FWIW I agree with your conclusion. I wonder how many companies actually optimise for effectiveness of the team as a whole as opposed to a dubious proxy measure (eg maximising individual utilisation of engineers). Based on my experience of large orgs, things which are easy to measure are often optimised regardless of true value. re SDLC, agile hopefully optimises for what stakeholders actually want, better than does waterfall. But unless quality is explicitly valued by stakeholders and some reasonable metric is in place, I’m not sure that either method necessarily leads to higher quality than the other. Aggressively iterating on…

Blaine Osepchuk
6 years ago

Hey Erik, I’m curious: do you think there’s a limit to the maximum size of an effective team producing high quality software? What’s the largest effective team you’ve seen?