Relax, Everyone’s Code Rots

By Erik Dietrich, December 6, 2015 / December 7, 2015

Editorial note: I originally wrote this post for the NDepend blog. Go on over and check out the original. If you’re interested in topics like software department strategy, static analysis, and code metrics, it’ll be up your alley.

I earn my living, or part of it, anyway, doing something very awkward. I get called in to assess and analyze codebases for health and maintainability. As you can no doubt imagine, this tends not to make me particularly popular with the folks who have constructed and who maintain this code. “Who is this guy, and what does he know, anyway?” is a question that they ask, particularly when confronted with the parts of the assessment that paint the code in a less than flattering light. And, frankly, they’re right to ask it.

But in reality, it’s not so much about who I am and what I know as it is about properties of code bases. Are code elements, like types and methods, larger and more complex than those of the average code base? Is there a high degree of coupling and a low degree of cohesion? Are the portions of the code with the highest fan-in exercised by automated tests or are they extremely risky to change? Are there volatile parts of the code base that are touched with every commit? And, for all of these considerations and more, are they trending better or worse?

It’s this last question that is, perhaps, most important. And it helps me answer the question, “who are you and what do you know?” I’m a guy who has run these types of analyses on a lot of codebases and I can see how yours stacks up and where it’s going. And where it’s going isn’t awesome — it’s rotting.

But I’ll come back to that point later.

Communication complexity grows non-linearly.

Imagine that you’re working alone on a project of some sort. You’re certainly going to be bounded by your own productivity, but the communication overhead to whatever you’re doing is essentially nil (unless you’re counting leaving yourself notes and reminders, which I won’t). Now, let’s say it becomes necessary to substantially improve the throughput on this project, so an additional person is added to the mix. Communication is now more of a consideration, but it’s also quite simple. There’s one channel for it and that’s it.

But what happens as the team grows? Once you add a third person, the number of lines of communication goes from 1 to 3: AB, BC, AC. If you add a fourth person, you get another non-linear increase in the number of lines of communication: AB, AC, AD, BC, BD, CD, for a total of 6. If you go to 5, 6, and 7 people, the lines of communication increase to 10, 15, and 21, respectively. Mathematically, this growth makes sense. Each new person coming in adds one line of communication for each already existing person, which is why the lines of communication grow by 2, 3, 4, 5, etc. If you prefer a more mathematically rigorous way to understand this, it’s the idea in discrete mathematics known as combinations: the number of ways to choose 2 people from a team of n, which works out to n(n-1)/2.

As the team grows, one person at a time, the amount of communication overhead begins to explode. By the time you have 20 people on the team, there are 190 one-on-one interactions (to say nothing of situations that call for multiple people). This means that, from a practical perspective, there is a limit on team size beyond which there are diminishing and, eventually, negative returns. The team will eventually do nothing but manage all of these communication channels.
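If you want to check those numbers yourself, here is a minimal sketch in Python. The function name `communication_lines` is made up for illustration; the only thing it relies on is the standard library's `math.comb`, which counts the ways to pick 2 people out of a team.

```python
from math import comb


def communication_lines(team_size: int) -> int:
    """Count the distinct one-on-one channels in a team of the given size."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    # Every pair of people is one channel, so this is "team_size choose 2",
    # which equals team_size * (team_size - 1) / 2.
    return comb(team_size, 2)


# Team sizes mentioned in the post.
for size in (2, 3, 4, 5, 6, 7, 20):
    print(size, communication_lines(size))
```

Each step up in team size adds one channel per existing member, which is why the differences between successive results climb by one each time.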

What does this have to do with code? Well, a code base grows in about the same way. It’s just easier for people, particularly non-technical folks, like managers, to wrap their heads around team lines of communication.

Code breaks down the way disorganized collaboration breaks down.

In modern languages, code in a codebase is assembled into some form of logical units or modules. These might be functions, classes, whatever. When there are few of them, life is pretty good and the code is easy to reason about. As the number of these things grows, so too does the complexity, and not linearly. Without any kind of deliberate intervention, codebases suffer the same fate as teams with 20 or 30 or 40 human beings on them all trying to collaborate. Eventually they reach a point where adding to them introduces more problems than it fixes.

How do you prevent this? Well, it’s not easy, and it requires intentionality. This is where I’ll return to the theme of your code rotting. Yes, your code is rotting, but so is almost everyone else’s as well. It’s not an unusual circumstance, and it doesn’t mean that you’ve done anything horribly wrong. It just means that you haven’t yet figured out how to prevent it from rotting.

So, what does it take, in the end? Well, it’s simple… to describe. Put on your managerial hat and ask yourself what you would do with a team of 20 or 30 people that was slowed to a crawl by communication overhead. I bet you’d break them into sub-teams with much less communication overhead and have limited, strategic communication between those teams. Maybe this would be reminiscent of how companies organize themselves?
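To put a rough number on that intuition, here is a small Python sketch. It rests on a simplifying assumption that is mine, not a claim about real organizations: everyone inside a sub-team talks to everyone else in it, and each pair of sub-teams shares exactly one channel (say, between team leads). The function names are hypothetical.

```python
from math import comb


def lines_in_one_team(people: int) -> int:
    """Channels when everyone talks to everyone."""
    return comb(people, 2)


def lines_with_subteams(subteam_sizes: list[int]) -> int:
    """Channels when people are split into sub-teams.

    Assumption for illustration: full communication inside each
    sub-team, plus one liaison channel per pair of sub-teams.
    """
    within = sum(comb(size, 2) for size in subteam_sizes)
    between = comb(len(subteam_sizes), 2)
    return within + between


print(lines_in_one_team(20))              # one flat team of 20
print(lines_with_subteams([5, 5, 5, 5]))  # four sub-teams of 5
```

Under those assumptions, the flat team of 20 has 190 channels, while four sub-teams of 5 have 4 × 10 internal channels plus 6 between teams, for 46 in total.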

To do this with a codebase requires the same approach, in concept. You minimize the size and complexity of the code components, the way you would with teams. You eliminate unnecessary dependencies in favor of cohesive units. You make sure you have solid backup plans around any high-risk communication bottlenecks and you try to eliminate those whenever possible. And you evaluate the whole thing on a consistent basis to ensure that you’re getting better (or at least not getting worse).
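As a rough illustration of what evaluating this might look like, here is a Python sketch that counts fan-in (how many modules depend on a given module) and fan-out (how many modules it depends on) from a dependency map. The module names and the map itself are invented for the example, and real analysis tools measure far more than this; treat it as a sketch of the idea rather than a description of how any particular tool works.

```python
from collections import Counter

# Hypothetical dependency map: each module lists the modules it depends on.
dependencies = {
    "billing": ["database", "logging"],
    "reports": ["database", "logging", "billing"],
    "web": ["billing", "reports", "logging"],
    "database": ["logging"],
    "logging": [],
}


def fan_out(deps: dict[str, list[str]]) -> dict[str, int]:
    """Number of modules each module depends on."""
    return {module: len(targets) for module, targets in deps.items()}


def fan_in(deps: dict[str, list[str]]) -> dict[str, int]:
    """Number of modules that depend on each module."""
    counts = Counter({module: 0 for module in deps})
    for targets in deps.values():
        for target in targets:
            counts[target] += 1
    return dict(counts)


# Modules with the highest fan-in are the riskiest to change without tests.
for module, count in sorted(fan_in(dependencies).items(), key=lambda item: -item[1]):
    print(module, count)
```

In this made-up map, `logging` has the highest fan-in, so it is the module you would most want covered by automated tests before touching it.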

Take-Aways

So in the end, there are two lessons to take away when it comes to your code base. The first is that having a codebase that is rotting with tech debt, while problematic, is not unusual, nor is it a personal failing of yours or your team’s. The second is that you need to understand how to manage complexity within your code. The first part is easy. The second part is why code assessments, analysis tools, and coursework on clean code exist in the first place. Because writing clean code takes a lot of work.

10 Comments

dotnetchris

10 years ago

organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations
Melvin Conway, computer scientist 1967 Conway’s Law

Software absolutely looks like the teams that build it. Those integrations of both people and systems are the most important areas of a system. Bad systems are full of ad hoc coupling, both in code and in person, with no semblance of defined structure.

Erik Dietrich

10 years ago

Reply to dotnetchris

Absolutely. I’ve always found Conway’s law to be uncannily accurate.

TheRatiocinator

10 years ago

Very interesting; I didn’t know people did that for a living. Regarding your “lines of communication” calculations, this article could have used an “As Fred Brooks said 40 years ago in ‘The Mythical Man-Month’ ” citation.

Erik Dietrich

10 years ago

Reply to TheRatiocinator

It tends to fall under the general heading of consultative gap analysis. There’s often more to it than just that, but the code assessment is probably the most fun part of it. Fair point about Brooks, though, truth be told, that wasn’t in my mind when I was writing the post. I suppose his analysis may have sneaked in subconsciously.

swampwiz0

10 years ago

I was on a (pre-.NET) Visual C++ gig where the architect assigned team members to code up classes whose member function signatures were already defined, and only after a serious need to redefine them were they changed. This worked well as no one’s code broke.

Erik Dietrich

10 years ago

Reply to swampwiz0

That’s not a practice I’ve seen before, at least not at that level (I’ve seen it at the inter-team API contract level). Did it have any adverse effect on the code? One thing that I could see this doing is creating a strong incentive to use global variables.

dotnetchris

10 years ago

Reply to Erik Dietrich

I have to assume the architect in question would have been quite angered by any dev who attempted to cheat by introducing global state to sidestep his architecture.

Erik Dietrich

10 years ago

Reply to dotnetchris

I’ve done a lot of consulting over the last few years, and I never cease to be amazed at people’s capacity for short-sightedness or cognitive dissonance. I’d hope the architect wouldn’t view that as an acceptable work-around, but nothing would shock me.

dotnetchris

10 years ago

Reply to Erik Dietrich

I would definitely give the architect the benefit of the doubt. The architect clearly intended on having a perfectly closed system except for the defined entry points (personally I’d say he was doing SOA internally in C++). Of course the lingering question was did they actually use code reviews to catch if anyone broke the entire architecture. It’s pointless to have an architecture and not actually validate that it’s upheld. The one thing I always have time for is a code review. Building is on fire and this is the code to turn on the sprinkler system? I still have…

Erik Dietrich

10 years ago

Reply to dotnetchris

Well put, and definitely agreed about code review. I’ve read that it’s the single most effective means for catching problems, even above automated testing. (I think that’s in Code Complete, if memory serves)
