DaedTech

Stories about Software

Fixing Your Snarled Dependency Graph

Editorial note: I originally wrote this post for the NDepend blog.  You can check out the original here, at their site.  While you’re there, have a look at NDepend’s features for helping you visualize your codebase.

I’ve written before about making use of NDepend’s dependency graph.  Well, indirectly, anyway.  In that post, I talked about the phenomenon of actual software architecture not matching the pretty diagrams people draw in Visio.  It reminds me of Helmuth von Moltke’s wisdom that no battle plan survives contact with the enemy.

Typically, architects conceive of wondrous, clean, and decoupled systems.  Then they immortalize this pristine architecture in Visio.  Naturally, printouts go up on the wall, and everyone knows what the system should look like.  But somehow, it never actually winds up looking like that.

Architectures of Despair

I think we all know what it winds up looking like.  Or, at least, what it can look like.  Sometimes the actual architecture only misses the mark by a little, around the edges.  But other times, it goes sailing off in the wrong direction, like a disastrous misfire at the archery range.

When this happens, we have metaphors for the result.  Work in the industry long enough, and you’ll hold your nose and describe a codebase as a big ball of mud.  You might also hear descriptors involving tangled Christmas tree lights or spaghetti code.  Maybe you’ll hear about a bramble bush or something.

The specific image varies, but the properties do not.  All of them describe something snarled, difficult to separate, and unpleasant to work with.  They indicate complexity without intent or purpose.  And when that happens, deadlines slip and defects proliferate.  Oh, and the people working in the codebase become miserable, now regarding those Visio diagrams as cruel jokes.

All of this stems from a core problem: a tangled dependency graph.

The Nasty Dependency Graph Origin Story

How do things get this way?  When you start the project, everything makes sense.  You’ll have four application layers.  Oh, and a utils project.  I mean, every codebase needs one of those, so that kind of goes without saying.  But that’s it.  Simple as can be.

After a while, though, things get a little more complex.  At some point, you can’t just have a single project for each layer, so you do a little cell division within each one.  Now most of the projects within a given layer point to other projects within that layer.  Oh, and it looks like the top layer actually refers to stuff two and three layers below it, and not just the layer directly below it.  Oops.  Someone should have caught that.

Oh, and the utils project has grown pretty unwieldy since we tend to throw everything and the kitchen sink in it.  Let’s split that thing up into like five utils projects.  And, let’s pull a logger out of there and have it stand on its own, since that’s really kind of a separate concern.  But let’s have the logger use one or two of the utils projects, since they have handy stuff in them.

As time goes by, it gets harder and harder for people to keep track of where everything is.  Except for a few super-engineers there since the beginning.  They know where to find everything.  So they do a lot of code reviews.  They say things like, “you’re reinventing the wheel — just use that thing that Alice made once in module XYZ.”  And the newbies do exactly that, adding a quick little dependency instead of duplicating code.

The growth of every bramble and the addition of every piece of spaghetti makes sense.  Until, one day, nothing makes sense.

It’s a Long Way Back from a Bad Dependency Graph

You’ve certainly experienced this intuitively, if not explicitly.  As you work in a codebase, you can add dependencies easily.  Find some method that looks helpful and use it.  If you get a compiler error, your IDE will probably happily fix it for you by adding the dependency.  Ship it.

You add the new method call or make use of the class in question.  Maybe you hold your nose and add a reference to a global variable.  In that moment, you add dependency snarl.  If you imagine your classes and modules as nodes in a directed graph, you’ve just added an edge.  Just like that, you solve your problem and just like that, your dependency graph complexity grows.
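To make the graph framing concrete, here is a minimal sketch of a dependency graph as a dictionary of sets.  The module names (Web, Services, DataAccess) are hypothetical, invented purely for illustration.  It shows how little code it takes to add an edge.

```python
from collections import defaultdict

# Hypothetical module names, for illustration only.
graph = defaultdict(set)


def add_dependency(graph, source, target):
    """Record that `source` depends on `target` (one directed edge)."""
    graph[source].add(target)


def edge_count(graph):
    """Total number of directed edges in the graph."""
    return sum(len(targets) for targets in graph.values())


# The architecture as drawn: each layer talks to the one below it.
add_dependency(graph, "Web", "Services")
add_dependency(graph, "Services", "DataAccess")
before = edge_count(graph)

# The quick fix: Web reaches straight into DataAccess.
add_dependency(graph, "Web", "DataAccess")
after = edge_count(graph)

print(before, after)
```

One line of code, one more edge.  Nothing in the act of adding it tells you whether it belonged there.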

Once you blaze that trail, others follow.  People use your code and the dependency’s magnitude grows.  Others working in the module emulate your solution, and the dependency’s thickness grows.  At the moment of creation, you could have removed it with a moment’s effort.  Weeks later, you can only remove it by reminding yourself of what you did and then consulting with various teammates and even folks from other teams.  The cement hardens, so to speak.

So when you get to the breaking point with your overall architecture, you have quite the task in front of you.  Now you’re talking about dozens or hundreds of hardened dependencies.  Pruning them away becomes not a so-called refactoring, but a first-class project.

How, then, should you tackle it?

Find and Break Cycles

Before you do anything else, you should identify and break any dependency cycles.  When you break a project into modules or a module into classes, you do so to isolate components and let them vary separately.  In other words, you accept the complexity of having multiple components in exchange for being able to change any one of them more easily.

But cycles destroy this trade-off.  They turn your code into a conceptual monolith anyway, making it inseparable.  So you wind up with the worst of both worlds.  Break up any dependency cycles as the first step on your road to recovery.
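If you want a feel for what "finding a cycle" means mechanically, here is a rough sketch using a depth-first search over a graph stored as a dictionary.  This is a generic textbook approach, not a description of how any particular tool does it, and the module names are hypothetical.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None if acyclic.

    `graph` maps each node to an iterable of the nodes it depends on.
    Uses recursion, so very deep graphs may need an iterative version.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for dep in graph.get(node, ()):
            state = color.get(dep, WHITE)
            if state == GREY:
                # `dep` is already on the current path: the loop is closed.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical example: the logger and the utils have grown into each other.
cyclic = {
    "Logger": ["Utils"],
    "Utils": ["Config"],
    "Config": ["Logger"],
    "Web": ["Logger"],
}
acyclic = {"Web": ["Services"], "Services": ["DataAccess"]}

cycle = find_cycle(cyclic)
no_cycle = find_cycle(acyclic)
```

Each cycle it reports gives you a concrete list of edges to choose from.  Removing any one of them breaks that particular loop.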

Categorize Your Dependencies

Dependency cycles create cut and dried situations that you should fix.  But, after that, things get fuzzier.  After all, you have too many dependencies, but you need at least some.  How do you tell the “good” ones from the “bad” ones?

The answer involves more nuance than I can give you in a blog post.  You’ll have to figure that out for your specific situations.  But go about it intentionally.  First identify dependencies mandated by your architecture.  Then identify any that seem to make sense or be unavoidable.  These go in the keep column.

On the other hand, you’ll certainly find some created on the fly, for no particularly compelling reason.  You might even find some with commit or code comments suggesting the dependency represents a hack.  Those should go first on your hit list.  Look also for dependencies rendered obsolete.  These go in the eliminate column.

Thin, Then Eliminate

Once you’ve categorized the dependencies, you can act accordingly.  In rare cases, you may have a completely obsolete dependency that you can simply delete by removing an assembly reference.  Or maybe just one piece of code takes an awkward dependency.  You can rework that code and then eliminate the dependency, thus simplifying your dependency graph.

Think of those as low-hanging fruit and go after them.  But you’ll still have work to do for more entrenched dependencies.  Follow the wisdom about not building Rome in a day and work over time to thin these dependencies.  Eliminate instances methodically until they become obsolete or single dependencies.  Then you know what to do.
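One way to plan the thinning is to count how many places in the code actually use each dependency, then start with the thinnest ones.  The sketch below assumes you already have a list of usage records as (source, target) pairs, however you gathered them, along with a set of edges from your keep column.  All of the names and data are hypothetical.

```python
from collections import Counter


def rank_by_thickness(usages, keep):
    """Count usages per (source, target) edge, thinnest first.

    Edges listed in `keep` are skipped, since they are meant to stay.
    Ties are broken alphabetically so the ordering is stable.
    """
    counts = Counter(edge for edge in usages if edge not in keep)
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


# Hypothetical usage records: one entry per call site or type reference.
usages = [
    ("Web", "DataAccess"),
    ("Web", "DataAccess"),
    ("Web", "DataAccess"),
    ("Logger", "StringUtils"),
    ("Web", "Services"),
    ("Web", "Services"),
    ("Reports", "LegacyExport"),
]
keep = {("Web", "Services")}  # mandated by the architecture

hit_list = rank_by_thickness(usages, keep)
for edge, count in hit_list:
    print(edge, count)
```

Edges with a count of one are the single dependencies you can rework and delete right away.  The thicker ones shrink as you chip away at their usages.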

Create NuGet Packages

Working your way back toward a simpler dependency graph generally means eliminating haphazardly created dependencies in favor of deliberate ones.  But you have another category of action as well.  You can identify areas of your codebase that are stable and functional enough to become their own codebases.

Remember the logger I mentioned?  Or the utils assembly?  (I would actually recommend not having utils assemblies, but that’s a subject for another time.)  In many large codebases, people will create those things initially and then leave them largely untouched.  Their complexity still exists in the dependency graph, but it doesn’t necessarily have to.

You could consider turning them essentially into third-party libraries by using something like NuGet.  You still own and use them, but they become external to your core codebase and its dependency graph.
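Before you carve something out, it helps to check whether it can really stand alone.  A package that still reaches back into your core codebase isn't ready to leave.  Here is a small sketch of that check, assuming a graph of internal modules only.  The module names are hypothetical and nothing here touches NuGet itself.

```python
def dependency_closure(graph, roots):
    """Return every module reachable from `roots`, including the roots."""
    seen = set()
    pending = list(roots)
    while pending:
        node = pending.pop()
        if node in seen:
            continue
        seen.add(node)
        pending.extend(graph.get(node, ()))
    return seen


def leaks(graph, package):
    """Modules outside `package` that the package would still depend on."""
    return dependency_closure(graph, package) - set(package)


# Hypothetical internal graph: the logger borrows from a utils project.
graph = {
    "Web": ["Services", "Logger"],
    "Services": ["DataAccess", "Logger"],
    "Logger": ["StringUtils"],
    "StringUtils": [],
}

logger_alone = leaks(graph, {"Logger"})
logger_with_utils = leaks(graph, {"Logger", "StringUtils"})
```

An empty result means the candidate package depends on nothing else in the codebase.  A non-empty result tells you what to either pull along with it or cut loose first.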

Prevention is the Best Medicine

No magic bullets exist for fixing a dependency graph.  It’s slow, painful, and unpleasant work.

Think of the last time you used that really long extension cord.  You were pressed for time, so you just wadded it up and stuffed it into the closet.  Now you want to use it again.  Well, you’re going to have to put in the time working laboriously through the tangles and knots.

Your codebase and architecture are the cord in this corny little parable.  A little investment in maintenance now saves you a lot of pain later.  Make sure that you keep your actual dependency graph aligned with your architectural vision for it.  You can measure this easily with NDepend and stop any extraneous dependencies right at the moment of creation, before they can calcify in your codebase.
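To illustrate the idea of checking the actual graph against the intended one, here is a toy sketch.  It is not NDepend's API or rule syntax, just a plain Python illustration.  It assumes a simple rule (a module may depend on its own layer or the layer directly below), and the layer and module names are hypothetical.

```python
# Layers from top to bottom.  Names are hypothetical.
LAYERS = ["Presentation", "Application", "Domain", "Infrastructure"]


def find_violations(edges, layer_of, layers=LAYERS):
    """Return the edges that break the assumed layering rule.

    Assumed rule: a module may depend on modules in its own layer
    or in the layer directly below it, and on nothing else.
    """
    rank = {name: index for index, name in enumerate(layers)}
    violations = []
    for source, target in edges:
        gap = rank[layer_of[target]] - rank[layer_of[source]]
        # 0 = same layer, 1 = the layer directly below.
        if gap not in (0, 1):
            violations.append((source, target))
    return violations


layer_of = {
    "Web": "Presentation",
    "Orders": "Application",
    "Pricing": "Domain",
    "SqlStore": "Infrastructure",
}
edges = [
    ("Web", "Orders"),      # fine: one layer down
    ("Orders", "Pricing"),  # fine: one layer down
    ("Web", "SqlStore"),    # skips two layers
    ("Pricing", "Orders"),  # points upward
]

violations = find_violations(edges, layer_of)
```

Run a check like this on every build and a stray edge gets flagged the day it appears, while removing it still takes only a moment.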

I’ve seen tons of codebases with snarled dependency graphs.  But I’ve yet to see one with a snarled dependency graph when key team members routinely examined the actual dependency graph.