DaedTech

Stories about Software

By

Let’s Make Better Code Metrics

A few months back, I wrote a post about changes to my site and work.  Today, I have another announcement in that same vein:  I’ve recently partnered with NDepend to help start and create content for their blog.  If you go there now, you can see the maiden post which announces the release of the newest version of NDepend (not written by me, personally, if you were wondering, though some of mine will follow).

What is NDepend?

In the broadest terms, NDepend is a static analysis tool.  More specifically and colloquially, you might think of NDepend as Jiminy Cricket, if Pinocchio were a software developer or architect.  It’s extremely helpful for visualizing the dependencies and properties of your code base (e.g. complexity, coupling, etc), which will give you a leg up on your fellow developers right out of the gate.  It’s also incredibly informative, furnishing you not only with detailed, quantitative metrics about your code, but also indicating where you’re deviating from what is broadly considered to be good programming technique.  And, of course, you can do a great deal of customization, from integrating this feedback into your build to tracking code quality over time to building and defining your own complex, custom rules.

To put it more succinctly, in a world where developers are trying to distinguish themselves in terms of knowledge and chops, NDepend will give you such a huge advantage that it’s probably unfair to everyone that doesn’t have it.  I personally learned a ton about software architecture just from installing, using, and exploring this tool over the course of 5 years or so.  If you want to learn more about NDepend and static analysis in general, check out my Pluralsight course about it that I published in conjunction with the last major version. (If you don’t have a Pluralsight subscription but want to check it out, sign up for my mailing list using the form at the right).

Scientist Read More

By

Introduction to Static Analysis (A Teaser for NDepend)

Rather than the traditional lecture approach of providing an official definition and then discussing the subject in more detail, I’m going to show you what static analysis is and then define it. Take a look at the following code and think for a second about what you see. What’s going to happen when we run this code?

private void SomeMethod()
{
	int x = 1;
	if(x == 1)
		throw new Exception();
}

Well, let’s take a look:

Exception

I bet you saw this coming. In a program that does nothing but set x to 1, and then throw an exception if x is 1, it isn’t hard to figure out that the result of running it will be an unhandled exception. What you just did there was static analysis.

Static analysis comes in many shapes and sizes. When you simply inspect your code and reason about what it will do, you are performing static analysis. When you submit your code to a peer to have her review, she does the same thing. Like you and your peer, compilers perform static analysis, though automated analysis instead of manual. They check the code for syntax errors or linking errors that would guarantee failures, and they will also provide warnings about potential problems such as unreachable code or assignment instead of evaluation. Products also exist that will check your source code for certain characteristics and stylistic guideline conformance rather than worrying about what happens at runtime and, in managed languages, products exist that will analyze your compiled IL or byte code and check for certain characteristics. The common thread here is that all of these examples of static analysis involve analyzing your code without actually executing it.

Analysis vs Reactionary Inspection

People’s interactions with their code tend to gravitate away from analysis. Whether it’s unit tests and TDD, integration tests, or simply running the application to see what happens, programmers tend to run experiments with their code and then to see what happens. This is known as a feedback loop, and programmers use the feedback to guide what they’re going to do next. While obviously some thought is given to what impact changes to the code will have, the natural tendency is to adopt an “I’ll believe it when I see it” mentality.

private void SomeMethod()
{
	var randomGenerator = new Random();
	int x = randomGenerator.Next(1, 10);
	Console.WriteLine(x);
}

We tend to ask “what happened?” and we tend to orient our code in such ways as to give ourselves answers to that question. In this code sample, if we want to know what happened, we execute the program and see what prints. This is the opposite of static analysis in that nobody is trying to reason about what will happen ahead of time, but rather the goal is to do it, see what the outcome is, and then react as needed to continue.

Reactionary inspection comes in a variety of forms, such as debugging, examining log files, observing the behavior of a GUI, etc.

Static vs Dynamic Analysis

The conclusions and decisions that arise from the reactionary inspection question of “what happened” are known as dynamic analysis. Dynamic analysis is, more formally, inspection of the behavior of a running system. This means that it is an analysis of characteristics of the program that include things like how much memory it consumes, how reliably it runs, how much data it pulls from the database, and generally whether it correctly satisfies the requirements are not.

Assuming that static analysis of a system is taking place at all, dynamic analysis takes over where static analysis is not sufficient. This includes situations where unpredictable externalities such as user inputs or hardware interrupts are involved. It also involves situations where static analysis is simply not computationally feasible, such as in any system of real complexity.

As a result, the interplay between static analysis and dynamic analysis tends to be that static analysis is a first line of defense designed to catch obvious problems early. Besides that, it also functions as a canary in the mine to detect so-called “code smells.” A code smell is a piece of code that is often, but not necessarily, indicative of a problem. Static analysis can thus be used as an early detection system for obvious or likely problems, and dynamic analysis has to be sufficient for the rest.

Canary

Source Code Parsing vs. Compile-Time Analysis

As I alluded to in the “static analysis in broad terms” section, not all static analysis is created equal. There are types of static analysis that rely on simple inspection of the source code. These include the manual source code analysis techniques such as reasoning about your own code or doing code review activities. They also include tools such as StyleCop that simply parse the source code and make simple assertions about it to provide feedback. For instance, it might read a code file containing the word “class” and see that the next word after it is not capitalized and return a warning that class names should be capitalized.

This stands in contrast to what I’ll call compile time analysis. The difference is that this form of analysis requires an encyclopedic understanding of how the compiler behaves or else the ability to analyze the compiled product. This set of options obviously includes the compiler which will fail on show stopper problems and generate helpful warning information as well. It also includes enhanced rules engines that understand the rules of the compiler and can use this to infer a larger set of warnings and potential problems than those that come out of the box with the compiler. Beyond that is a set of IDE plugins that perform asynchronous compilation and offer realtime feedback about possible problems. Examples of this in the .NET world include Resharper and CodeRush. And finally, there are analysis tools that look at the compiled assembly outputs and give feedback based on them. NDepend is an example of this, though it includes other approaches mentioned here as well.

The important compare-contrast point to understand here is that source analysis is easier to understand conceptually and generally faster while compile-time analysis is more resource intensive and generally more thorough.

The Types of Static Analysis

So far I’ve compared static analysis to dynamic and ex post facto analysis and I’ve compared mechanisms for how static analysis is conducted. Let’s now take a look at some different kinds of static analysis from the perspective of their goals. This list is not necessarily exhaustive, but rather a general categorization of the different types of static analysis with which I’ve worked.

  • Style checking is examining source code to see if it conforms to cosmetic code standards
  • Best Practices checking is examining the code to see if it conforms to commonly accepted coding practices. This might include things like not using goto statements or not having empty catch blocks
  • Contract programming is the enforcement of preconditions, invariants and postconditions
  • Issue/Bug alert is static analysis designed to detect likely mistakes or error conditions
  • Verification is an attempt to prove that the program is behaving according to specifications
  • Fact finding is analysis that lets you retrieve statistical information about your application’s code and architecture

There are many tools out there that provide functionality for one or more of these, but NDepend provides perhaps the most comprehensive support across the board for different static analysis goals of any .NET tool out there. You will thus get to see in-depth examples of many of these, particularly the fact finding and issue alerting types of analysis.

A Quick Overview of Some Example Metrics

Up to this point, I’ve talked a lot in generalities, so let’s look at some actual examples of things that you might learn from static analysis about your code base. The actual questions you could ask and answer are pretty much endless, so this is intended just to give you a sample of what you can know.

  • Is every class and method in the code base in Pascal case?
  • Are there any potential null dereferences of parameters in the code?
  • Are there instances of copy and paste programming?
  • What is the average number of lines of code per class? Per method?
  • How loosely or tightly coupled is the architecture?
  • What classes would be the most risky to change?

Believe it or not, it is quite possible to answer all of these questions without compiling or manually inspecting your code in time consuming fashion. There are plenty of tools out there that can offer answers to some questions like this that you might have, but in my experience, none can answer as many, in as much depth, and with as much customizability as NDepend.

Why Do This?

So all that being said, is this worth doing? Why should you watch the subsequent modules if you aren’t convinced that this is something that’s even worth learning. It’s a valid concern, but I assure you that it is most definitely worth doing.

  • The later you find an issue, typically, the more expensive it is to fix. Catching a mistake seconds after you make it, as with a typo, is as cheap as it gets. Having QA catch it a few weeks after the fact means that you have to remember what was going on, find it in the debugger, and then figure out how to fix it, which means more time and cost. Fixing an issue that’s blowing up in production costs time and effort, but also business and reputation. So anything that exposes issues earlier saves the business money, and static analysis is all about helping you find issues, or at least potential issues, as early as possible.
  • But beyond just allowing you to catch mistakes earlier, static analysis actually reduces the number of mistakes that happen in the first place. The reason for this is that static analysis helps developers discover mistakes right after making them, which reinforces cause and effect a lot better. The end result? They learn faster not to make the mistakes they’d been making, causing fewer errors overall.
  • Another important benefit is that maintenance of code becomes easier. By alerting you to the presence of “code smells,” static analysis tools are giving you feedback as to which areas of your code are difficult to maintain, brittle, and generally problematic. With this information laid bare and easily accessible, developers naturally learn to avoid writing code that is hard to maintain.
  • Exploratory static analysis turns out to be a pretty good way to learn about a code base as well. Instead of the typical approach of opening the code base in an IDE and poking around or stepping through it, developers can approach the code base instead by saying “show me the most heavily used classes and which classes use them.” Some tools also provide visual representations of the flow of an application and its dependencies, further reducing the learning curve developers face with a large code base.
  • And a final and important benefit is that static analysis improves developers’ skills and makes them better at their craft. Developers don’t just learn to avoid mistakes, as I mentioned in the mistake reduction bullet point, but they also learn which coding practices are generally considered good ideas by the industry at large and which practices are not. The compiler will tell you that things are illegal and warn you that others are probably errors, but static analysis tools often answer the question “is this a good idea.” Over time, developers start to understand subtle nuances of software engineering.

There are a couple of criticisms of static analysis. The main ones are that the tools can be expensive and that they can create a lot of “noise” or “false positives.” The former is a problem for obvious reasons and the latter can have the effect of counteracting the time savings by forcing developers to weed through non-issues in order to find real ones. However, good static analysis tools mitigate the false positives in various ways, an important one being to allow the shutting off of warnings and the customization of what information you receive. NDepend turns out to mitigate both: it is highly customizable and not very expensive.

Reference

The contents of this post were mostly taken from a Pluralsight course I did on static analysis with NDepend. Here is a link to that course. If you’re not a Pluralsight subscriber but are interested in taking a look at the course or at the library in general, send me an email to erik at daedtech and I can give you a 7 day trial subscription.

By

Static Analysis, NDepend, and a Pluralsight Course

I absolutely love statistics. Not statistics as in the school subject — I don’t particularly love that branch of mathematics with its binomial distributions and standard deviations and whatnot. I once remarked to a friend in college that statistics-the-subject seemed like the ‘science’ of taking a guess and then rigorously figuring out how wrong you were. Flippant as that assessment may have been, statistics-the subject has hardly the elegant smoothness of calculus or the relentlessly logical pursuit of discrete math. Not that it isn’t interesting at all — to a math geek like me, it’s all good — but it just isn’t really tops on my list.

But what is fascinating to me is tabulating outcomes and gamification. I love watching various sporting events on television and keep track of odd things. When watching a basketball game, I always the amount of a “run” the teams are on before the announcers think to say something like “Chicago is on a 15-4 run over the last 6:33 this quarter.” I could have told you that. In football, if the quarterback is approaching a fist half passing record, I’m calculating the tally mentally after every play and keeping track. Heck, I regularly watch poker on television not because of the scintillating personalities at the tables but because I just like seeing what cards come out, what hands win, and whether the game is statistically normal or aberrant. This extends all the way back to my childhood when things like my standardized test scores and my class rank were dramatically altered by me learning that someone was keeping score and ranking them.

I’m not sure what it is that drives this personality quirk of mine, but you can imagine what happened some years back when I discovered static analysis and then NDepend. I was hooked. Before I understood what the Henderson Sellers Lack of Cohesion in Methods score was, I knew that I wanted mine to be lower than other people’s. For those of you not familiar, static analysis is a way to examine your code without actually executing it and seeing what happens retroactively. Static analysis, (over) simplified, is an activity that examines your source code and makes educated guesses about how it will behave at runtime and beyond (i.e. maintenance). NDepend is a tool that performs static analysis at a level and with an amount of detail that makes it, in my opinion, the best game in town.

After overcoming an initial pointless gamification impulse, I learned to harness it instead. I read up on every metric under the sun and started to understand what high and low scores correlated with in code bases. In other words, I studied properties of good code bases and bad code bases, as described by these metrics, and started to rely on my own extreme gamification tendencies in order to drive my work toward better code. It wasn’t just a matter of getting in the habit of limiting my methods to the absolute minimum in size or really thinking through the coupling in my code base. I started to learn when optimizing to improve one metric led to a decline in another — I learned lessons about design tradeoffs.

It was this behavior of seeking to prove myself via objective metrics that got me started, but it was the ability to ask and answer lots of questions about my code base that kept me coming back. I think that this is the real difference maker when it comes NDepend, at least for me. I can ask questions, and then I can visualize, chart and track the answer in just about every conceivable way. I have a “Moneyball” approach to code, and NDepend is like my version of the Jonah Hill character in that movie.

Because of my high opinion of this tool and its importance in the lives of developers, I made a Pluralsight course about it. If you have a subscription and have any interest in this subject at all, I invite you to check it out. If you’re not familiar with the subject, I’d say that if your interest in programming breaks toward architecture — if you’re an architect or an aspiring architect — you should also check it out. Static analysis will give you a huge leg up on your competition for architect roles, and my course will provide an introduction for getting started. If you don’t have a Pluralsight subscription, I highly recommend trying one out and/or getting one. This isn’t just a plug for me to sell a course I’ve made, either. I was a Pluralsight subscriber and fan before I ever became an author.

If you get a chance to check it out, I hope you enjoy.