DaedTech

Stories about Software


Introduction to Static Analysis (A Teaser for NDepend)

Rather than the traditional lecture approach of providing an official definition and then discussing the subject in more detail, I’m going to show you what static analysis is and then define it. Take a look at the following code and think for a second about what you see. What’s going to happen when we run this code?

private void SomeMethod()
{
	int x = 1;
	if(x == 1)
		throw new Exception();
}

Well, let’s take a look:

[Screenshot: the resulting unhandled exception]

I bet you saw this coming. In a program that does nothing but set x to 1, and then throw an exception if x is 1, it isn’t hard to figure out that the result of running it will be an unhandled exception. What you just did there was static analysis.

Static analysis comes in many shapes and sizes. When you simply inspect your code and reason about what it will do, you are performing static analysis. When you submit your code to a peer for review, she does the same thing. Like you and your peer, compilers perform static analysis, though theirs is automated rather than manual. They check the code for syntax or linking errors that would guarantee failures, and they also warn about potential problems such as unreachable code or assignment instead of evaluation. Products also exist that check your source code for stylistic guideline conformance and other characteristics rather than worrying about what happens at runtime, and, in managed languages, there are products that analyze your compiled IL or bytecode for certain characteristics. The common thread is that all of these examples of static analysis involve analyzing your code without actually executing it.
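
For instance, the C# compiler performs exactly this kind of reasoning. A contrived method like the one below compiles, but the compiler warns about the unreachable line (warning CS0162) without ever running the code:

private void SomeOtherMethod()
{
	return;

	// The compiler's static analysis flags this at compile time with
	// "warning CS0162: Unreachable code detected"; no execution required.
	Console.WriteLine("This line can never run.");
}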

Analysis vs Reactionary Inspection

People’s interactions with their code tend to gravitate away from analysis. Whether it’s unit tests and TDD, integration tests, or simply running the application, programmers tend to run experiments with their code and then see what happens. This is known as a feedback loop, and programmers use the feedback to guide what they’re going to do next. While some thought is obviously given to what impact changes to the code will have, the natural tendency is to adopt an “I’ll believe it when I see it” mentality.

private void SomeMethod()
{
	var randomGenerator = new Random();
	int x = randomGenerator.Next(1, 10);
	Console.WriteLine(x);
}

We tend to ask “what happened?” and we tend to orient our code in ways that give us answers to that question. In this code sample, if we want to know what happened, we execute the program and see what it prints. This is the opposite of static analysis: nobody is trying to reason about what will happen ahead of time; rather, the goal is to run it, see what the outcome is, and then react as needed to continue.

Reactionary inspection comes in a variety of forms, such as debugging, examining log files, observing the behavior of a GUI, etc.

Static vs Dynamic Analysis

The conclusions and decisions that arise from the reactionary inspection question of “what happened?” are known as dynamic analysis. Dynamic analysis is, more formally, inspection of the behavior of a running system. It is an analysis of the program’s runtime characteristics: how much memory it consumes, how reliably it runs, how much data it pulls from the database, and generally whether it correctly satisfies the requirements or not.

Assuming that static analysis of a system is taking place at all, dynamic analysis takes over where static analysis is not sufficient. This includes situations where unpredictable externalities such as user inputs or hardware interrupts are involved. It also includes situations where static analysis is simply not computationally feasible, such as in any system of real complexity.

As a result, the interplay between static analysis and dynamic analysis tends to be that static analysis is a first line of defense designed to catch obvious problems early. Beyond that, it also functions as a canary in the coal mine to detect so-called “code smells.” A code smell is a piece of code that is often, but not necessarily, indicative of a problem. Static analysis can thus be used as an early detection system for obvious or likely problems, and dynamic analysis has to be sufficient for the rest.
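
For example, an empty catch block is a classic code smell: it might be intentional, but more often it silently swallows an error that somebody should know about. A contrived illustration:

private void SaveSettings(string settings)
{
	try
	{
		File.WriteAllText("settings.txt", settings);
	}
	catch (Exception)
	{
		// Empty catch block: the code smell. Failures are silently swallowed,
		// which is occasionally intentional but usually a latent bug.
	}
}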


Source Code Parsing vs. Compile-Time Analysis

As I alluded to above, not all static analysis is created equal. There are types of static analysis that rely on simple inspection of the source code. These include manual source code analysis techniques such as reasoning about your own code or doing code reviews. They also include tools such as StyleCop that simply parse the source code and make straightforward assertions about it to provide feedback. For instance, such a tool might read a code file, notice that the word following “class” is not capitalized, and return a warning that class names should be capitalized.
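
As a toy sketch of that sort of source-level check (this is purely illustrative, not how StyleCop actually works), you could scan the raw text for class declarations and flag lowercase names:

// Illustrative only: a naive, regex-based source check for uncapitalized class names.
// Assumes System.Text.RegularExpressions and System.Collections.Generic are in scope.
public static IEnumerable<string> FindUncapitalizedClassNames(string sourceText)
{
	foreach (Match match in Regex.Matches(sourceText, @"\bclass\s+([A-Za-z_]\w*)"))
	{
		string name = match.Groups[1].Value;
		if (char.IsLower(name[0]))
			yield return "Warning: class name '" + name + "' should be capitalized.";
	}
}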

This stands in contrast to what I’ll call compile-time analysis. The difference is that this form of analysis requires an encyclopedic understanding of how the compiler behaves, or else the ability to analyze the compiled product. This set of options obviously includes the compiler itself, which fails on show-stopper problems and generates helpful warnings as well. It also includes enhanced rules engines that understand the rules of the compiler and can use them to infer a larger set of warnings and potential problems than those that come out of the box with the compiler. Beyond that is a set of IDE plugins that perform asynchronous compilation and offer real-time feedback about possible problems. Examples of this in the .NET world include ReSharper and CodeRush. And finally, there are analysis tools that look at the compiled assembly outputs and give feedback based on them. NDepend is an example of this, though it includes other approaches mentioned here as well.

The important compare-contrast point to understand here is that source analysis is conceptually simpler and generally faster, while compile-time analysis is more resource-intensive and generally more thorough.

The Types of Static Analysis

So far I’ve compared static analysis to dynamic and ex post facto analysis and I’ve compared mechanisms for how static analysis is conducted. Let’s now take a look at some different kinds of static analysis from the perspective of their goals. This list is not necessarily exhaustive, but rather a general categorization of the different types of static analysis with which I’ve worked.

  • Style checking is examining source code to see if it conforms to cosmetic code standards
  • Best Practices checking is examining the code to see if it conforms to commonly accepted coding practices. This might include things like not using goto statements or not having empty catch blocks
  • Contract programming is the enforcement of preconditions, invariants and postconditions (see the sketch after this list)
  • Issue/Bug alert is static analysis designed to detect likely mistakes or error conditions
  • Verification is an attempt to prove that the program is behaving according to specifications
  • Fact finding is analysis that lets you retrieve statistical information about your application’s code and architecture
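
To make the contract programming item concrete, here is a small sketch using .NET’s Code Contracts (System.Diagnostics.Contracts); the Withdraw method and _balance field are hypothetical examples, not anything from a real code base. A static checker can then attempt to prove that no caller ever violates these conditions:

public int Withdraw(int amount)
{
	// Precondition: callers must pass a positive amount no larger than the balance.
	Contract.Requires(amount > 0 && amount <= _balance);

	// Postcondition: the method promises the returned balance is never negative.
	Contract.Ensures(Contract.Result<int>() >= 0);

	_balance -= amount;
	return _balance;
}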

There are many tools out there that provide functionality for one or more of these, but NDepend provides perhaps the most comprehensive support across the board for different static analysis goals of any .NET tool out there. You will thus get to see in-depth examples of many of these, particularly the fact finding and issue alerting types of analysis.

A Quick Overview of Some Example Metrics

Up to this point, I’ve talked a lot in generalities, so let’s look at some actual examples of things that you might learn from static analysis about your code base. The actual questions you could ask and answer are pretty much endless, so this is intended just to give you a sample of what you can know.

  • Is every class and method in the code base in Pascal case?
  • Are there any potential null dereferences of parameters in the code?
  • Are there instances of copy and paste programming?
  • What is the average number of lines of code per class? Per method?
  • How loosely or tightly coupled is the architecture?
  • What classes would be the most risky to change?

Believe it or not, it is quite possible to answer all of these questions without compiling your code or manually inspecting it in time-consuming fashion. There are plenty of tools that can answer some of these questions, but in my experience, none can answer as many, in as much depth, and with as much customizability as NDepend.
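
In NDepend, for example, this kind of fact finding is expressed as CQLinq queries over your code model. Treat the exact syntax below as approximate (a sketch from memory rather than a verified rule), but a query for unusually large classes looks roughly like this:

// CQLinq sketch (syntax approximate): list the largest classes by lines of code.
from t in JustMyCode.Types
where t.NbLinesOfCode > 200
orderby t.NbLinesOfCode descending
select new { t, t.NbLinesOfCode }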

Why Do This?

So, all that being said, is this worth doing? Why should you bother with the subsequent modules if you aren’t convinced that this is even worth learning? It’s a valid concern, but I assure you that it is most definitely worth doing.

  • The later you find an issue, typically, the more expensive it is to fix. Catching a mistake seconds after you make it, as with a typo, is as cheap as it gets. Having QA catch it a few weeks after the fact means that you have to remember what was going on, find it in the debugger, and then figure out how to fix it, which means more time and cost. Fixing an issue that’s blowing up in production costs time and effort, but also business and reputation. So anything that exposes issues earlier saves the business money, and static analysis is all about helping you find issues, or at least potential issues, as early as possible.
  • But beyond just allowing you to catch mistakes earlier, static analysis actually reduces the number of mistakes that happen in the first place. The reason for this is that static analysis helps developers discover mistakes right after making them, which reinforces cause and effect a lot better. The end result? They learn faster not to make the mistakes they’d been making, causing fewer errors overall.
  • Another important benefit is that maintenance of code becomes easier. By alerting you to the presence of “code smells,” static analysis tools are giving you feedback as to which areas of your code are difficult to maintain, brittle, and generally problematic. With this information laid bare and easily accessible, developers naturally learn to avoid writing code that is hard to maintain.
  • Exploratory static analysis turns out to be a pretty good way to learn about a code base as well. Instead of the typical approach of opening the code base in an IDE and poking around or stepping through it, developers can approach the code base instead by saying “show me the most heavily used classes and which classes use them.” Some tools also provide visual representations of the flow of an application and its dependencies, further reducing the learning curve developers face with a large code base.
  • And a final and important benefit is that static analysis improves developers’ skills and makes them better at their craft. Developers don’t just learn to avoid mistakes, as I mentioned in the mistake reduction bullet point, but they also learn which coding practices are generally considered good ideas by the industry at large and which practices are not. The compiler will tell you that some things are illegal and warn you that others are probably errors, but static analysis tools often answer the question “is this a good idea?” Over time, developers start to understand the subtle nuances of software engineering.

There are a couple of criticisms of static analysis. The main ones are that the tools can be expensive and that they can create a lot of “noise” or “false positives.” The former is a problem for obvious reasons, and the latter can counteract the time savings by forcing developers to weed through non-issues in order to find real ones. However, good static analysis tools mitigate the false positives in various ways, an important one being the ability to turn off warnings and customize what information you receive. NDepend turns out to mitigate both criticisms: it is highly customizable and not very expensive.

Reference

The contents of this post were mostly taken from a Pluralsight course I did on static analysis with NDepend. Here is a link to that course. If you’re not a Pluralsight subscriber but are interested in taking a look at the course or at the library in general, send me an email to erik at daedtech and I can give you a 7 day trial subscription.


The Larger Significance of the Royal TDD Rumble

Let’s Get Ready to Rumble

Tomorrow (this) morning at 11 AM Eastern time, the world is going to tune in on Pay Per View to the greatest rumble this side of the Thrilla in Manila that pitted Muhammad Ali against Joe Frazier. Nearly 40 years after the original, the 2014 incarnation will pit the creator of Ruby on Rails against the creator of Extreme Programming and the TDD approach, as well as an internationally acclaimed author and speaker in and about the field of software engineering. The venue will be Google Hangouts. Get your tickets before the internet sells out.

The reason for the brouhaha? A post entitled, “TDD is dead. Long live testing.” From there, a long series of subsequent posts, debates, arguments, etc ensued across the development world. So, after a couple of weeks of this, the battle was set between the TDD antagonist and a couple of proponents. Or something like this. (Might be that this isn’t a death match, since Martin Fowler explained, “We hope to be witty and entertaining, but we are more interested in exploring these topics in a way that expands our understanding of each others’ ideas,” but what’s the fun in that?)

Boxers

Of late and I guess in general, I’ve written a lot of posts about unit testing, automated verification and test driven development, so suffice it to say that I’m a fan of this practice. So, one might think that I’d watch this hangout rooting for Fowler and Beck to win, the way I root for the Chicago Bears on fall Sundays. But to be honest, it doesn’t really bother me that someone, even a well-respected and widely known industry figure, doesn’t do things the way I do them. I don’t need or even want everyone to agree with me if for no other reason than this state of affairs would present me with new ideas to consider. If DHH doesn’t find TDD to be beneficial, I’m sort of curious as to why, and I might disagree, but it’s no skin off my nose.

Might Doesn’t Make Right

The tone of this post started off breezy and satirical, but I’d like to get a little more serious because I think there’s a lot of effort, money and happiness at stake. There are two competing attitudes at play in the hubbub surrounding this conversation and pretty much any tendency toward disagreement: “my way worked for me” and “my way is the right way.” I try hard — so very hard — to be someone who says and means the first thing. It is my sincere hope that this attitude permeates any blog posts I make, content I publish, and talks that I give — that they all have an implied “this is based on my experience, YMMV.” I might not always succeed, but that’s my goal. I take this with me at work as well; even with people who report to me, when I’m reviewing code I don’t tell them what to do, but instead offer what I would do in their situation. I think this is critical with knowledge workers. It becomes a matter of persuading rather than forcing, which, in turn, becomes a matter of needing to earn respect rather than “might makes right.”

The reason, I think, that the Expert Beginner series resonated with so many people is because the Expert Beginner is “might makes right” incarnate, and it’s something that has invariably squashed your motivation and made you hate your life at some point during your career. Expert Beginners say “my way is the right way” and even “my way is the only way.” This is a characteristic of micromanagement and lack of vision that reduces knowledge workers to brainless cogs, and therein lies the effort, money, and happiness to which I referred.

I’ve left projects or even jobs to avoid “might makes right” Expert Beginnerism, and no doubt so have you. This is lost opportunity and value. Poof. Gone. 2 weeks notice, “knowledge transfer,” on-boarding, and starting all over. Running in place for organizations and projects.

Now, all Expert Beginners tend to think, “my way is the right way,” but not all people who think this way are Expert Beginners. In fact, some are legitimate experts whose way probably is a great way, and maybe even “the right way.” It’s human nature, I believe, to think this way. People form opinions and they want others to agree with those opinions. If we decide that a gluten free diet is really the way toward better health, not only do we expect others to be interested in our newfound wisdom and to agree with it, but we’re even offended when people don’t. I’ve been bored to tears watching this phenomenon take place over and over again in my Facebook feed.

As humans, we seem naturally to want our advice to be heeded, our opinions validated and our approaches mimicked. In my travels, I started doing TDD, and I found that it helped me tremendously and made me more productive — so much so that I post about it, create courses on it, and teach it to developers on my teams. So, if someone comments on a post about it saying, “that’s just a bunch of hippie crap,” my natural inclination is to be offended and to feel combative in response (and indeed, phrased that way, confrontation would appear to be the goal of such a comment, no doubt coming from someone who felt that my posts were an affront to the approaches he’d found to be helpful).

But I try to fight this as much as I can, and listen more to other people and see if I can understand why they do what they do. The irony is, this approach, counter to human nature, is actually in my best interests. I learn far more by listening to what others do than evangelizing for what I do. It’s hard to go this route, but likely beneficial.

Now, I’m not attempting a relativistic argument here to say that if someone’s experience is that they’ve “enjoyed success” by shipping non-compiling code, that’s just as valid as someone who delivers reliable software. It’s not that everything is just a matter of feelings and opinions. Approaches and outcomes can and should be empirically measured and validated so that objective, collective progress can be made. Approach X may indeed be better than approach Y, measured by some metric, but those in camp X should be persuading and demonstrating, rather than shouting and flaming.

I doubt the software industry is unique, but it’s the one with which I’m familiar. And it’s frankly exhausting at times to see how much time people spend getting worked up and defensive because someone doesn’t share their love of some framework, approach, or technology. It becomes heated; it becomes personal. You’re an unprofessional hack because you use programming language X. You’re a sellout because you buy tools from company Y. My goodness people, settle down. Honest debates and exchanges accomplish a lot more for everyone. The esteemed gentlemen having the hangout tomorrow to discuss the merits of TDD seem to be favoring the “cooler heads” approach and it’d be nice for everyone to follow suit.

I think test driven development is the best approach to writing software to which I’ve been exposed (of course, or else I wouldn’t do it). I think that if you have an open mind and are willing to try it, I can persuade you that this is the case. But maybe I can’t. And maybe, there’s some as-yet undiscovered approach that someone will show me and I’ll be sold on that instead. Whatever the case may be, I’d like to see these types of exchanges favor attempts at persuasion as opposed to shouting, heated personal exchanges, or, worst of all, “might makes right” types of fiats. And, I think at the end of the day, we all need to come to grips with the knowledge that not everyone out there is going to share our opinions on the best way to accomplish a task. And hey, if your way is really better, it sure looks like you have a competitive advantage now, doesn’t it?


Chess TDD 5: Bounded Collections of Moves

Really no housekeeping or frivolous notes for this entry.  It appears as though I’ve kind of settled into a groove with the production, IDE settings, etc.  So from here on in, it’s just a matter of me coding in 15-20 minute clips and explaining myself, notwithstanding any additional feedback, suggestions or questions.

Here’s what I accomplish in this clip:

  • Changed all piece GetMovesFrom methods to use the BoardCoordinate type.
  • Got rid of the stupid Rook implementation of GetMovesFrom().
  • Made a design decision to have GetMovesFrom() take a “board size” parameter.
  • Rook.GetMovesFrom() is now correct for arbitrary board sizes.
  • Updated Rook.GetMovesFrom() to use 1-indexing instead of 0-indexing to accommodate the problem space.
  • Removed redundant instantiation logic in RookTest.
  • Got rid of redundant looping logic in Rook.GetMovesFrom().

Here are some lessons to take away:

  • Sometimes you create a failing test and getting it to pass leads you to changing production code which, in turn, leads you back into test code.  That’s okay.
  • Sometimes you’re going to make design decisions that aren’t perfect, but that you feel constitute improvements in order to keep going.  Embrace that.  Your tests will ensure that it’s easy to change your design later, when you understand the problem better.  Just focus on constant improvements and don’t worry about perfection.
  • “Simplest to get tests passing” is a subjective heuristic to keep you moving.  If you feel comfortable writing a loop instead of a single line or something because that seems simplest to you, you have license to do that…
  • But, as happened to me, getting too clever all in one shot can lead to extended debug times that cause flow interruptions.
  • Duplication in tests is bad, even just a little.
  • It can sometimes be helpful to create constants or readonly properties for common test inputs and descriptive names.  This eliminates duplication while promoting readability of tests.


Encapsulation vs Inversion of Control

This is a post that I’ve had in my drafts folder for nearly two years. Well, I should say I’ve had the title and a few haphazard notes in my drafts folder for nearly two years; I’m writing the actual post right now. The reason I’ve had it sitting around for so long is twofold: (1) it’s sort of a subjective, tricky topic and (2) I’ve struggled to find a concrete stand to take. But, I think it’s relatively important, so I’ve always vowed to circle back to it and, well, here we are.

The draft was first conceived when I’d given a presentation about testability and inversion of control — specifically, using dependency injection via constructors and setters to achieve these ends. I talked, among other things, about the Open/Closed Principle and how it allows modifications to system behavior that favor adding code over editing it. The idea here is that we can achieve new functionality with a minimum of violence to the code base and in a way for which it is easy to write unit tests. Everyone wins, right?

Well, not everyone was drinking the Kool-Aid. I fielded a question about encapsulation that, at the time, I hadn’t prepared to answer. “Doesn’t this completely violate encapsulation?” I was a little poleaxed, and sputtered out an answer off the cuff, saying basically, “well, not completely….” I mean, if you’ve written a class that takes ILogger in its constructor and uses it for logging, I control the logger implementation but you control when and how it is used. So, you encapsulate the logger’s usage but not its implementation and this stands in contrast to what would happen if you instantiated your own logger or implemented it yourself — you would encapsulate everything and nothing would be up to me as a client of your code. Certainly, you have more encapsulation. So, I finished: “…not completely…. but who cares?!?” And that was the end of the discussion since we were out of time anyway.
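
To put that exchange in code, here is a minimal sketch of the two versions being contrasted; ILogger, FileLogger, and the OrderProcessor classes are just hypothetical placeholders:

// Fully encapsulated: the class author decides which logger is used and how.
public class OrderProcessor
{
	private readonly ILogger _logger = new FileLogger();

	public void Process()
	{
		_logger.Log("Processing order...");
	}
}

// Dependency injection: the caller now controls the logger implementation,
// while the class still controls when and how it is used.
public class InjectedOrderProcessor
{
	private readonly ILogger _logger;

	public InjectedOrderProcessor(ILogger logger)
	{
		_logger = logger;
	}

	public void Process()
	{
		_logger.Log("Processing order...");
	}
}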


I was never really satisfied with that answer but, as Creedence Clearwater Revival says, “time and tears went by, and I collected dust.” When I thought back to that conversation, I would think to myself that encapsulation was passé in the same way that deep inheritance hierarchies were passé. I mean, sure, I learned that encapsulation was one of the four cornerstone principles of OOP, but so is inheritance, and that’s kind of going away with “favor composition over inheritance.” So, hey, “favor dependency injection over encapsulation.” Right? Still, I didn’t find this entirely satisfying — just good enough not to really occupy much of a place in my mind.

But then I remembered a bit of a brouhaha last year over a Stack Overflow question. The question itself wasn’t especially remarkable (and was relatively quickly closed), but compiler author and programming legend Eric Lippert dropped by to say “DI is basically a bad idea.” To elaborate, he said:

There is no killer argument for DI because DI is basically a bad idea. The idea of DI is that you take what ought to be implementation details of a class and then allow the user of the class to determine those implementation details. This means that the author of the class no longer has control over the correctness or performance or reliability of the class; that control is put into the hands of the caller, who does not know enough about the internal implementation details of the class to make a good choice.

I was floored. Here we have one of the key authors of the C# compiler saying that the “D” in the SOLID principles was a “bad idea.” I would have dismissed it as blasphemy if (1) I were the sort to adopt approaches based on dogma and (2) he hadn’t helped author at least 4 more compilers than I have. And, while I didn’t suddenly rip the IoC containers out of my projects and instantiate everything inside constructors, I did revisit this topic in terms of my thoughts.

Maybe encapsulation, in the information-hiding sense, isn’t so passé. And maybe DI isn’t a magic bullet. But why not? What’s wrong with the author of a class ceding control over some aspects of its behavior by allowing collaboration? And isn’t any method parameter technically a form of DI, if we’re going to be pedantic about it?

The more I thought about it, the more I started to see competing and interesting use cases. Or, I should say, the more I started to think what class authors are telling their collaborators by using each of these techniques:

Encapsulation: “Don’t worry — I got this.”
Dependency Injection: “Don’t worry — if this doesn’t work, you can always change it.”

So, let’s say that you’re writing a Mars Rover or maybe a compiler or something. The attitude that you’re going to bring to that project is one in which correctness, performance and reliability are all incredibly important because you have to get it right and there’s little room for error. As such, you’re likely going to adopt an implementation preference of “I’m going to make absolutely sure that nothing can go wrong with my code.”

But let’s say you’re writing a line-of-business app for Initrode Inc and the main project stakeholder is fickle, scatterbrained, and indecisive. Then you’re going to have an attitude in which ease and rapidity of system changes are incredibly important because you have to change the system fast. As such, you’re likely to adopt an implementation preference of “I’m going to make absolutely sure that changing this without blowing everything up is easy.”

There’s bound to be somewhat of an inverse relationship between flexibility and correctness. As a classic example, a common criticism of Apple’s “walled garden” approach was that it was so rigid, while common praise for the same was how well it worked. So I guess my takeaway from this is that Dependency Injection and, more broadly, Inversion of Control, is not automatically desirable, but I also don’t think I can get to Eric’s take that it’s “basically a bad idea,” either. It’s simply an exchange of “more likely to be correct now” for “more likely to be correct later.” And in the generally agile world in which I live and with the kind of applications that I write, “later” tends to give more value.

Uncle Bob Martin once said, I believe in his Clean Coders video series, that (paraphrased) the second most important characteristic of good software is that it meet the customer requirements. The most important characteristic is that it be easy to change. Reason being, if a system is correct today but rigid, it will be wrong tomorrow when the customer wants changes. If the system is wrong today but flexible, it’s easy to make it right tomorrow. It may not be perfect, but I like DI because I need to be right tomorrow.


Merging Done Right: Semantic Merge

There are few things in software development as surprisingly political as merging your code when there are conflicts.

A Tale of Merge Politics

Your first reaction to this is probably to think that I’m crazy, but seriously, think about what happens when your diff tool/source control combo tells you there’s a conflict. You peer at the conflict for a moment and then think, “alright, who did this?”

Was it Jim? Well, Jim’s kind of annoying and pretty new, so it’s probably fine just to blow his changes away and send him an email telling him to get the latest code and re-do his stuff.

Oh, wait, no, it looks like it was Janet. Uh oh.

She’s pretty sharp and a Principal so you’ll probably be both wrong and in trouble if you mess this up — better just revert your changes, get hers and rework your stuff.

Oh, on third look, it appears that it was Steve, and since Steve is your buddy, you’ll just go grab him and work through the conflict together.

Notice how none of this has anything to do with what the code should actually look like?

Standard Diff Tools are Bad at Telling You What Happened

Now, I’ll grant that this isn’t always the case; there are times when you can figure out what should be in the master copy of the source control.  But it’s pretty likely that you’ve sat staring at merge conflicts and thinking about people and not code.

Why is that?

Well, frankly because merge tools aren’t very good at telling you the story of what’s happened, and that’s why you need a human to come tell you the story. But which human, which story, and how interested you are in that interaction are all squarely the stuff of group dynamics and internal politics. Hopefully you get on well with your team and they’re happy to tell you the story.

But what if your tools could tell you that story?

What if, instead of saying, “Jim made text different on lines 100, 124, 135-198, 220, 222-228,” your tooling said, “Jim moved a method and deleted a few references to a field, whereas you edited the method that he moved”?

Holy crap! You wouldn’t need to get Jim at all because you could just say, “oh, okay, I’ll do a merge where we do all of his stuff and then make my changes to that method he moved.”

Introducing Semantic Merge

I’ve been poking around with Roslyn and reading about it lately, and this led me to Semantic Merge. This is a diff tool that uses Roslyn, which means that it parses your code into a syntax tree and actually reasons about it as code, rather than as text (or text with heuristics).
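
To give a flavor of what “reasoning about it as code” means, here is a minimal Roslyn sketch; it is purely illustrative and not how Semantic Merge itself is implemented. It parses a file into a syntax tree and enumerates method declarations instead of lines of text:

// Requires the Microsoft.CodeAnalysis.CSharp NuGet package (plus using directives for
// Microsoft.CodeAnalysis.CSharp, Microsoft.CodeAnalysis.CSharp.Syntax, System.Linq, System.IO).
var tree = CSharpSyntaxTree.ParseText(File.ReadAllText("SomeClass.cs"));

// Walk the tree and pick out the method declarations it contains.
var methods = tree.GetRoot()
                  .DescendantNodes()
                  .OfType<MethodDeclarationSyntax>();

foreach (var method in methods)
	Console.WriteLine(method.Identifier.Text);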

As such, it’s no mirage or trickery that it can say things like “oh, Jim moved the method but left it intact whereas you made some changes to it.” It makes perfect sense that it can do this.

Let’s take a look at this in action. I’m only showing you the tiniest hint of what’s possible, but I’d like to pick out a very simple example of where a traditional merge tool kind of chokes and Semantic Merge shines. It is, after all, a pay-to-play (although pretty affordable) tool, so a compelling case should be made.

The Old Way

Before you see how cool Semantic Merge is, let’s take a look at a typical diff scenario. I’ll do this using the Visual Studio compare tool that I use on a day to day basis.

And I’m calling this “the old way,” in spite of the fact that I fell in love with this as compared to the way it used to be in VS2010 and earlier. It’s actually pretty nice as far as diff tools go. I’m going to take a class and make a series of changes to it.

Here’s the before:

public class SomeClass
{
    private int _aNumber;
    private string _aWord;

    /// <summary>
    /// Initializes a new instance of the SomeClass class.
    /// </summary>
    public SomeClass()
    {
        _aNumber = 123;
        _aWord = "Hello!";
    }

    public void PrintNumbers()
    {
        for (int index = 0; index < _aNumber; index++)
            Console.WriteLine(index);
    }

    public void PrintEvenNumbers()
    {
        for (int index = 0; index < _aNumber; index += 2)
            Console.WriteLine(index);
    }

    public void ChangeNumber(int number)
    {
        if (number < 0)
            throw new ArgumentException("number");

        _aNumber = number;
    }

    public void PrintTheWord()
    {
        Console.WriteLine(_aWord);
    }

    public void ChangeTheWord(string newWord)
    {
        _aWord = newWord;
    }
}

Now, what I’m going to do is swap the positions of PrintNumbers() and ChangeTheWord(), add some error checking to ChangeTheWord() and delete the comments above the constructor. Here’s the after:

public class SomeClass
{
    private int _aNumber;
    private string _aWord;

    public SomeClass()
    {
        _aNumber = 123;
        _aWord = "Hello!";
    }

    public void ChangeTheWord(string newWord)
    {
        if(string.IsNullOrEmpty(newWord))
            throw new ArgumentException("newWord");
        _aWord = newWord;
    }
        
    public void PrintEvenNumbers()
    {
        for (int index = 0; index < _aNumber; index += 2)
            Console.WriteLine(index);
    }

    public void ChangeNumber(int number)
    {
        if (number < 0)
            throw new ArgumentException("number");

        _aNumber = number;
    }

    public void PrintTheWord()
    {
        Console.WriteLine(_aWord);
    }

    public void PrintNumbers()
    {
        for (int index = 0; index < _aNumber; index++)
            Console.WriteLine(index);
    }
}

If I now want to compare these two files using the diff tool, here’s what I’m looking at:

[Screenshot: the standard Visual Studio diff view of the two files, compared side by side]

There’s a Better Way to Handle This

This is the point where I groan and mutter to myself because it annoys me that the tool is comparing the methods side by side as if I had renamed one and completely altered its contents.

I’m sure you can empathize. You’re muttering to yourself too and what you’re saying is, “you idiot tool, it’s obviously a completely different method.”

Well, here’s the same thing as summarized by Semantic Merge:

[Screenshot: Semantic Merge’s summary of the same changes]

It shows me that there are two types of differences here: moves and changes. I’ve moved the two methods PrintNumbers() and ChangeTheWord() and I’ve changed the constructor of the class (removing comments) and the ChangeTheWord() method.

Pretty awesome, huh? Rather than a bunch of screenshots to show you the rest, however, I’ll show you this quick clip of me playing around with it.

Some very cool stuff in there. First of all, I started where the screenshot left off — with a nice, succinct summary of what’s changed.

From there you can see that it’s easy to flip back and forth between the methods, even when moved, to see how they’re different. You can view each version of the source as well as a quick diff only of the relevant, apples-to-apples, changes.

It’s also nice, in general, that you can observe the changes according to what kind of change they are (modification, move, etc). And finally, at the end, I played a bit with the more traditional diff view that you’re used to — side by side text comparison.

But even with that, helpful UI context shows you that things have moved rather than the screenshot of the VS merge tool above where it looks like you’ve just butchered two different methods.

This is only scratching the surface of Semantic Merge. There are more features I haven’t covered at all, including a killer feature that helps auto-resolve conflicts by taking the base version of the code as well as server and local in order to figure out if there are changes only really made by one person.

You can check more of it out in this extended video about the tool. As I’ve said, it’s a pay tool, but the cost isn’t much and there’s a 30 day trial, so I’d definitely advise taking it for a spin if you work in a group and find yourself doing any merging at all.
