DaedTech

Stories about Software


Inverting Control

I imagine that inversion of control is a relatively popular concept to talk or blog about, particularly in object-oriented circles, so rather than do a garden-variety explanation of the term followed by a pitch for using it, I thought I’d take a slightly different approach. I’m going to talk about the reason that there is often resistance to the concept of inversion of control, why that resistance is understandable, and why that understandable resistance is misguided. This involves a foray into the history of programming and general concepts of human reasoning.

In the Beginning

Nobody starts out writing object-oriented programs, nor do they start out understanding concepts like lambda expressions, function evaluation, etc. People start out, almost invariably, with the equivalent of batch scripts. That is, they learn how to automate small procedures and they build from there. This is a natural and understandable progression in terms of individual learning and also in terms of our journey as a programming collective. The earliest programs were sequences of instructions. They had abbreviated syntax, control structures, and goto statements that automated a task from beginning to end.

An example is something like the following (in pseudo code):

start:
    file = "numbers.txt"
    if not file exists
        touch "numbers.txt"
        goto exit
    x = ""
    open "numbers.txt" > x
exit:

The logic is simple and easy enough to follow. Look for a file called numbers.txt and create it if it doesn’t exist. Otherwise, read it. Now, as you want to add things to this program, it gets longer and probably harder to follow. There may be some notion of looping, handled with a loop construct, or, if sufficiently primitive in terms of time or level of code (i.e. if we’re at the chip level), with goto statements to process the contents of the file.

Procedural Code to the Rescue

As things evolved, the notion of subroutines was introduced to help alleviate complexity and make programs more readable, and the concept of procedural or structured programming was born. Dijkstra famously argued, in "Go To Statement Considered Harmful," that the goto statement should be abandoned in favor of structured control flow. Structured/procedural programming involves creating abstractions of commonly used routines so that they can be reused and so that the program is more readable and less error prone.

//functions elided
int main(int argc, char* argv[])
{
    char* file;
    file = get_filename();
    if(!file_exists(file))
    {
        create_file(file);
        return 0;
    }
    read_file(file);
    return 0;
}

First off, pardon my C syntax. I did not compile this and haven’t written actual C code in a while. But, you get the idea. Here we have an implementation where details are mercifully abstracted away into functions. We want the name of the file, so we call “get_filename()” and let someone else handle it. We want to know if the file exists, so we abstract that as well. Same goes for creating or reading the file. The main routine is much more legible, and, better yet, other people can also call these methods, so you don’t need to copy and paste code or fix errors in multiple places if there are problems.

Procedural programming is powerful, and it can be used to produce very clean, readable code. Many of those who use it do just that. (Though many also don't, and pride themselves instead on packing the most conceptual C functionality into a single line of hidden pointer arithmetic, ternary operators, and assignments in control structures, but I digress.) And because of its power and long history of success, it imprinted itself very clearly on the minds of people who used it for years and got used to its idioms and nuances.

Let’s think about how a procedural programmer tends to reason about code. That is, there is some main function, and that main function calls a sub-routine/function to handle some of its workload. It delegates to a more abstract function to handle things. Unlike the first example, as procedural code grows, it doesn’t get longer and harder to read. Instead, it grows into more files and libraries. The functions called by main are given their own functions to refer to, and the structure of the program grows like a tree, rather than a beanstalk to the heavens. Main is the root, and it branches out to the eventual leaves which are library functions.

Another way to think of this is command and control. Main is like the White House. It makes all of the big decisions and it delegates to the Cabinet for smaller but still important things. Each Cabinet member has his or her own department and makes all of the big decisions, delegating to underlings the details of smaller picture operations. This continues all the way down the chain of government until someone somewhere is telling someone how much bleach to use when cleaning the DMV floor. The President doesn’t care about this. It’s an inconsequential detail that should be handled without his intervention. Such is the structure of the procedural program as well. It mirrors life in that it mirrors a particular metaphor for accomplishing tasks – the command and control method of delegation.

The reason I go into all of this detail is that I want you to get inside the mind of someone who may be resistant to the concept of inversion of control. If you’re used to procedural programming and the command and control metaphor, then you’re probably nodding along with me. If you’re a dyed-in-the-wool OO programmer who uses Spring framework or some other IOC container, you’re probably getting ready to shout that your code isn’t the US government. That’s fine. We’ll get to that. But for now, think about your procedural-accustomed peer and realize that what you’re suggesting to him or her seems like the equivalent of asking the President of the US to run out to buy bleach for the guy at the DMV to clean the floor. It’s preposterous without the proper framing.

A New Way of Thinking

So, what is the proper framing? Well, after procedural code was well-established, the idea of object-oriented programming came into existence. On its face, this was a weird experiment, and there was no shortage of people that saw this as a passing fad. OOP flew completely in the face of established practice and, frankly, common sense. Instead of having procedures that delegated, we would now have objects that encapsulated properties and functionality. It sounds completely reasonable now, but this was a relatively revolutionary concept.

In the early stages of OOP, people did some things that were silly in retrospect. People got object-happy and in a rush to the latest, greatest thing, created objects for everything. No doubt there were for loop and while loop objects and someone had something like Conditional.If(x == 5).Then().Return(x); On the opposite end of the spectrum, there were some people who had been writing great software with procedural code for 20 years and they weren’t about to stop now, thank-you-very-much. And C++, the most popular early OOP language, put out places at the table for both camps. C++ allowed C programmers to write C and compile it with the C++ compiler, while it allowed OOP fanatics to pursue their weird idioms before eventually settling down into a good rhythm.  The publication of books about patterns and anti-patterns helped OOP fans continue their progress.

As these groups coexisted, the early-adopters blazed a trail and the late-adopters grudgingly adopted when they had to. The problem was in how they went about adopting. To a lot of people in this category, a “class” was roughly equivalent to a procedural library file with a group of associated functions. And a class can certainly serve this purpose, despite the point of OOP being completely missed. Anybody who has seen classes with names like “FileUtils” or “FinancialConversions” knows what I’m talking about. These are the calling cards of procedural programmers ordered to write in an object-oriented language without real introduction to object-oriented concepts.

Mixed Metaphors

So what? Well, the end-game here is that this OOP/procedural hybrid is firmly entrenched in countless legacy applications and even ones being developed today by people who learned procedural thinking and never really had that “Aha!” moment with object-oriented thinking. It isn’t simply that classes in these applications function as repositories for loosely related methods, but that the entire structure of the program follows the command and control metaphor in an object-oriented world.

And, what is the object-oriented world? I personally think a good metaphor for it is Legos. If you’re a kid with a bunch of Lego kits and parts at your disposal and you want a scene of a bunch of pirate ships or space ships doing battle, you build all of the little components for a ship first. Then, with the little components assembled, you snap them together into larger and larger components until you’ve built your first ship. Sometimes, you prototype a component and use it in multiple places. You then repeat this as necessary and assemble the ships into some grand imitation of adventure on the high seas. This is the fundamental nature of object-oriented programming. There is no concept of delegation in the command and control sense — quite the opposite — the details are assembled first into ever-larger pieces. An equally suitable and less childlike metaphor may be the construction of a building.

As a procedural "President," you would be ill at ease in this world. You stand there, demanding that non-existent ships assemble themselves by having their hulls assemble themselves by having their internal pieces assemble themselves. You're yelling a lot, but nothing is happening, and there is, in fact, nobody and nothing there to yell at.

Of course, the procedural folks aren't actually that daft, so what they do instead is force the Lego world to be a command and control world. They lay out the ship's design and architecture, but they also add functionality to the ship whereby it constructs its own internals. That is to say, they start with small stuff like our object-oriented folk, but the small stuff is all designed to give the big things the ability to create the small things themselves. They do this at every level, giving every object extra responsibility and self-awareness so that at the end, they can have a neat, clean soapbox from which to say:

int main(int argc, char* argv[])
{
    Ship ship1 = new Ship();
    ship1.Build();
    Ship ship2 = new Ship();
    ship2.Build();
    //...
}

No fuss, no muss (after all the setup overhead of teaching your ships how to build themselves). You simply declare a ship, and tell it to build itself, at which point the ship creates a hull, which it tells to build itself, and so on down the line.

I realize that this sounds insane, probably even to those procedural programmers out there. But it only sounds that way because I’ve changed the metaphor. It made a lot more sense when the President was telling someone to have his people call their people. And, with that metaphor, the object-oriented approach sounded insane, as we’ve covered with the President buying bleach at the corner store for a janitor to clean a floor somewhere.

Getting it Right

So, back to inversion of control (IOC). IOC only makes sense in the Lego metaphor. If you eventually want to build a pirate ship, you start to think about what you need for a pirate ship. Well, you need a crow’s nest and a hull and a plank, and — well, stop right there. We’re getting ahead of ourselves. Before we can build a ship, we clearly need some other stuff. So, let’s focus on the crow’s nest. Well, we need some pieces of wood and that’s about it, so maybe that’s a good place to start. We’ll think about the properties of a piece of wood and define it. It’s probably got three spatial dimensions and a weight and a type, so we can start there. Once we’ve defined and described our wood, we can figure out how to assemble it into a crow’s nest, and so on.

Object-oriented programming is about creating small components to be assembled into larger components. And, inversion of control follows naturally from there — much more naturally than command and control. And it should follow into your programming.

If you’re creating a presentation tier class that drives some form in your UI, one of the first things that’s going to occur to you is that you need to obtain some data from somewhere. In the procedural world, you would say, “Aha! I need to define some service class that goes out and gets that data so I’ll have my presentation tier class instantiate this service and…” But stop. You don’t want to do that. You want to think in the Lego metaphor. As soon as you need something that you don’t have, it’s time to stop working on that class and move to the class that yours needs (if it doesn’t already exist).

But, before you leave, you want to document for posterity that your presentation tier class is useless without a service to provide data. What better way to do that than to make it impossible to create an instance of your class without passing in the service you need? That puts the handwriting on the wall, in 8000 point font, that your presentation tier class needs a service or it won’t work. Now you can go create the service, or find it if it exists, or ask the guy who works on the service layer to write it.
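
To make that concrete, here is a minimal sketch in C#. The names (CustomerViewModel, ICustomerService) are placeholders for whatever your presentation class and service actually are:

// Hypothetical names for illustration; the point is the constructor signature.
public interface ICustomerService
{
    string[] GetCustomerNames();
}

public class CustomerViewModel
{
    private readonly ICustomerService _service;

    // There is no way to construct this class without handing it the service
    // it depends on, so the dependency is documented by the compiler itself.
    public CustomerViewModel(ICustomerService service)
    {
        _service = service;
    }

    public string[] GetCustomerNames()
    {
        return _service.GetCustomerNames();
    }
}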

But where does the class calling yours get the service? Who cares. I say that with a period instead of a question mark because it’s declarative. That isn’t your problem, and it isn’t your presentation tier class’s problem. And the reason for that is that your pirate ship isn’t tasked with building its own hull and crow’s nest. Somebody else builds those things and hands them over for ship construction when they’re completed.

Back to the Code

That was a long and drawn out journey, but I think it’s important to drive home the different metaphors and their implications for the sake of this discussion. Without that, procedural and OOP folks are talking at each other and not understanding one another. If you’re trying to sell IOC to someone who isn’t buying it, you’re much better served understanding their thinking. And if you’re one of those resisting buying it, it’s time to get out your wallet because the debate is over. IOC produces cleaner code that is easier to maintain, test, and extend. Procedural coding has its uses, but if you’re already using an OO language, you don’t have a candidate for those uses.

So, what are the actual coding implications of inversion of control? I’ll enumerate some here to finish up.

1. Classes have fewer reasons to change

One of the guiding principles of clean code, and a keystone of SOLID, is the "Single Responsibility Principle." Your classes should have only one reason to change because this makes them easier to reason about, and it means changes to the code base produce less of the violence and upheaval that triggers regressions. If you use the procedural style to create classes, your classes will always have at least two reasons to change: (1) their actual function changes; and (2) the way they create their sub-components changes.

2. Classes are easier to unit test

If you're looking to unit test a command and control pirate ship, think about everything that happens in the PirateShip's constructor. It news up a hull, crow's nest, etc., which all new up their internals, and so on, recursively down the structure of the application. You cannot unit test PirateShip at all. It encapsulates all of the functionality of your program. In fact, you can't unit test anything except the tree leaves of functionality. Pretty much all of your tests are system/integration tests. If you invert control, it's easy. You just pass in whatever you want to the class to poke it and prod it and see how it behaves.
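
Here is a rough sketch of what that looks like, using a hand-rolled fake and hypothetical PirateShip/IHull types (MSTest attributes assumed):

using Microsoft.VisualStudio.TestTools.UnitTesting;

public interface IHull
{
    bool IsWatertight { get; }
}

// A trivial fake the test controls completely.
public class FakeHull : IHull
{
    public bool IsWatertight { get; set; }
}

public class PirateShip
{
    private readonly IHull _hull;

    public PirateShip(IHull hull)
    {
        _hull = hull;
    }

    public bool IsSeaworthy()
    {
        return _hull.IsWatertight;
    }
}

[TestClass]
public class PirateShipTest
{
    [TestMethod]
    public void IsSeaworthy_Returns_False_For_Leaky_Hull()
    {
        // Poke and prod: hand the ship exactly the hull we want to test with.
        var ship = new PirateShip(new FakeHull { IsWatertight = false });

        Assert.IsFalse(ship.IsSeaworthy());
    }
}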

3. No global variables or giant method signatures

Imagine that your crow's nest wants to communicate with the ship's rudder for some reason. These classes reside at opposite ends of the program, and both are tree leaves. In command and control style, you have two options. The first is to have all of the nodes in between the root and those leaves pass the message along in constructors or mutators. As the amount of overhead resulting from this gets increasingly absurd, most procedural programmers turn to option 2: the global variable (or its gussied-up object-oriented counterpart loved by procedural programmers everywhere – the singleton). I'll save that for another post, as it certainly deserves its own treatment in depth, but let's just say, for argument's sake and for the time being, that this is undesirable. The rest of the application doesn't need to see the personal business of those two classes and how they communicate.

In the IOC model, this is a simple prospect. Because you’re building all of the sub-components and assembling them into increasingly large components, there is one place that has reference to everything. From the class performing all of that assembly, it’s trivial to link those two leaf nodes or, really, any classes. I can give one a reference to the other. I can create a third object to which both refer or give them both references to it. There are any number of options that don’t involve hideous method signatures or globals.
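
A sketch of what that assembly point might look like follows. All of the types here are hypothetical, and a shared SignalLine object stands in for whatever the two leaves use to communicate:

public class SignalLine { }

public class CrowsNest
{
    private readonly SignalLine _signalLine;
    public CrowsNest(SignalLine signalLine) { _signalLine = signalLine; }
}

public class Rudder
{
    private readonly SignalLine _signalLine;
    public Rudder(SignalLine signalLine) { _signalLine = signalLine; }
}

public class PirateShip
{
    private readonly CrowsNest _crowsNest;
    private readonly Rudder _rudder;

    public PirateShip(CrowsNest crowsNest, Rudder rudder)
    {
        _crowsNest = crowsNest;
        _rudder = rudder;
    }
}

public static class ShipAssembly
{
    public static PirateShip BuildShip()
    {
        // The one place that sees everything: the two leaf components get a
        // reference to the same SignalLine without anyone else knowing about it.
        var signalLine = new SignalLine();
        return new PirateShip(new CrowsNest(signalLine), new Rudder(signalLine));
    }
}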

4. Changing the way things are constructed is easy

Speaking of having your assembly all in one place, swapping out components becomes simple. If I want the crow's nest to use balsa wood instead of pine, I just go to the place where the crow's nest is instantiated and pass it something else. I don't have to weed through my code base looking for the class and then trace the instantiation path to where I need it. All of the instantiation logic happens centrally. This makes it much easier to introduce conditions for using different types of construction logic, without cluttering up classes that shouldn't care how their components are constructed. In fact, if you use Spring or some other IOC container, you can abstract this out of your program altogether and have it reside in some configuration file, if you're feeling particularly "enterprisey" (to borrow a term from an amusing site I like to visit).

5. Design by contract becomes easy as well

This is another thing to take at face value for the time being, but having your components interact via interface is much easier this way. Interface programming is a good idea in general. But, if you’re not inverting control, it’s kind of pointless. If all of your object creation is hard-coded throughout the application, interfacing is messy. If your PirateShip is creating a CrowsNest and you want a CrowsNest interface with command and control, you’d have to pass some enumeration (or use some global) into PirateShip to tell it what kind of CrowsNest to instantiate. This, along with some of the other examples, demonstrates the first point about code bloat and the SRP. As I’m introducing these new requirements (which generally happen sooner or later), our procedural classes get bigger, more complicated, and more difficult to maintain. Now they not only need to instantiate their components, but also make decisions about how to instantiate them based on additional variables that you need to put in them. And, I promise, there will be more.
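
For illustration (hypothetical ICrowsNest, PineCrowsNest, and BalsaCrowsNest types), the inverted version needs no enumeration and no decision-making inside the ship; choosing a different implementation is a change at the assembly point only:

public interface ICrowsNest
{
    string Material { get; }
}

public class PineCrowsNest : ICrowsNest
{
    public string Material { get { return "pine"; } }
}

public class BalsaCrowsNest : ICrowsNest
{
    public string Material { get { return "balsa"; } }
}

public class PirateShip
{
    private readonly ICrowsNest _crowsNest;

    // The ship neither knows nor cares which implementation it gets.
    public PirateShip(ICrowsNest crowsNest)
    {
        _crowsNest = crowsNest;
    }
}

// At the assembly point, swapping materials is a one-line change:
// new PirateShip(new BalsaCrowsNest()) instead of new PirateShip(new PineCrowsNest()).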

Fin

So, I hope some of you reading find this worthwhile. I’m not much of a proselytizer for or true adherent to any one way of doing things. Procedural programming and other styles of programming have their places (I literally just used ‘goto’ today in writing a batch script for executing some command line utilities). But, if you’re in the OO world, using the OO metaphors and surrounded by OO people, it is clearly in your best interest to adapt instead of fight.


DXCore Plugin Part 2

In the previous post on this subject, I created a basic DXCore plugin that, admittedly, had some warts. I started using it and discovered that there were subtle issues. If you’ll recall, the purpose of this was to create a plugin that would convert some simple stylistic elements between my preferences and those of a group that I work with. I noticed two things that caused me to revisit and correct my implementation: (1) if I did more than one operation in succession, things seemed to get confused and garbled in terms of variable naming; and (2) I was not getting renames in all scopes.

The main problem that I was having was the result of calling LanguageElement.Document.SetText(). Now, I don’t really understand the pertinent details of this because the API is a black box to me, but I believe the gist of it, based on experience and poking around blog posts, is that calling this explicitly causes the document to get out of sync if you chain renames together.

For instance, take the code:

void Foo()
{
    string myString = "asdf";
    int myInt = myString.Length;
}

The way that DXCore's Document API appears to process this is with a concept called "NameRange." That is, there are various ways you can use LanguageElements and other things in that inheritance tree to get a particular token in the source file: the string type, the Foo method signature, the "myString" variable, etc. When you actually want to change something name-wise, you need to find all references to your token and do an operation that essentially takes the parameters (SourceRange currentRange, string newText). In this fashion, you might call YourClass.Document.SetText(myStringDeclaration.NameRange, "myNewString");

Assuming that you've rounded up the proper LanguageElement that corresponds to the declaration of the variable "myString", this tells DXCore that you want to change the text from "myString" to "myNewString". Conceptually, what you're changing is represented by some int parameters that I believe correspond to row and column in the file, like a 2D array. So, if you make a series of sequential changes to "myString" (first lengthen it, then shorten it, then lengthen it again), strange stuff starts to happen. I think this is the result of the actual allocated space for this token getting out of whack with what you're setting it to. It sometimes starts to gobble up characters after the declaration, like Windows when you hit the "insert" key without realizing it. I was winding up with stuff like "string myStringingingingd;"

So, what I found as a workable fix to this problem was to use a FileChangeCollection object to aggregate the changes I wanted to make as I went, rather than applying each one immediately. FileChangeCollection takes a series of FileChange objects, which want to know the path of the class, the range of the proposed change target, and the new value. I aggregated all of my changes in this collection and then flushed them at the end with CodeRush.File.ChangeFile(_targetClass.GetSourceFile(), _collection); After doing that, I cleared the collection so that my class could reuse it.
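
In code, that approach looks roughly like the following. This is a sketch based on the calls described above (the AddFileChange helper is the one referenced in the recursive function further down), so treat the exact DXCore signatures, the FileChange constructor in particular, as approximations:

private readonly FileChangeCollection _collection = new FileChangeCollection();

private void AddFileChange(string path, SourceRange range, string newText)
{
    // Queue the rename rather than calling Document.SetText() right away.
    // FileChange wants the file path, the range being replaced, and the new
    // text (exact constructor signature may differ).
    _collection.Add(new FileChange(path, range, newText));
}

private void FlushChanges()
{
    // Apply everything in one shot, then reset for the next conversion.
    CodeRush.File.ChangeFile(_targetClass.GetSourceFile(), _collection);
    _collection.Clear();
}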

This cleared up the issue of inconsistent weirdness in naming. Now I can convert back and forth as many times as I please, or run the same conversion over and over again and then run the other one, and each application of the standards behaves as a clean, atomic operation. If I run "convert to theirs, convert to theirs, convert to theirs, convert to mine," the code ends up retaining my standards perfectly, regardless of whose it started with. This is due both to getting DXCore right (at least as far as my testing so far proves) and to my implementation being consistent and unit-tested. However, confidence that I have the DXCore implementation right now allows me in the future to know that, if things are wacky, it's because I'm not implementing string manipulation and business rules correctly.

The second issue was that some variables weren't getting renamed properly, depending on the scope. Things that were in methods were fine, but anything in nested Linq queries or loops or what-have-you was failing. I had previously made a change that I thought took care of this, but it turns out that I just pushed the problem down a level or two.

This I ultimately solved with some recursion. My conversion functions take an (element, targetScope) tuple, and they add FileChanges for all elements in the target scope. They then call the same function for all scoped children of targetScope with (element, targetScopeChild). If you think of the source as a tree with the root being your scope, intermediate nodes being scopes with children, and leaves being things containing no nested scoping language elements, this walks your source code as if it were a tree, finding and renaming everything.

Here is an example of my recursive function for adding changes to a class field “_collection”, which corresponds to a FileChangeCollection (no worries, I’m renaming that field to something more descriptive right after I finish this post 😀 )

/// <summary>Abstraction for converting all elements in a target scope</summary>
/// <param name="local">Element whose name we want to convert</param>
/// <param name="target">Target scope in which to do the conversion</param>
private void RecursiveAggregateLocalChanges(LanguageElement local, LanguageElement target)
{
    VerifyOrThrow();

    foreach (IElement myElement in local.FindAllReferences(target))
    {
        var myFieldInstance = CodeRush.Source.GetLanguageElement(myElement);
        if (myFieldInstance != null)
        {
            AddFileChange(_targetClass.GetFullPath(),
                myFieldInstance.NameRange, _converter.ConvertLocal(local.Name));
        }
    }

    foreach (LanguageElement myElement in target.Nodes)
    {
        RecursiveAggregateLocalChanges(local, myElement);
    }
}

You want to preserve "local" as a recursion invariant, since this method is called by the method that handles cycling through all local variables in the file and invoking the recursion to change their names. That is, the root of the recursion is given a single local variable to change as well as the target scope of the method it resides in. From here, you change the local everywhere it appears within the method, then you get all of the method's child scopes and do the same on those. You keep recursing until you run out of nested scopes, falling out of the recursion at this point having done your adds.

It does not matter whether you recurse first or last because CodeRush keeps track of the original element regardless and even if it didn’t, you’re only aggregating eventual changes here – not making them as you go – so you don’t wind up losing the root.

Hopefully this continues to be somewhat helpful. I know the DXCore API documentation is in the works, so this is truly the diary of someone who is just tinkering and reverse engineering (with a bit of help from piecing together things on other blogs) to get something done. I’m hardly an expert, but I’m more of one than I was when I started this project, and I find that the most helpful documentation is often that made by someone undertaking a process for the first time because, unlike an expert, all the weird little caveats and gotchas are on display since they don’t know the work-arounds by heart.

I will also update the source code here in a moment, so it’ll be fresh with my latest changes.


Static Analysis — Spell Check for Code

A lot of people have caught on to certain programming trends: some agility in the process generally makes things better, unit testing a code base tends to make it more reliable, etc. One thing that, in my experience, seems to lag behind in popularity is the use of static checking tools. If these are used at all, it's usually for some reason such as enforcing capitalization schemes of variables or some other such thing that guarantees code that is uniform in appearance.

I think this non-use or under-use of such tools is a shame. I recently gave a presentation on using some tools in C# and Visual Studio 2010 for static analysis, and thought I'd share my experience with some of the tools and the benefits I perceive here. In this development environment, there are six tools that I use, all to varying degrees and for different purposes. They are: MS Analysis (FxCop), StyleCop, CodeRush, Code Contracts, NDepend, and Nitriq.

Before I get into that, I’ll offer a little background on the idea of static analysis. The age-old and time-tested way to write code is that you write some code, you compile it, and then you run it. If it does what you expected when you run it, then you declare yourself victorious and move on. If it doesn’t, then you write some more code and repeat.

This is all fine and good until the program starts getting complicated — interacting with users, performing file I/O, making network requests, etc. At that point, you get a lot of scenarios. In fact, you get more scenarios than you could have anticipated in one sitting. You might run it and everything looks fine, but then you hand it to a user who runs it, unplugs the computer, and wonders why his data wasn’t saved.

At some point, you need to be able to reason about how components of your code would behave in various scenarios, even if you might not easily be able to recreate those scenarios. Unit testing is helpful with this, but unit testing is just an automated way of saying, "run the code." Static analysis automates the process of reasoning about the code without running it. It's like you reading through the code yourself, but far more efficient and much less likely to make mistakes.

Doing this static analysis is adding an extra step to your development process. Make no mistake about that. It's like unit testing in that the largest objection is going to be the 'extra' time that it takes. But it's also like unit testing in that it saves you time downstream because it makes defects less likely to come back to bite you later. These two tasks are also complementary and not stand-ins for one another. Unit testing clarifies and solidifies requirements and forces you to reason about your code. Static analysis lets you know if that clarification and reasoning has caused you to do something that isn't good.

As I said in the title, it’s like a spell checker for your code. It prevents you from making silly and embarrassing mistakes (and often costly ones). To continue the metaphor, unit testing is more like getting someone bright to read your document. He’ll catch some mistakes and give you important feedback for how to improve the document, but he isn’t a spell checker.

So, that said, I’ll describe briefly each one and why I use and endorse it.

MS Analysis

MS Analysis encapsulates FxCop for the weightier versions of Visual Studio 2010 (Premium and up, I think). It runs a series of checks, such as whether or not parameters are validated by methods, whether or not you're throwing Exception instead of SomeSpecificException, and whether your classes have excessive coupling. There are probably a few hundred checks in all. When you do a build with this enabled, it populates the error list with violations in the form of warnings.

On the plus side, this integrates seamlessly with Visual Studio since it's a Microsoft product, and it catches a lot of stuff. On the down side, it can be noisy, and customizing it isn't particularly straightforward. You can turn rules on and off, but if you want to tweak existing ones or create your own, things get a little more complicated. It also isn't especially granular. You configure it per project and can't get feedback any more fine-grained than that (i.e. per namespace, class, or method).

My general use of it is to run it periodically to see if my code is violating any of the rules that I care about. I usually turn off a lot of rules, and I have a few different rulesets that I plug in and out so that I can do more directed searches.

StyleCop

StyleCop is designed to be run between writing and building. Instead of using the VM/Framework to reflect on your code, it just parses the source code file looking for stylistic concerns (are all of your fields camel cased and documented and are you still using Hungarian notation and, if so, stop) and very basic mistakes (like, do you have an empty method). It’s lightning fast, and it runs on a per-class basis, which is cool.

On the downside, it can be a little annoying and invasive, but the designers are obviously aware of this. I recall reading some kind of caveat stating that the nature of these types of rules tends to be arbitrary and get opinionated developers into shouting matches.

I find it useful for letting me know if I've forgotten to comment things, if I've left fields as anything other than private, and if I have extra parentheses somewhere. I run StyleCop occasionally, but not as often as the others. Swapping between the rule sets is a little annoying.

CodeRush

CodeRush is awesome for a lot of things, and its static analysis is really an ancillary benefit. It maintains an "issues list" for each file and highlights these issues in real time, right in the IDE. A few of them are a little bizarre (such as suggesting that you always use the "var" keyword wherever possible), but most of them are actually really helpful and not suggested by the MS tools or anything else I use. It does occasionally false-flag dead code and get a few things wrong, but it's fairly easy to configure it to ignore issues on a per-file, per-namespace, or per-solution basis.

The only real downside here is that CodeRush has a per-seat licensing cost, which, along with its other overhead, makes it a bit of overkill if you're only interested in static analysis. I fully endorse getting CodeRush in general, however, for all of its features.

Code Contracts

Like CodeRush, this tool is really intended for something else, and it provides static analysis as a side effect. Code Contracts is an academically developed tool that facilitates design by contract. Pre- and post-conditions as well as class invariants can be enforced at build time. Probably because of the nature of doing this, it also just so happens to offer a feature wherein you can have warning squigglies pop up anywhere you might be dereferencing null, violating array bounds, or making invalid arithmetic assumptions.

To me, this is awesome, and I don’t know of other tools that do this. The only downside is that, on larger projects, this takes a long time to run. However, getting an automatic check for null dereferences is worth the wait!

I use it explicitly for the three things I mentioned, though, if I get a chance and enough time, I’d like to explore its design by contract properties as well.

NDepend

NDepend is really kind of an architectural tool. It lets you make assessments about different dependencies in your code, and it provides you with all kinds of neat graphs that report on them. But my favorite feature of NDepend is the static analysis in the form of Code Querying. It exposes SQL-like semantics that let you write queries against your code base, such as “SELECT ALL Methods WHERE CyclomaticComplexity > 25” (paraphrase). You can tweak these things, write your own, or go with the ones out of the box. They’re all commented, too, in ways that are helpful for understanding and modifying.

There is really no downside to NDepend aside from the fact that it costs some money. But if you have the money to spare, I highly recommend it. I use this all the time for querying my code bases in whatever manner strikes my fancy.

Nitriq

I think that Nitriq and NDepend are probably competitors, but I won’t do any kind of comparison evaluation because I only have the free version of Nitriq. Nitriq has the same kind of querying paradigm as NDepend, except that it uses LINQ semantics instead of SQL. That’s probably a little more C# developer friendly, as I suppose not everyone that writes C# code knows SQL (although it strikes me that all programmers ought to have at least passing familiarity with SQL).

In the free version you can only assess one assembly at a time, though I don’t consider that a weakness. I use Nitriq a lot less than NDepend, but when I do fire it up, the interface is a little less cluttered and it’s perhaps a bit more intuitive. Though, for all I know, the paid version may get complicated.

Conclusion

So, that’s my pitch for static analysis. The tools are out there, and I suspect that they’re only going to become more and more common. If this is new to you, check these tools out and try them! If you’re familiar with static analysis, hopefully there’s something here that’s new and worth investigating.


Addicted to Unit Testing

Something interesting occurred to me the other day when I posted sample code for a DXCore plugin that I created. In the code that I uploaded, I added a unit test project with a few unit tests as a matter of course. Apparently, the process of unit testing has become so ingrained in me that I didn’t think anything of it until later. This caused me to reflect a bit on my relationship, as a developer, to unit testing.

I’ve worked in settings where unit tests have been everything from mandated to tolerated to scoffed at and discouraged. And I’ve found that I do some form of unit testing in all of these environments. In environments that require unit testing, I simply include them the way everyone else does, conforming to standards. In environments that discourage unit testing, I implement them and keep them to myself. In that case, the tests aren’t always conventional or pretty, but I always find some way to automate verification. To me, this is in my blood as a programmer. I want to automate everything and I don’t see why verifying my code’s functionality should be any different.

But I realize that it’s now something beyond just a desire to automate, tinker, and take pride in my work. The title of this post is tongue-in-cheek, but also appropriate. When I write code and don’t do some verification of it, I start to feel edgy and off of my game. It’s hard for me to understand how people can function without unit testing their code. How do they know it works? How do they know they’ve handled edge cases and bad input to methods? How do they feel comfortable building new classes that depend on the correct functioning of the ones that came first? And, most importantly, how do they know they’re not running in place when adding functionality — breaking an existing requirement for each new one they satisfy?

I exhibit some of the classic signs of an addict. I become paranoid and discombobulated without the thing upon which I depend. I’ll go to various lengths to test my code, even if it’s not factored into the implementation time (work longer hours, create external structure, find free tools, etc.). I reach for it without really thinking about it, as evidenced by the unit tests in my uploaded code.

But I suppose the metaphor ends there. Because unlike the vices about which recovering addicts might speak this way — drugs, alcohol, gambling, etc. — I believe this ‘addiction’ makes me better as a software engineer. I think most people would agree that tested code is likely to be better code, and strong unit test proponents would probably argue that the habit makes you write better code whether or not you unit test a given class at a given time. For example, when I create a dependency-injected class, the first code I tend to write, automatically and out of habit, is an “if” statement in the constructor that checks the injected references for null and throws an exception if they are. I write this because the first unit test that I write is one to check how the class behaves when injected with null.
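
As a minimal sketch of that habit (hypothetical names, MSTest attributes assumed):

using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;

public interface IOrderService { }

public class OrderPresenter
{
    private readonly IOrderService _service;

    public OrderPresenter(IOrderService service)
    {
        // The guard clause I write out of habit...
        if (service == null)
        {
            throw new ArgumentNullException("service");
        }
        _service = service;
    }
}

[TestClass]
public class OrderPresenterTest
{
    // ...because this is the first test I write.
    [TestMethod]
    [ExpectedException(typeof(ArgumentNullException))]
    public void Constructor_Throws_On_Null_Service()
    {
        new OrderPresenter(null);
    }
}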

And, to me, that’s really the most core benefit of unit testing. Sure, it makes refactoring easier, it enumerates your requirements better than a requirements analysis document could, it inspires confidence in the class behavior, and all of the other classic properties of unit testing as a process stalwart. But I think that, at the core, it changes the way you think about classes and how you write code. You write code knowing that it should behave in a certain way with certain inputs and that it should provide certain outputs. You think in terms of what properties your classes have and what they should initialize to in different instantiation scenarios. You think of your classes as units, and isn’t that the ultimate goal — neat, decoupled code?

One common theme that I see in code and I’ve come to think of as highly indicative of a non-unit testing mentality is a collection of classes that distribute their functionality in ad-hoc fashion. That is, some new requirement comes in and somebody writes a class to fulfill it — call it “Foo.” Then a few more requirement riders trickle in as follow ups, and since “Foo” is the thing that satisfies the original, it should also satisfy the new ones. It’s perfectly fine to call it “Foo” because it represents a series of user requests and not a conceptual object. Some time later, more requirements come in, and suddenly the developer needs two different Foos to handle two different scenarios. Since Foo is getting kind of large, and large classes are bad, the solution is to create a “FooManager” that knows all about the internals of “Foo” and to spread functionality across both. If FooManager needs internal Foo field “_bar”, “_bar” is made into a property “Bar” (C#) or an accessor “GetBar()” (Java/C++), and the logic proceeds. Foo does some things to its former private member “Bar” and then “FooManager” also does some things to Foo’s Bar, and before you know it, you have a Gordian knot of functional and temporal coupling.

I don’t think this kind of code would ever exist in a world populated only by unit testing addicts. Unit testing addiction forces one to consider upfront a class’s reason for existing and carefully deliberate its public interface. The unit testing addict would not find himself in this precarious situation. His addiction would save him from this “rock bottom” of software development.


Creating a DxCore plugin

Coding standards are one of those things that generally involve some degree of compromise, as there is often a goal during collaboration to give the code a uniform look and feel. I don’t necessarily agree with this goal in all cases, but I do understand it. Having code formatted in wildly different styles in the same solution or assembly steepens the learning curve for newcomers to the project and will have some negative effects on maintenance activities.

So, what to do when the group’s agreed-upon coding standards don’t match your own preferences? Most would simply adapt. I’m not really like most, though. I’m an inveterate tinkerer and heavy customizer. In all of my IDEs, I use a black background and arrange the colors just so. This mentality extends to everything from my windows folder views to my coding idioms. Everything that I do, I make a point to do because of conscious deliberation and not simply because I’m used to doing it. There is a method to my madness. So my solution to being asked to adopt a common set of standards was to write an IDE plugin that would convert my idioms to the group’s and vice-versa.

Because I’ve really been enjoying the DxCore set of tools lately, I decided to go this route. Here are the steps that I’ve taken in order to get a plugin up and running. At the time of writing, this plugin is fairly simple. I like to use camel casing in my code, and I distinguish between class fields, method parameters, and local variables by using the following idioms, respectively: _classField, methodParameter, myLocalVariable (properties and methods are ClassProperty and ClassMethod()). My rationale is that this allows me at a glance to tell what each thing is that I’m referring to. The standard of one of the groups in which I’m developing is _ClassField and localVariable (sans my). I don’t prefer this because I can’t distinguish between method parameters and my local variables at a glance, meaning I don’t know whether I own the reference or my caller does. So, the plugin converts all class variables and method variables back and forth between these idioms, and that’s it.
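
For illustration, here is a hypothetical class written both ways, just to show what the plugin converts between (the class and member names are made up):

// Mine: _classField, methodParameter, myLocalVariable
public class AccountMine
{
    private int _balance;

    public void Deposit(int depositAmount)
    {
        int myNewBalance = _balance + depositAmount;
        _balance = myNewBalance;
    }
}

// Theirs: _ClassField and localVariable, so parameters and locals read alike.
public class AccountTheirs
{
    private int _Balance;

    public void Deposit(int depositAmount)
    {
        int newBalance = _Balance + depositAmount;
        _Balance = newBalance;
    }
}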

For usability purposes, I created a CodeRush action to which I could bind a shortcut key sequence, and I adorned it with some niceties such as putting it in a menu and showing a "Big Hint" when the actions are executed. Since the documentation for the DxCore API is somewhat scattered and not always complete, a fair bit of reverse engineering went into this, so I make no claims that what I'm doing is the preferred way to do things. But it does work. I may refine and revisit this as I get more familiar with it, but here is what I've done so far (using CodeRush/DXCore 10.2.5 and Visual Studio 2010):

Getting started

Getting started with a new DXCore plugin is pretty straightforward. Open a new Visual Studio project, go to the DevExpress menu, and select “New Plugin.”

[Screenshot: Create a new DXCore plugin]

From there, give your plugin a descriptive name that suits your purposes and take care of all of the particulars of naming the classes the way you want. Then you’re ready to go. This link describes some steps in detail, including selecting manual rather than automatic loading of the plugin that you’re creating (you probably don’t want Visual Studio scooping this up until it’s tested and stable): Create a plugin. This blog was actually what I referred to for the how-to of creating this.

Lights, Camera, Action

This will create an empty plugin project ready to use. In order to have it do anything useful, you must open the Toolbox from the designer and drag and drop an Action (from the DXCore portion) onto your designer.

[Screenshot: Create a new action]

Once the action is on your designer page, double click on it. You will be booted into the code-behind, a la classic Visual Studio development. Here, an event handler will be created for the action's execution. This is going to be the plugin's entry point. We'll bind the action to a shortcut key sequence in the IDE, and this is the method to which you'll add code to manipulate the IDE.

[Screenshot: Newly created event handler]

For now, if you want to verify that something is happening, I would suggest dropping a “Big Hint” onto your designer, naming it something (I got creative and left it as “bigFeedback1”), and adding the following to your execute event handler:

bigFeedback1.Text = "Hello World!";
bigFeedback1.Show();
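
In context, the generated handler ends up looking something like the following. Take this as an approximation: the handler name and the exact signature are whatever the designer generated for your particular action.

private void actionConvertStandards_Execute(ExecuteEventArgs ea)
{
    // Prove the plumbing works by flashing a "Big Hint" in the IDE.
    bigFeedback1.Text = "Hello World!";
    bigFeedback1.Show();
}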

Now that you have an actual action and something that it does, you’re ready to try it out. Follow the instructions on Dror Helper’s blog for running the plugin. Basically, you’re going to do a normal run which will, in turn, launch a new instance of Visual Studio. Assuming that you’ve bound a shortcut sequence to your new action, once you load your plugin and execute that sequence, you will see that big CodeRush feedback:

[Screenshot: Big Hint]

(Clearly, my code does not say “Hello World” at the time of running, but you get the idea.)

One thing to look out for is an exception message when you run. I don’t know if it’s something about my setup specifically, but I see the following:

[Screenshot: Loader Lock message]

For me, ignoring this and hitting continue makes everything go fine. YMMV.

The code to make it go

I wrote a good bit of code and some unit tests to make this work. The full, zipped solution (as of this writing) can be downloaded here for anyone interested. I certainly won’t go into everything, especially the logic of modifying the variable names. But I will go over a few concepts with which I struggled a bit and had to learn by trial and error.

The DXCore engine has a very deep inheritance hierarchy, and it’s important to find the right level of object in the hierarchy to get the operations that you want. This API is not yet thoroughly documented, so it can be a bit confusing. The first thing that you’re going to be interested in for my plugin is the ActiveClass. This corresponds to the class that’s currently active in the IDE. ActiveClass is obtained by querying static class CodeRush — specifically, CodeRush.Source.ActiveClass.

ActiveClass exposes IEnumerables of TypeDeclaration called "AllFields", "AllMethods", "AllProperties", etc. that correspond to language elements of interest in the class. However, you don't want TypeDeclarations — you want LanguageElements. So you'll want to cast these guys as LanguageElement. Doing so exposes a "NameRange" property, which is what you pass to ActiveClass.Document.SetText(nameRange, newName) to change the name of the variable. You're also going to be interested in LanguageElement's "FindAllReferences()" method, which returns an IEnumerable of IElements. You'll need these in order to pass to CodeRush.Source.GetLanguageElement(IElement), which gets you LanguageElements for the other occurrences of your element besides its declaration.

And, finally, if you want to check certain properties of a language element (is it private, protected, etc., what its parent element is, etc.), you will need to cast it to AccessSpecifiedElement.
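
Putting those pieces together, the body of a renaming pass (inside the action's execute handler) looks roughly like the following. This is a sketch reconstructed from the description above, so treat the member names, the exact signatures, and the hard-coded "_newName" as approximations rather than gospel:

var activeClass = CodeRush.Source.ActiveClass;

foreach (TypeDeclaration field in activeClass.AllFields)
{
    // Cast down to LanguageElement to get at NameRange and FindAllReferences().
    var declaration = field as LanguageElement;
    if (declaration == null)
    {
        continue;
    }

    // Rename the declaration itself...
    activeClass.Document.SetText(declaration.NameRange, "_newName");

    // ...and then every other occurrence of it within the class.
    foreach (IElement reference in declaration.FindAllReferences(activeClass))
    {
        var usage = CodeRush.Source.GetLanguageElement(reference);
        if (usage != null)
        {
            activeClass.Document.SetText(usage.NameRange, "_newName");
        }
    }
}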

This is not incredibly intuitive, so I'm hoping that this helps people, at least by pointing you in the right direction as to which types and classes to look at in the metadata or play with.

Adding your action to menus

This part is relatively straightforward once you know what to do, but it took me a while to figure that out. I was looking to create something like this where I could right-click inside the class and apply my conversions:

[Screenshot: Context Menu]

So, as it turns out, if you right-click on your action in the designer and display the properties, you can set this all through the properties. I’ve circled the ones that I needed to set in order to achieve the desired effect in the previous screenshot. If you want to create another layer of sub-menuing, I have no idea how this would be achieved. But, I don’t want to do that right now, so that’s that.

[Screenshot: Adding to the context menu]

If you want to add it to one of the existing menus in Visual Studio like the edit menu or the DevExpress menu, you can do that too.

Switching out of Development Mode

When you’re finished with developing and ready for your plugin to be loaded automatically when Visual Studio and DXCore startup, navigate to the project’s AssemblyInfo.cs and edit the properties of the assembly DXCoreAssembly. Change “PluginLoadType.Demand” to “PluginLoadType.Startup”, and now your plugin will be available for loading at startup. You will still need to configure it in the DevExpress plugin manager.

An interesting thing to note is that DXCore’s plugin creator automatically sets the build output to go in the DevExpress plugins directory (DevExpress\IDE Tools\Community\PlugIns). So if you want to redistribute this guy, you can just grab the DLL from there and email it to your friends, or whatever.

Follow Up

This is very much a work in progress, and I expect to polish this tool more as I use it. I may revisit this post and edit a bit if it needs it or create follow-up posts if there is interest. One thing that comes to mind off the top is that I plan to disable the conversions menu item if the user does not click inside a class. Other nice features might be to play with and set some default settings on the context picker for users to have some guidance when creating keyboard shortcuts.

If anyone has questions, comments, or suggestions for improvement, all are welcome.

Edit: I just now uploaded a slightly modified version of the plugin about a day after this post. I discovered that local variables were not being modified properly in methods if they were nested in any kind of scoping delimiter, like an "if" statement or a try/catch block. The reason for this was that I was previously checking to see if the AccessSpecifiedElement had a parent of type Method, not realizing that these other scopers were considered logical parents. I modified that code to instead call GetMethod() and check to see if it was null. There's doubtless some better heuristic for distinguishing locals, but this one seems to work.

Edit2: I have created a new post detailing additional progress on this matter.