DaedTech

Stories about Software

By

Improving Readability using Moq

While I’ve been switching a lot of my new projects over to using JustMock, as explained here, I still have plenty of existing projects with Moq, and I still like Moq as a tool. One of the things that I have struggled with in Moq that JustMock provides a nice solution to is the cognitive dissonance that arises when thinking about how to reason about your test doubles.

In some situations, such as when you’re injecting the test doubles into your class under test, you want them to appear as if they were just another run of the mill instance of whatever interface/base you’re passing. But at other times you want to configure the mock, and then you want to think of them as actual mock objects. One solution to that is to use Mock.Get() as I described here. Doing that, you can store a reference to the object as if it were any other object but access it using Mock.Get() when you want to do things to its setup:

[gist id=”4998372″]

That’s all well and good, but, perfectionist that I am, I got really tired of all of that Mock.Get() ceremony. It makes lines pointlessly longer, and I think really detracts from the readability of the test classes. So I borrowed an idea from JustMock with its “Helpers” namespace and created a series of extension methods. Some examples are shown here.

[gist id=”4998277″]

These extension methods allow me to alter my test to look like this:

[gist id=”4998488″]

Now there’s no need to switch context, so to speak, between mock object and simple dependency. You always have a simple dependency that just so happens to have some extension methods that you can use for configuration. No need to keep things around as Mock<T> and call the Object property, and no need for all of the Mock.Get().

Of course, there are caveats. This might already exist somewhere and I’m out of the loop. There might be issues with this that I haven’t yet hit, and there is the debugging indirection. And finally, you could theoretically have namespace collisions, though if you’re making methods called “Setup” and “Verify” on your classes that take expression trees and multiple generic parameters, I’d say you have yourself a corner case there, buddy. But all that aside, hopefully you find this useful–or at least a nudge in a direction toward making your own tests a bit more readable or easy to work with.

By

API Lost Opportunity: A Case Study

I’m aware that recently there’s been some brouhaha over criticism/trashing of open-source code and the Software Craftsmanship movement, and I’d like to continue my longstanding tradition of learning what I can from things without involving myself in any drama. Toward that end, I’d like to critique without criticizing, or, to put it another way, I’d like to couch some issues I have scan0001with an API as things that could be better rather than going on some kind of rant about what I don’t like. I’m hoping that I can turn the mild frustration I’m having as I use an API into being a better API writer myself. Perhaps you can pick up a few tidbits from this as well.

Recently I’ve been writing some code that consumes the Microsoft CRM API using its SDK. To start off with the good, I’d say that API is fairly powerful and flexible–it gives you access to the guts of what’s in the CRM system without making you write SQL or do any other specific persistence modeling. Another impressive facet is that it can handle the addition of arbitrary schema to the database without any recompiling of client code. This is done, in essence, by using generalized table-like objects in memory to model actual tables. Properties are handled as a collection of key-value string pairs, effectively implementing a dynamic-typing-like scheme. A final piece of niceness is that the API revolves around a service interface which makes it a lot nicer than many things for mocking when you test.

(If you so choose, you can use a code generator to hit your CRM install and spit out strongly-typed objects, but this is an extra feature about which it’s not trivial to find examples or information. It also spits out a file that’s a whopping 100,000+ lines, so caveat emptor.)

But that flexibility comes with a price, and that price is a rather cumbersome API. Here are a couple of places where I believe the API fails to some degree or another and where I think improvement would be possible, even as the flexible scheme and other good points are preserved.

Upside-Down Polymorphism

The API that I’m working with resides around a series of calls with the following basic paradigm:

VeryAbstractBaseRequest request = new SomeActualConcreteRequest();
VeryAbstractBaseResponse response = service.Execute(request);

There is a service that accepts very generic requests as arguments and returns very generic responses as return values. The first component of this is nice (although it makes argument capture during mocking suck, but that’s hardly a serious gripe). You can execute any number of queries and presumably even create your own, provided they supply what the base wants and provided there isn’t some giant switch statement on the other side downcasting these queries (and I think there might be, based on the design).

The second component of the paradigm is, frankly, just awful. The “VeryAbstractBaseResponse” provides absolutely no useful functionality, so you have to downcast it to something actually useful in order to evaluate the response. This is not big-boy polymorphism, and it’s not even adequate basic design. In order to get polymorphism right, the base type would need to be loaded up with methods so that you never need to downcast and can still get all of the child functionality out of it. In order not to foist a bad design on clients, it has to dispense with needing to base your response type on the request type. Here’s what I mean:

VeryAbstractBaseRequest request = new GetSomeCookiesRequest();
VeryAbstractBaseResponse response = service.Execute(request);
var meaningfulResponse = (GetSomeCookiesResponse)response; //Succeeds only because I have inappropriate knowledge of an implied coupling
var secondmeaningfulResponse = (GetSomeBrowniesResponse)response; //Bombs out because apparently brownies and cookies aren't fungible
var thirdMeaningfulResponse = (GetSomeBrownieBytesResponse)response; //Succeeds oddly because apparently BrownieBytes are children of Coookies (or a conversion operator exists)

Think about what this means. In order to get a meaningful response, I need to keep track of the type of request that I created. Then I have to have a knowledge of what responses match what requests on their side, which is perhaps available in some horrible document somewhere or perhaps is only knowable through trial/error/guess. This system is a missed opportunity for polymorphism even as it pretends otherwise. Polymorphism means that I can ask for something and not care about exactly what kind of something I get back. This scheme perverts that–I ask for something, and not only must I care what kind of something it is, but I also have to guess wildly or fail with runtime exceptions.

The lesson?

Do consumers of your API a favor. Be as general as possible when accepting arguments and as specific as possible when returning them. Think of it this way to help remember. If you have a method that takes type “object” and can actually do something meaningful with it, that method is going to be incredibly useful to anyone and everyone. If you have a method that returns type object, it’s going to be pretty much useless to anyone who doesn’t know how the method works internally. Why? Because they’re going to have to have an inappropriate knowledge of your workings in order to know what this thing is they’re getting back, cast it appropriately, and use it.

Hierarchical Data Transfer Objects

With this interface, the return value polymorphism issues might be livable if not for the fact that these request/response objects are also reminiscent of the worst in OOP, reminiscent of late ’90s, nesting-happy Java. What does a response consist of? Well, a response has a collection of entities and a collection of results, so it’s not immediately clear which path to travel, but if you guess right and query for entities, you’re just getting started. You see, entities isn’t an enumeration–it’s something called “EntityCollection” which brings just about nothing to the table except to provide the name of the entities. Ha ha–you probably wanted the actual entities, you fool. Well, you have to call response.EntityCollection.Entities to get that. And that’s an enumeration, but buried under some custom type called DataCollection whose purpose isn’t immediately clear, but it does implemented IEnumerable so now you can get at your entity with Linq. So, this gets you an entity:

RetrieveMultipleResponse response = new RetrieveMultipleResponse();
Entity blah = response.EntityCollection.Entities.First();

But you’re quite naieve if you thought that would get you anything useful. If you want to know about a property on the entity, you need to look in its Attributes collection which, in a bit of deja vu, is an AttributeCollection. Luckily, this one doesn’t make you pick through it because it inherits from this DataCollection thing, meaning that you can actually get a value by doing this:

RetrieveMultipleResponse response = new RetrieveMultipleResponse();
object blah = response.EntityCollection.Entities.First().Attributes["nameOfMyAttribute"];

D’oh! It’s an object. So after getting your value from the end of that train wreck, you still have to cast it.

The lesson?

Don’t force Law of Demeter violations on your clients. Provide us with a way of getting what we want without picking our way through some throwback, soul-crushing object hierarchy. Would you mail someone a package containing a series of Russian nesting dolls that, when you got to the middle, had only a key to a locker somewhere that the person would have to call and ask you about? The answer is, “only if you hated that person.” So please, don’t do this to consumers of your API. We’re human beings!

Unit Tests and Writing APIs

Not to bang the drum incessantly, but I can’t help but speculate that the authors of this API do not write tests, much less practice TDD. A lot of the pain that clients will endure is brought out immediately when you start trying to test against this API. The tests are quickly littered with casts, as operators, and setup ceremony for building objects with a nesting structure four or five references deep. If I’m writing code, that sort of pain in tests is a signal to me that my design is starting to suck and that I should think of my clients.

I’ll close by saying that it’s all well and good to remember the lessons of this post and probably dozens of others besides, but there’s no better way to evaluate your API than to become a client, which is best done with unit tests. On a micro-level, you’re dog-fooding when you do that. It then becomes easy to figure out if you’re inflicting pain on others, since you’ll be the first to feel it.

By

Switch Statements are Like Ants

Switch statements are often (and rightfully, in my opinion) considered to be a code smell. A code smell, if you’ll recall, is a superficial characteristic of code that is often indicative of deeper problems. It’s similar in concept to the term “red flag” for interpersonal relationships. A code smell is like someone you’ve just met asking you to help them move and then getting really angry when you don’t agree to do it. This behavior is not necessarily indicative of deep-seated psychological problems, but it frequently is.

Consequently, the notion that switch statements are a code smell indicates that if you see switch statements in code, there’s a pretty good chance that design tradeoffs with decidedly negative consequences have been made. The reason I say this is that switch statements are often used to simulate polymorphism for those not comfortable with it:

public void Attack(Animal animal)
{
    switch (animal.Type)
    {
        case AnimalType.Cat: 
            Console.WriteLine("Scratch"); 
            break;
        case AnimalType.Dog: 
            Console.WriteLine("Bite"); 
            break;
        case AnimalType.Wildebeest: 
            Console.WriteLine("Headbutt"); 
            break;
        case AnimalType.Landshark: 
            Console.WriteLine("Running-bite");
            break;
        case AnimalType.Manticore: 
            Console.WriteLine("Eat people");
            break;
    }
}

Clearly a better design from an OOP perspective would be an Animal base class/interface and an Attack() method on that, overridable by children/implementers. This design has the advantage of requiring less code, scaling better, and conforming to the Open/Closed principle–if you want to add some other animal later, you just add a new class to the code base and probably tweak a factory method somewhere and you’re done.

This method isn’t really that bad, though, compared to how bad it could be. The design is decidedly procedural, but its consequences aren’t far reaching. From an execution perspective, the switch statement is hidden as an implementation detail. If you were to isolate the console (or wrap and inject it), you could unit test this method pretty easily. It’s ugly and it’s a smell, but it’s sort of like seeing an ant crawling in your house–a little icky and potentially indicative of an infestation, but sometimes life deals you lemons.

But what about this:

public string Attack(Animal animal)
{
    switch (animal.Type)
    {
        case AnimalType.Cat:
            return GetAttackFromCatModule();
        case AnimalType.Dog:
            return GetAttackFromDogModule();
        case AnimalType.Wildebeest:
            return GetAttackFromWildebeestModule();
        case AnimalType.Landshark:
            return GetAttackFromLandsharkModule();
        case AnimalType.Manticore:
            return GetAttackFromManticoreModule();
    }
}

Taking the code at face value, this method figures out what the animal’s attack is and returns it, but it does so by invoking a different module for each potential case. As a client of this code, your path of execution can dive into any of five different libraries (and more if this ‘pattern’ is followed for future animals). The fanout here is out of control. Imagine trying to unit test SavageAntthis method or isolate it somehow. Imagine if you need to change something about the logic here. The violence to the code base is profound–you’d be changing execution flow at the module level.

If the first switch state was like an ant in your code house, this one is like an ant with telephone poles for legs, carving a swath of destruction. As with that happening in your house, the best short-term strategy is scrambling to avoid this code and the best long-term strategy is to move to a new application that isn’t a disaster zone.

Please be careful with switch statements that you use. Think of them as ants crawling through your code–ants whose legs can be tiny ant legs or giant tree trunk legs. If they have giant tree trunk legs, then you’d better make sure they’re the entirety of your application–that the “ant” is the brains ala an Ioc container–because those massive swinging legs will level anything that isn’t part of the ant. If they swing tiny legs, then the damage only occurs when they come in droves and infest the application. But either way, it’s helpful to think of switch statements as ants (or millipedes, depending on the number of cases) because this forces you to think of their execution paths as tendrils fanning out through your application and creating couplings and dependencies.

By the way, if you liked this post and you're new here, check out this page as a good place to start for more content that you might enjoy.

By

Write Once, Confuse Everywhere

Not too long ago, someone asked me to take a look at a relatively simple application designed with the notion of having some future flexibility in mind. I poked around a bit to see what was where and found a reasonably simple and straightforward application. Like any, it had its warts, but there were no serious surprises–no whiplash-inducing double-takes or WTFs. One thing that I did notice, however, was that there were a bunch of classes that inherited from GUI components and had no implementations at all. So instead of using a TextBox, you might use a “DoNothingTextBox” with class definition DoNothingTextBox : TextBox and no implementation. (It was not actually called “DoNothingTextBox”–that’s just for illustration purposes.)

ConfusedI puzzled over the purpose of these things for a while until I inspected a few more and saw one with some code in it. I saw the purpose then immediately: hooks for future functionality. So if, for example, it were later decided at some point that all TextBoxes should disallow typing of non-alphanumeric characters, the behavior could be implemented in one place and work everywhere.

On the one hand, this does seem like a clever bit of future-proofing. If experience tells you that a common need will be to make every instance of X do Y, then it stands to reason you should make it easy and minimally invasive to touch every X. On the other hand, you’re quite possibly running afoul of YAGNI and definitely running afoul of the Open/Closed Principle by creating classes that you may never use with the specific intention of modifying them later. Also to consider is that this creates a weird flipping of what’s commonly expected with inheritance; rather than creating a derived class to extend existing functionality when needed, you speculatively create an ancestor for that purpose.

Is this a problem? Well, let’s think about how we would actually achieve this “change everything.” With the normal pattern of abstracting common functionality into a base class for reuse, the mechanism for the reuse is derived classes deferring to parent’s centralized implementation. For instance:

public class Bird
{
public virtual void Reproduce()
{
Console.Write("I laid an egg!");
}
}

public class Ostrich : Bird
{
public override void Reproduce()
{
base.Reproduce();
Console.Write("... and it was extremely large as far as bird eggs go.");
}
}

public class Parrot : Bird
{

}

Notice that since all birds reproduce by laying eggs, that functionality has been abstracted into a base class where it can be used everywhere and need not be duplicated. If necessary, it can be extended as in the case of Ostrich, or even overridden (which Ostrich could do by omitting the base class call), but the default is what Parrot does: simply use the common Reproduce() method. The bird classes have a default behavior that they can extend or opt out of, assuming that the base class is defined and established at the time that another bird class extends it.

Aha! This is an interesting distinction. The most common scenario for inheritance schemes is one where (1) the base class is defined first or (2) some duplicated piece of functionality across inheritors is moved up into a base class because it’s identical. In either case, the common functionality predates the inheritance relationship. Birds lay eggs, and part of deciding to be a bird (in the sense of writing a class that derives from Bird) is that default egg-laying behavior.

But let’s go back to our speculative text boxes. What does that look like?


public class DoNothingTextBox : TextBox
{

}

Now let’s say that time goes by and the developers all diligently stick to the architectural ‘pattern’ of using DoNothingTextBox everywhere. Life is good. But one day, some project management stakeholder tells one of the developers more on the UI side of things that all of the text boxes in the application should be green. The developer ponders that for a bit and then says, “I know!”

public class DoNothingTextBox : TextBox
{
public DoNothingTextBox() : base()
{
BackColor = Color.Green;
}
}

“Done and done.” He builds, observes that all text boxes are now green, checks in his changes, and takes off for the day to celebrate a job well done. Meanwhile, a handful of other developers on the team are updating to the latest version of the code to get an unrelated change they need. Each of them pulls in the latest, develops for a bit, launches the app to check out their changes, and, “what the… why is this text box green–how did I manage that?” Their troubleshooting progression probably starts with rolling back various changes. Then it winds its way into the version history of CSS files and styling mechanisms; stumbles into looking at the ASP markup and functionality of the various collaborators and controls on the web control/page; and, only after a great deal of frustration, cursing, and hair-tearing, winds up in the dusty, old, forgotten, formerly-empty DoNothing class.

I submit that the problem here is a subtle but profound one. As I’ve mentioned before, inheritance, by its nature, extends and sometimes modifies existing functionality. But this framework for building out and expanding software relies upon the fact that base classes are inherently more fixed and stable than child inheritors and that the relationship between child classes and base classes is well-defined and fixed at the time of inheritance/extension. To put it more concretely, OO developers will intuitively understand a how to use a “base” bird that lays eggs–they won’t intuitively understand how to use a “base” bird that does nothing and then later, spontaneously turns every bird on earth green. Changes to a base class are violent and confusing when they alter the behavior of inheritors while leaving the inheritors untouched.

So I’d encourage you to be extremely careful with using speculative inheritance structures. The instinct to create designs where potential sweeping changes can be implemented easily is an excellent one, but not all responses to that instinct are equally beneficial. Making a one line code change is certainly quick and creates a minimum of code upheaval for the moment, but once all of the time, effort, and hacking around during troubleshooting by other developers is factored in, the return isn’t quite so good anymore. Preserve that instinct, but channel it into better design decisions. It’s just as important to consider how broadly confusing or unintuitive a change will be as it is to consider how many lines of code it takes and how many files need to be touched.

By

Getting Too Cute with C# Yield Return

I ran across a method that returned an IEnumerable<T> recently, and I implicitly typed its return value. During the course of a series of method extractions, code movement, and general refactoring, I wound up with some code that passed the various unit tests in place but failed curiously at runtime. After peering at it for a few minutes and going through once in the debugger, I traced it to a problem that you don’t see every day, and one that probably would have had me tearing my hair out if I didn’t have a good working understanding of what the “yield” keyword in C# does. So today, I’ll present the essence of this problem in the hopes that, if you weren’t aware of it, you are now.

CuteYieldReturn

Here is an entire class that contains a nested type and a couple of methods, for illustration purposes. At the bottom is a unit test that will, if you copy this into your scratchpad, fail.

public class MiscTest
{
    public class Point
    {
        public int X { get; set; }
        public int Y { get; set; }
    }

    private IEnumerable GetPoints()
    {
        for (int index = 1; index < 20; index++)
            yield return new Point() { X = index, Y = index * 2 };
    }

    private void DoubleXValue(IEnumerable points)
    {
        foreach (var point in points)
            point.X *= 2;
    }

    [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
    public void Asdf()
    {
        var points = GetPoints();
        DoubleXValue(points);
            
        Assert.AreEqual(2, points.ElementAt(0).X);
    }
}

It seems pretty straightforward. You have some method that returns a bunch of points, and then you take those points and pass them to a method that iterates through them, performing an operation on each one. So what gives? Why does this fail? Everything looks pretty simple (unlike my situation, where this became removed through a few layers of indirection), and yet we get back 1 when we’re expecting 2.

To understand this, it’s important to understand what yield actually does. At its core, the yield keyword is syntactic sugar that tells the compiler to generate a state machine under the hood. Let that sink in for a moment, because it’s actually kind of a wild concept. You’re used to methods that return references to object instances or primitives or collections, but this is something fundamentally different. A method that returns an IEnumerable and does so using yield return isn’t defining a return value–it’s defining a protocol for interacting with client code.

Consider the code example above. The obvious (and, as it turns out, wrong) way to understand the GetPoints() method is, “it generates a collection of points from (1, 2) to (19, 38) and returns it.” But GetPoints() doesn’t return any such thing. In fact, it doesn’t return anything but a promise–a promise to generate points later if asked. So when we say “var points = GetPoints();” what we’re actually saying is, “the points variable references some kind of points machine that will generate points when I ask for them.”

If we think of it this way, we start to get to the bottom of what’s going wrong here. On the next line, we pass this oracle into the DoubleXValue() method. The DoubleXValue() method iterates through all of the states of the points (state) machine, retrieving points as per the promise. Once it retrieves the point, it does something to the X coordinate and then promptly discards the point. Why? Because nothing else refers to it. When you change one of the points that the points machine spits out, you’re not changing anything about the points machine–you’re not feeding it some kind of new mechanism for point generation. You could think of this as being similar to a method that takes a class factory, requests a bunch of instances from it, modifies them, and then returns. Nothing about the factory is different, and you wouldn’t expect the factory to behave differently if the caller subsequently passed it to another method.

So once the DoubleXValue() method gets done doing, well, nothing of significance, the Assert() call requests the first sequential element–the first state–from the points machine. The points machine dutifully spits out its first state, (1, 2), and the unit test fails. So how do we get it to pass? Well, here’s one way:

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Asdf()
{
    var points = GetPoints().ToList();
    DoubleXValue(points);
            
    Assert.AreEqual(2, points.ElementAt(0).X);
}

Notice the added ToList() call. This is very important because it means that we’re no longer storing a reference to some kind of points machine but rather to a list of points. This line now says, “Go get me a points machine, iterate through all the states of it, and store those states locally in a list.” Now, the rest of the code behaves in a way that you’re used to because you’re storing an actual, tangible collection instead of a promise to generate a sequence.

There is no shortage of posts, documents, and articles explaining the yield return state machine concept or the idea of deferred execution. I encourage you to read those to get a better understanding of the inner mechanics and usage scenarios, respectively. But hopefully this gives you a bit of practical insight that’s easy to wrap your head around into (1) why the code behaves this way and (2) why you have to be careful of providing and consuming IEnumerables. It can be tempting to get too cute with how you provide IEnumerables or too careless with how you consume them, particularly when usage and implementation are separated by inversion of control. So be aware when using IEnumerables that you may not have a list/collection, and be aware when providing them that you’re leaving it up to your clients to decide when to get and store sequence members.

By the way, if you liked this post and you're new here, check out this page as a good place to start for more content that you might enjoy.