Stories about Software

Jul
3

Proposal: A Law of Performance Citation

Category: Language Agnostic Tags: C#, Performance | 16 Comments

I anticipate this post being fairly controversial, though that’s not my intention. I imagine that if it wanders its way onto r/programming it will receive a lot of votes and zero points as supporters and detractors engage in a furious, evenly-matched arm-wrestling standoff with upvotes and downvotes. Or maybe three people will read this and none of them will care. It turns out that I’m actually terrible at predicting which posts will be popular and/or high-traffic. And I’ll try to avoid couching this as flame-bait because I think I actually have a fairly non-controversial point on a potentially controversial subject.

To get right down to it, the Law of Performance Citation that I propose is this:

If you’re going to cite performance considerations as the reason your code looks the way it does, you need to justify it by describing how a stakeholder will be affected.

By way of an example, consider a situation I encountered some years back. I was pitching in to help out with a bit of programming for someone when I was light on work, and the task I was given amounted to “copy-paste-and-adjust-to-taste.” This was the first red flag, but hey, not my project or decision, so I took the “template code” I was given and made the best of it. The author gave me code containing, among other things, a method that looked more or less like this (obfuscated and simplified for example purposes):

public void SomeMethod()
{
    bool isSomethingAboutAFooTrue = false;
    bool isSomethingElseAboutAFooTrue = false;
    IEnumerable foos = ReadFoosFromAFile();
    for (int i = 0; i < foos.Count(); i++)
    {
        var foo = foos.ElementAt(i);
        if (IsSomethingAboutAFooTrue(foo))
        {
            isSomethingAboutAFooTrue = true;
        }
        if (IsSomethingElseAboutAFooTrue(foo))
        {
            isSomethingElseAboutAFooTrue = true;
        }
        if (isSomethingAboutAFooTrue && isSomethingElseAboutAFooTrue)
        {
            break;
        }
    }

    WriteToADatabase(isSomethingAboutAFooTrue, isSomethingElseAboutAFooTrue);
}

I promptly changed it to one that looked like this for my version of the implementation:

public void SomeMethodRefactored()
{
    var foos = ReadFoosFromAFile();

    bool isSomethingAboutOneOfTheFoosTrue = foos.Any(foo => IsSomethingAboutAFooTrue(foo));
    bool isSomethingElseABoutOneOfTheFoosTrue = foos.Any(foo => IsSomethingElseAboutAFooTrue(foo));

    WriteToADatabase(isSomethingAboutOneOfTheFoosTrue, isSomethingElseABoutOneOfTheFoosTrue);
}

I checked this in as my new code (I wasn't changing his existing code) and thought, "he'll probably see this and retrofit it to his old stuff once he sees how cool the functional/Linq approach is." I had flattened a bunch of clunky looping logic into a compact, highly-readable method, and I found this code to be much easier to reason about and understand. But I turned out to be wrong about his reaction.

When I checked on the code the next day, I saw that my version had been replaced by a version that mirrored the original one and didn't take advantage of even the keyword foreach, to say nothing of Linq. Bemused, I asked my colleague what had prompted this change and he told me that it was important not to process the foos in the collection a second time if it wasn't necessary and that my code was inefficient. He also told me, for good measure, that I shouldn't use var because "strong typing is better."

I stifled a chuckle and ignored the var comment and went back to look at the code in more detail, fearful that I'd missed something. But no, not really. The method about reading from a file read in the entire foo collection from the file (this method was in another assembly and not mine to modify anyway), and the average number of foos was single digits. The foos were pretty lightweight objects once read in, and the methods evaluating them were minimal and straightforward.

Was this guy seriously suggesting that possibly walking an extra eight or nine foos in memory, worst case, sandwiched between a file read over the network and a database write over the network was a problem? Was he suggesting that it was worth a bunch of extra lines of confusing flag-based code? The answer, apparently, was "yes" and "yes."

But actually, I don't think there was an answer to either of those questions in reality because I strongly suspect that these questions never crossed his mind. I suspect that what happened instead was that he looked at the code, didn't like that I had changed it, and looked quickly and superficially for a reason to revert it. I don't think that during this 'performance analysis' any thought was given to how external I/O over a network was many orders of magnitude more expensive than the savings, much less any thought of a time trial or O-notation analysis of the code. It seemed more like hand-waving.

It's an easy thing to do. I've seen it time and again throughout my career and in discussing code with others. People make vague, passing references to "performance considerations" and use these as justifications for code-related decisions. Performance and resource consumption are considerations that are very hard to reason about before run-time. If they weren't, there wouldn't be college-level discrete math courses dedicated to algorithm runtime analysis. And because it's hard to reason about, it becomes so nuanced and subjective in these sorts of discussions that right and wrong are matters of opinion and it's all really relative. Arguing about runtime performance is like arguing about which stocks are going to be profitable, who is going to win the next Super Bowl, or whether this is going to be a hot summer. Everyone is an expert and everyone has an opinion, but those opinions amount to guesses until actual events play out for observation.

Don't get me wrong -- I'm not saying that it isn't possible to know by compile-time inspection whether a loop will terminate early or not, depending on the input. What I'm talking about is how code will run in complex environments with countless unpredictable factors and whether any of these considerations have an adverse impact on system stakeholders. For instance, in the example here, the (more compact, maintainable) code that I wrote appears that it will perform ever-so-slightly worse than the code it replaced. But no user will notice losing a few hundred nano-seconds between operations that each take seconds. And what's going on under the hood? What optimizations and magic does the compiler perform on each of the pieces of code we write? What does the .NET framework do in terms of caching or optimization at runtime? How about the database or the file read/write API?

Can you honestly say that you know without a lot of research or without running the code and doing actual time trials? If you do, your knowledge is far more encyclopedic than mine and that of the overwhelming majority of programmers. But even if you say you do, I'd like to see some time trials just the same. No offense. And even time trials aren't really sufficient because they might only demonstrate that your version of the code shaves a few microseconds off of a non-critical process running headlessly once a week somewhere. It's for this reason that I feel like this 'law' that I'm proposing should be a thing.

Caveats

First off, I'm not saying that one shouldn't bear efficiency in mind when coding or that one should deliberately write slow or inefficient code. What I'm really getting at here is that we should be writing clear, maintainable, communicative and, above all, correct code as a top priority. When those traits are established, we can worry about how the code runs -- and only then if we can demonstrate that a user's or stakeholder's experience would be improved by worrying about it.

Secondly, I'm aware of the aphorism that "premature optimization is the root of all evil." This is a little broader and less strident about avoiding optimization. (I'm not actually sure that I agree about premature optimization, and I'd probably opt for knowledge duplication in a system as the root of all evil, if I were picking one.) I'm talking about how one justifies code more than how one goes about writing it. I think it's time for us to call people out (politely) when they wave off criticism about some gigantic, dense, flag-ridden method with assurances that it "performs better in production." Prove it, and show me who benefits from it. Talk is cheap, and I can easily show you who loses when you write code like that (hint: any maintenance programmer, including you).

Finally, if you are citing performance reasons and you're right, then please just take the time to explain the issue to those to whom you're talking. This might include someone writing clean-looking but inefficient code or someone writing ugly, inefficient code. You can make a stakeholder-interest case, so please spend a few minutes doing it. People will learn something from you. And here's a bit of subtlety: that case can include saying something like, "it won't actually affect the users in this particular method, but this inefficient approach seems to be a pattern of yours and it may well affect stakeholders the next time you do it." In my mind, correcting/pointing out an ipso facto inefficient programming practice of a colleague, like hand-writing bubble sorts everywhere, definitely has a business case.

Jul
1

By Erik Dietrich

I Don’t Know What X is on Line 48 And I Don’t Care

Category: Language Agnostic Tags: C# | 4 Comments

I was at a users’ group recently here in Chicago, and there were two excellent presenters. There were two very well done presentations: one about Xamarin and one about the C# language entitled “Underestimated C# Language Features” by John Michael Hauck. This was a very polished and appealing talk in which he wove together some important differentiators for the C# language, including closures, anonymous methods, and deferred execution. Given that he’s presented this at some conferences as well, I wasn’t surprised by the high caliber of the presentation.

The format was one in which he presented a series of increasingly difficult problems and asked what different variables were at different points of the program’s execution. For example, here’s something I just made up that would have looked at home in one of his many examples:

public void LetsSeeWhatHappens()
{
    int x = 100;
    Console.WriteLine(x);
    foreach (var f in GetSomeFuncs())
    {
        x++;
        Console.WriteLine(f(++x));
    }
    Console.WriteLine(x);
}

public IEnumerable> GetSomeFuncs()
{
    for (int index = 0; index < 3; index++)
        yield return i => index * i;
}

With each example like this, he’d ask the audience what they thought would be the output. And then the fun would begin. There’d always be several different answers and heated disagreement. After all, bragging rights and programmer cred was at stake. This was public gamification at its finest, and it reminded me vaguely of being in a sports bar and listening to people debating heatedly what the next play call would be in the Monday night football game:

Person 1: Third and 7. They have to pass!
Person 2: I think they’re going to call a draw.
Person 3: You’re nuts! That’s stupid! It’ll be a screen pass!
Person 1: Come on, you’ve been wrong every time!!!

I was entertained by this. John was an engaging presenter and the material interested me, but I discovered I had no interest in actually guessing, even to myself, to say nothing of out loud. At one point, a friend of mine that I was sitting with said, “who knows — write a unit test to figure it out,” and I certainly agreed with that point. But that wasn’t it. I think it was just that I was more interested in what the answer would teach me about the language and its nuance than I was in somehow testing myself. Frankly, this seemed like trivia to me and getting the right answer didn’t seem as important as taking the opportunity to hone my critical thinking skills.

Figuring this out, I realized it explained why I was impatient with all of the guessing that people were doing. John would ask people what they thought, and they would guess but then also start explaining their reasoning and arguing with one another. This struck me as boring and inefficient. “Just shut up and let him hit F10, and we’ll have the answer,” I thought. In that moment, I also realized that a very mild pet peeve of mine has always been code reviews or other places where people argue over runtime behavior. I think to myself, “You know what’s better than all of us at being the .NET runtime? The .NET runtime! So let’s ask it.”

After the presentation, I went out for a bite to eat with some friends. There, one of them made an excellent point. When we were discussing the idea of sitting around and figuring out what code did, he said (paraphrased), “if I show code to a handful of competent developers and they can’t agree on what it will do at runtime, then that code needs to be re-written.” I thought this was perfect and couldn’t agree more. The combination of letting the runtime stick to figuring out what things will be at runtime and the idea that well-written code shouldn’t make this a mystery really sort of drove home why this whole exercised seemed to me (and really is) purely an academic thought-drill.

By all means, exercise your brain and solve riddles involving programming. But I don’t think that this type of activity should be the centerpiece of work-place conversations, evaluations of code, or especially interviews. If I’m interviewing people and asking them what X is on line 48, before even the guy who gets it right, I’m going to hire the guy that says, “let’s write a unit test and find out.”

Jun
26

By Erik Dietrich

Please Don’t Recycle Local Variables

Category: Language Agnostic Tags: C#, OOP, Software Engineering | 18 Comments

I think there’s a lot of value to the conservation angle of the green movement. In general, it’s a matter of efficiency–if you can heat/light/whatever your house with the same quality of life, using less energy and fewer resources, that’s a win for everyone. This applies to a whole lot of things beyond just eco-concerns, however. Conserving heat when you’re cold, conserving energy when you’re running a marathon, conserving your dollars when making a budget–all good ideas. Cut down, conserve, reuse when you can.

Except please don’t do it with your local variables. For example:

public void DoSomeStuff()
{
    int count = 0;

    foreach (var customer in Customers)
    {
        DoSomeStuffToCustomer(customer);
        if (customer.DidTheRightStuffHappen)
            count++;
    }

    Console.WriteLine(count);
    count = 0;

    foreach (var machine in Machines)
    {
        DoSomeStuffToMachine(machine);
        if (machine.DidTheRightStuffHappen)
            count++;
    }

    Console.WriteLine(count);
}

Here, we initialize a local variable, count, and use it to keep track of the results of some processing of customers. When we’re done, we reset count and use it to keep track of the apparently unrelated concept of machines. What I’m saying is that there shouldn’t be just one count, but rather customerCount and machineCount.

Does this seem like nitpicking? You could certainly make that argument, but this code is not going to age well. First of all, this method should clearly be two methods, so we’re starting right off the bat with a bit of technical debt. It would be cleaner if each loop had its own method.

But an interesting thing happens if we use the refactoring tools to try to do that–the refactoring tool wants return values or input parameters. Yikes, that was unexpected, so we just move on. Later, when the time comes to iterate over movies, we see that there’s a ‘design pattern’ in place, so we modify the code to look like this:

public void DoSomeStuff()
{
    int count = 0;

    foreach (var customer in Customers)
    {
        DoSomeStuffToCustomer(customer);
        if (customer.DidTheRightStuffHappen)
            count++;
    }

    Console.WriteLine(count);
    count = 0;

    foreach (var machine in Machines)
    {
        DoSomeStuffToMachine(machine);
        if (machine.DidTheRightStuffHappen)
            count++;
    }

    Console.WriteLine(count);
    count = 0;

    foreach (var movie in Movies)
    {
        DoSomeStuffToMovie(movie);
        if (movie.DidTheRightStuffHappen)
            count++;
    }

    Console.WriteLine(count);
}

Now this thing should really be split up, so we start selecting parts of it to see what we can refactor. Ew, now we’re getting ref parameters to boot. This thing is getting even more painful to try to refactor, and we’re in a hurry, so no time for that. And to make matters worse, if you add in a few other aggregator variables this way, you’ll start to have all kinds of barriers in place when you want to pull this thing apart, such as crazy sets of out parameters. I’ve posted before about how I feel about ref and out.

All of this mounting technical debt could easily be avoided by giving each loop its own count variable. Having them recycle the same one creates a compile-time dependency of what’s going on in each loop with what happened in the loop before, even though there are no other similar dependencies in evidence. In other words, recycling this local variable is the only thing that’s creating a coupling in your code–there’s no logical reason to do it.

This is the height of procedural programming and baking in temporal dependencies that I cautioned you to avoid here. It’s a completely useless dependency that will inhibit refactoring and dirty up your code in a hurry. It may not seem like much yet, but this will be a huge pain point later as the lines of code in this method balloon from the dozens to the hundreds, and you rely heavily on automated tools to help with cleanup. Flag variables used over and over in sequence throughout a method are like pebbles in your shoe when you’re trying to refactor.

So my advice is to avoid this practice completely. There’s really no advantage to coding this way and the potential downside is enormous.

Jun
24

By Erik Dietrich

Introduction to Unit Testing Part 4: Design New Code For Testability

Category: Language Agnostic Tags: C#, Starting To Unit Test, Unit Testing | 11 Comments

In the last post in this series, I covered what I think of as an important yet seldom discussed subject: how not to overwhelm yourself and get discouraged when you’re starting to unit test. In the post before that, I showed the basics of writing a unit test. But this leaves something of a gap. You can now write a unit test in a vacuum for an extremely simple class, and, when looking at your legacy code base, you know what to avoid. But you don’t necessarily how to write a non-trivial application with unit tests.

You might understand now how to write tests, assuming that you have some disconnected new class in your code base without dependencies and barriers to testing, but you wonder how to get to that point in the first place. I mean, your application doesn’t seem to need a prime number finder or a bowling score calculator. It needs you to add lines of code to existing methods or new methods to existing classes. It doesn’t really seem to need new classes, so the initial momentum and resolution you’ve built reading these first three posts to go and be a unit tester sort of fizzles anticlimactically when you stare at your code base.

What’s going on here?

Recognizing Inhospitable Terrain

The first thing to understand is that your code base probably wasn’t written with testability in mind. There’s nothing wrong with you for not being able to see where unit testing fits in, because it doesn’t. Last time I talked about things you’ll see that will torpedo your efforts to test a particular method or piece of code, but let me speak to some entire architectures and patterns that don’t lend themselves to testability. It’s not that code written with these technologies and patterns can’t be tested–it’s just that it won’t be easy for you. As you read through this, you might feel like I’ve read your code base’s mind.

Active Record as an architectural pattern
This is a pattern in which you create classes that are in-memory representations of database tables, views, or stored procedures. If you see in your code base a class called “Customer” that has methods like “GetById()”, “Update()” and “MoveNext()” you’ve got yourself an Active Record architecture. This architecture tightly couples your database to your domain logic and your domain logic to the rules for navigating through domain objects. You can’t test any of these objects since any operation you perform on them sends them scurrying off to create database connections and parameterized queries and all manner of other untestable stuff. And since decomposition and decoupling is the path toward unit testing, this sort of tight coupling of everything in your code is the path away from it.
Winforms
Winforms in the .NET world are tried and true when it comes to rapidly cranking out functional little applications, but you have to work really, really hard to make code that uses them testable. Q&A sites are littered with people trying to understand how to make Winforms testable, which should tell you that making them testable is not trivial. If you have Winforms and Active Record both in the same code base, at least the architecture is split into two concerns. But it’s split into two thoroughly untestable ones.
Webforms
See Winforms. Webforms is very similar in terms of framework testability, and for pretty similar reasons. Webforms is arguably even harder to test, however, because it’s predicated on spewing out reams and reams of HTML, CSS, and Javascript while allowing you to pretend you’re writing a desktop app. I’ve talked about my opinion of this technology before.
Wizard/markup-reliant code
Do you use the Webforms grid wizard thing to generate your grids? Do you define object data sources, such as DB connections or files, in the markup? Practices like these are the epitome of quick and dirty, rapid-prototyping implementations that hopelessly cross couple your applications beyond all testability. If this is something that’s done in your group/code base, testing is basically a non-starter until you go in a different architectural direction.
Everything in your application is in a user control/form
I’ve seen this called “Smart UI” and it basically means that there’s absolutely no separation of concerns in your code. The UI elements create database connections, write to files, implement business rules–they do everything. Code like this is impossible to unit test.

If any of this is sounding familiar, your task might be daunting. I have my own preferences, but I’m trying not to offer a value judgment here as much as I’m letting you know what you’re up against. I’m like a mortgage broker that’s saying to you, “if you want to own a home, that’s a great goal. But if you are eight months behind on your rent and have no personal savings, you’re going to have some work to do first.” If I’ve described your code base in the list above, you face different challenges than a green field developer. And since you’ve presumably been contributing to these code bases, you’re probably very used to implementation techniques that don’t result in testable code. You’re going to need to change your thinking and your coding practices in order to start writing testable code.

Once we’ve discussed how to get you writing testable code, I’ll come back to these macroscopic concerns and give some pointers for how to improve the situation. But, for now, on to a revised approach to coding. The following holds true whether you’re banging away at some legacy Winforms/Active Record application or starting a brand new MVC 4 site.

Add New Methods and Classes First, Ask Questions Later

First thing to abide by is to favor adding new things to the code over modifying existing things in the code. You may have heard this before in the context of the “Open/Closed Principle”, but that’s a guide for how to write your classes. (Basically, it admonishes you to write classes that others can extend and override rather than change.) I’m talking about how to deal with existing code bases. To put it simply and bluntly, it’s a lot easier to both code and test brand spankin’ new classes than to write and test changes to existing ones. We all know this. It’s at the heart of why we as developers always lean toward rewriting others’ code instead of understanding and working with it.

Now, this might not win you friends. In shops where people tend to write procedural code (you know, the kind you’re trying to get away from writing), they seem to have some weird fear of creating too many classes and masochistic attachment to monolithic structures. You might have to compromise or practice on your own if you run afoul of the project’s architect, but the exercise is invaluable. It’s going to propel you toward decoupling as a default rather than an exception. Doing new things? Time for a new class and some unit tests.

I know what you’re thinking: “But what if the thing that needs to be done has to be done in the middle of some method somewhere?” Well, instantiate your new class at that point and use it. “But what if it needs a bunch of fields from the class it’s in and variables from the method?” Pass them in through the constructor or method call. “But won’t that make my design bad?” It already is bad, but at least now you’re making part of it testable. When your code is testable and under test, everything is easier to fix later.

You’ll have to use some discretion, obviously, but shift your attitude here. Don’t look at a project that has nothing but .aspx files and their code behind and hide classes in there for fear of breaking with tradition. Boldly add pure .cs files to the code base. Add unit tests for those .cs files you’re creating. (Apologies to Java readers, but this has no real Java equivalent that I can think of having used. Plus, the Java stack in general seems not to have the same level of untestable cruft built into frameworks.) It’s a lot harder for someone, even the project architect, to give you a hard time if your new way of doing things is covered by unit tests. Even if they’re hostile Expert Beginners, they’re hard pressed not to sound silly if they say, “we don’t do that here.” You’ll at least have a better chance of pushing this change through with the unit tests than without them.

Ask Questions in the Right Order: What, How, When

Now that you’re creating a lot of new classes and instantiating them in the old untestable ones, it’s time to start working on what kind of code you write. If you’ve practiced and come back, I suspect that you’re starting to be able to write a few useful tests but are perhaps still struggling. And I bet it’s because the line between where the old class ends and your new one begins is a little hazy. Maybe they share some common fields. Maybe when you instantiate the class you’re testing, you hand it a “this” reference so that it can go picking through the properties on the untestable behemoth from which you’re escaping. This is the next thing we need to tighten up–stop doing that. A clear, concise division of labor between the classes is necessary, and it’s not possible if they share all of the same fields, properties, state, etc. That’s like a break up where you two continue to live together, share a car, and go to the movies on weekends.

The best way to achieve this clean split is with good abstraction, and the best way to do that is to remember “what, how, when.” When it’s time to change the code base, remember that you want to favor creating a new class. But before you do that, ask yourself “what?” but not the other two questions. What should I name this class? What should it do? What should it expose as its public methods? Don’t start thinking about how those methods or classes should work and don’t you dare start thinking about when anything at all should happen–just think about what. Give it a good name that defines a clear purpose, and then give it a good set of methods and properties that draw attention to why it’s a different concept than the class that will be using it. Ask yourself what the boundaries between the two classes will be so that you can minimize the amount of shared information. And now, start stubbing out those methods with no implementation and start stubbing out some test methods with names that say what the methods will do.

At this point, you’re ready for “how.” Start actually implementing the “what” and testing that your implementation works in the unit tests. Believe me, it’s much, much easier to implement methods this way. If you think about “what” and “how” at the same time, you start writing confusing code that not even you have faith in when you’re done. Implementation alone is much easier when you have a clear picture of “what,” and unit testing is a breeze.

Once everything is implemented, you can start thinking about “when.” When should you instantiate the class and when should you call its methods? But don’t spend too much time with “when” in your head because it’s dangerous. Does that sound weird? Let me explain.

“When Code” is Monolithic Code

Picture two methods. One is a 700 line juggernaut with more control flow statements than you can keep track of without a spreadsheet. The other is five lines long, consisting only of an initialize statement, a foreach, and a return statement. With which would you rather work? I imagine the response is unanimous here. Even if you tend to crank out these kinds of large methods, when you step through the debugger, looking for the cause of a bug and finding yourself in some huge method, your heart sinks and you settle in with snacks and caffeine because it’s going to be a long day.

Now with these two methods in mind, imagine if we were pair programming together and I simply asked “when?” With the tiny method, you’d probably say, “What do you mean by ‘when’–I mean, you initialize before the loop and you return when you find the record you’re looking for. What a weird question!” With the other method, you’d probably affect a thousand-yard stare and say, “Man, I don’t even know where to begin.” But the answer to that question would fill pages. Books. Because you set the first loop counter j equal to the third loop counter k about twenty lines before the fourth try-catch and fifteen lines before you set the middle loop counter j equal to four. Unless, of course, you threw that exception up on line 2090, in which case j might never have been initialized. Er, wait, I think that happened somewhere near the fifth while loop in that else condition up there. Oh, there’s so much “when,” but it’s all slammed together in a method where you can’t possibly test any of it. Lots of thinking about “when” breeds huge methods like a Petri dish for bacteria.

“When” code is procedural at its core, and procedural, “when” code is an anathema to object-oriented unit testing, which is all about “what” and “how.” Remember earlier in the series when I said that multi-threaded code was really, really hard to test? Well, that’s just a subset of an idea called “temporal coupling,” and what we’re talking about here also falls under that umbrella. Temporal coupling is what happens when things have to be executed in a specific order or else they do not work.

Imagine that you’re coding up a model for someone’s day. When you think of how to do this, do you think, “first he gets up, then he brushes his teeth, then he showers, then he puts on his clothes, then… then he comes home, then he eats dinner, then he watches TV, then he goes to bed?” Do you code this up with constructs like:

public void GoAboutMyDay()
{
    bool wokeUp = WakeUp();
    if (wokeUp == true)
    {
        bool showered = Shower();
        if (showered == true)
        {
            bool putOnClothes = PutOnClothes();
            if(putOnClothes)
                //You get the idea
        }
    }
}

This method is all about “when.” It’s entirely procedural, and it’s going to be horrifying when it’s complete. When you get into the fortieth nested if condition, maybe someone will come along and flatten it out with a bunch of inverted early returns. Or maybe not, because maybe some of if clauses start sprouting else conditions with loops in them. And maybe the methods being called start communicating with one another via boolean flag fields in the class. Who knows–this thing is on the precipice of becoming unstoppable. It might just achieve sentience at some point, so that when you try to start deleting conditionals, it says, “I can’t let you do that, Dave,” and puts them back.

The root problem behind it all is the “when” and the procedural thinking because you’re orienting the implementation around the order of the activities rather than the nature of the activities. Unit testing is all about deconstructing things into their smallest possible chunks and asserting things about those chunks. Temporal coupling and “when” logic is all about chaining and fusing things together.

If you were thinking about “what” first here, you would form a much different mental model of a person’s day. You’d say things to yourself like, “well, during the course of a person’s day, he probably wakes up, gets dressed, eats breakfast–well, actually eats one or more meals–maybe works if it’s a weekday, goes to bed at some point,” etc. Whereas in the procedural “when” modeling you were necessarily building a juggernaut method, here you’re dreaming up the names of methods and/or classes that can be unit tested separately and in isolation. It’s no reach to say, “okay, let’s have a Meal class that will have the following methods…”

Only at the end will you decide “when.” You’ll decide it after you’ve stubbed things out with the “what” and implemented/unit tested them with the “how.” “When” is a detail that you should allow yourself to figure out at any point down the line. If you nail down what and how, you will have testable, modular, and manageable code as you create your classes.

Other Design Considerations for Your New Classes

I’ll wrap up here with a few additional tips for creating testable designs when adding code-to-code bases:

Avoid using fields to communicate between your methods by setting flags and tracking state. Favor having methods that can be executed at any time and in any order.
Don’t instantiate things in your constructor. Favor passing them in (we’ll talk about this in detail in a future post in the series).
Similarly, don’t have a lot of code or do a lot of work in your constructor. This will make your class painful to setup for test.
In your methods, accept parameters that are as decomposed as possible. For instance, don’t accept a Customer object if all you do with it is read its SSN property. In that case, just ask for the SSN.
Avoid writing public static methods. These are easy enough to test (often), but they start introducing testability problems when you write code that uses them. (This might be hard to swallow at first, but mull over the idea of simply not using static methods anymore.)
The earlier you start writing your unit tests, the better. If you find that you’re having a hard time testing your new code, it’s more likely a problem with the code than with unit testing it, and if you write tests early, you’ll discover these problems before you get too far and fix them.

This post has covered ways to write unit tests “from here forward” and ways to stop adding untested code to code bases. In the next post, I’ll talk about how to start getting the legacy code under test.

Addendum: Mitigating the Hostile Test Environments

Finally, as promised, here are ways to accommodate testing in the less-than-ideal architectures mentioned above, if you’re curious or want to do some more research:

Instead of Active Record, look at some kind of ORM solution like NHibernate or Entity Framework. These are tools that generate all of the code for you to access the database so that you don’t have to worry about testing that code and you can focus on writing only your (testable) domain code. Barring that, try to separate the three concerns of Active Record objects: modeling the database, connecting to the database, and modeling a domain object. The first concern adds no value, and the second two can be broken out into separate objects where the only thing hard to test is the actual database access.
Instead of Winforms, favor WPF when possible. If that isn’t possible, see if you can use the Model-View-Presenter (MVP) pattern to move as much logic out of the untestable code-behind as possible.
To be blunt, from a testing/decoupling perspective, Webforms is a disaster. You can have some limited success by adopting a more passive binding model and moving as much code out of the code-behind as possible, but it’s all pretty awkward. Webforms really seems more about rapid-prototyping and Microsoft-Accessing web development than producing scalable, sophisticated architectures.
If you’re using wizards to generate your application’s architecture, cut it out. If you’re defining implementation details in markup, cut it out. Markup is for layout, not unit-testable business logic or state logic. If you depend on definitions in markup to drive your application’s behavior, you’re relying exorbitantly on a third-party framework, which is always extremely brittle from a testability perspective.
To fix Smart UI, you just have to factor toward a more decoupled architecture. Start pulling different concerns out of the user controls and forms and finding a home for them.

Jun
21

By Erik Dietrich

Born to Exclude: Beware of Monoculture

Category: The Life of a Programmer Tags: Metaphors for Software, Soft Skills |

Understanding the Idea of Monoculture

One morning last week, I was catching up on backlogged podcasts in my Doggcatcher feed on the way to work and was listening to an episode of Hanselminutes where Scott Hanselman interviewed a front end web developer named Garann Means. The subject of the talk was what they described as “developer monoculture,” and they talked about the potentially off-putting effect on would-be developers of asking them to fill a sort of predefined, canned “geek role.”

In other words, it seems as though there has come to be a standard set of “developer things” that developers are expected to embrace: Star Trek, a love of bacon (apparently), etc. Developers can identify one another in this fashion and share some common ground, in a “talk about the weather around the water cooler” kind of way. But this policy seems to go beyond simply inferring a common set of interests and into the realm of considering those interests table stakes for a seat at the geek table. In other words, developers like fantasy/sci-fi, bacon, that Big Bang Theory show (I tried to watch this once or twice and found it insufferable), etc. If you’re a real developer, you’ll like those things too. This is the concept of monoculture.

The podcast had weightier issues to discuss that the simple “you can be a developer without liking Star Trek,” though. It discussed how the expected conformance to some kind of developer archetype might deter people who didn’t share those interests from joining the developer community. People might even suppress interests that are wildly disparate from what’s normally accepted among developers. Also mentioned was that smaller developer communities such as Ruby or .NET trend even more toward their own more specific monoculture. I enjoyed this discussion in sort of an abstract way. Group dynamics and motivation is at least passingly interesting to me, and I do seem to be expected to do or like some weird things simply because I write software for a living.

At one point in the discussion, however, my ears perked up as they started to discuss what one might consider, perhaps, a darker side of monoculture–that it can be deliberately instead of accidentally exclusionary. That is, perhaps there’s a point where you go from liking and including people because you both like Star Trek to disliking and excluding people because they don’t. And that, in turn, leads to a velvet rope situation: not only are outsiders excluded, but even those with similar interest are initially excluded until they prove their mettle–hazing, if you will. Garann pointed out this dynamic, and I thought it was insightful (though not specific to developers).

From there, they talked about this velvet-roping existing as a result of people within the inner sanctum feeling that they had some kind of ‘birthright’ to be there, and this is where I departed in what had just been general, passive agreement with the points being made in the podcast. To me, this characterization was inverted–clubhouse sitters don’t exclude and haze people because they of a tribal notion that they were born into an “us” and the outsiders are “them.” They do it out of insecurity, to inflate the value of their own experiences and choices in life.

A Brush with Weird Monoculture and what it Taught Me

When I first met my girlfriend, she was a bartender. As we started dating, I would go to the bar where she worked and sit to have a beer, watch whatever Chicago sports team was playing at the time, and keep her company when it was slow. After a while, I noticed that there was a crowd of bar flies that I’d see regularly. Since we were occupying the same space, I made a few half-hearted efforts to be social and friendly with them and was rebuffed (almost to my relief). The problem was, I quickly learned, that I hadn’t logged enough hours or beers or something to be in the inner circle. I don’t think that this is because these guys felt they had a birthright to something. I think it’s because they wanted all of the beers they’d slammed in that bar over the course of decades to count toward something besides cirrhosis. If they excluded newbies, youngsters, and non-serious drinkers, it proved that the things they’d done to be included were worth doing.

So why would Star-Trek-loving geeks exclude people that don’t know about Star Trek? Well, because they can, with safety in numbers. Also because it makes the things that they enjoy and for which they had likely been razzed in their younger days an asset to them when all was said and done. Scott talked about being good at programming as synonymous with revenge–discovering a superpower ala Spiderman and realizing that the tables had turned. I think it’s more a matter of finding a group that places a radically different value on the same old things that you’ve always liked doing and enjoying that fact. It used to be that your skills and interests were worthless while those of other people had value, but suddenly you have enough compatriots to reposition the velvet rope more favorably and to demonstrate that there was some meaning behind it all. Those games of Dungeons and Dragons all through high school may have been the reason you didn’t date until twenty, but they’re also the reason that you made millions at that startup you founded with a few like-minded people. That’s not a matter of birthright. It’s a matter of desperately assigning meaning to the story of your life.

Perhaps I’ve humanized mindless monoculture a bit here. I hope so, because it’s essentially human. We’re tribal and exclusionary creatures, left to our baser natures, and we’re trying to overcome that cerebrally. But while it may be a sympathetic position, it isn’t a favorable or helpful one. We can do better. I think that there are two basic kinds of monoculture: inclusive, weather-conversation-like inanity monoculture (“hey, everyone loves bacon, right?!?”) and toxic, exclusionary self-promotion. In the case of the former, people are just awkwardly trying to be friendly. I’d consider this relatively harmless, except for the fact that it may inadvertently make people uncomfortable here and there.

The latter kind of monoculture dovetails heavily into the kind of attitudes I’ve talked about in my Expert Beginner series of posts, where worth is entirely identity-based and artificial. I suppose I perked up so much at this podcast because the idea of putting your energy into justifying why you’ve done enough, rather than doing more, is fundamental to the yet-unpublished conclusion of that series of posts. If you find yourself excluding, deriding, hazing, or demanding dues-paying of the new guy, ask yourself why. I mean really, critically ask yourself. I bet it has everything to do with you (“I went through it too,” and, “it wouldn’t be fair for someone just to walk in and be on par with me”) and nothing to do with him. Be careful of this, as it’s the calling card of small-minded mediocrity.

DaedTech

Proposal: A Law of Performance Citation

Caveats

I Don’t Know What X is on Line 48 And I Don’t Care

Please Don’t Recycle Local Variables

Introduction to Unit Testing Part 4: Design New Code For Testability

Recognizing Inhospitable Terrain

Add New Methods and Classes First, Ask Questions Later

Ask Questions in the Right Order: What, How, When

“When Code” is Monolithic Code

Other Design Considerations for Your New Classes

Addendum: Mitigating the Hostile Test Environments

Born to Exclude: Beware of Monoculture

Understanding the Idea of Monoculture

A Brush with Weird Monoculture and what it Taught Me

About Me

Developer Hegemony

Search the Site

Post Archives By Month