DaedTech

Stories about Software


Flexibility vs Simplicity? Why Not Both?

Don’t hard-code file paths in your application. If you have some log file that it’s writing to or some XML file that it’s reading, there’s a well-established pattern for keeping track of the paths of those files: an external configuration scheme. This might be a .config file or a settings.xml file or even a yourapp.ini file if you’re gray enough in the beard. Or, perhaps it’s something more exotic like a database table or web service that stores key-value configuration pairs. Maybe it’s something as simple as command-line parameters that specify the path. Whatever the case may be, everyone knows that you don’t hard-code — you don’t store the file path right in the source code. That’s amateur hour.
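
For instance, in a .NET application, that lookup might amount to just a couple of lines. Here’s a minimal sketch, assuming a hypothetical “LogFilePath” entry in the application’s .config file:

using System.Configuration;

public static class LogSettings
{
    public static string GetLogFilePath()
    {
        // Read the path from configuration instead of hard-coding it.
        // The "LogFilePath" key and the fallback value here are hypothetical.
        return ConfigurationManager.AppSettings["LogFilePath"] ?? @"C:\Logs\app.log";
    }
}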

You can imagine how this began. Maybe a long time ago someone said, “hey, let’s log critical application events to a file so that we can review and troubleshoot if things go wrong.” They shipped this for some machine running Windows 3.1 or something, and were logging to C:\temp, which was fine unless users didn’t have a C:\temp directory. In that case, it blew up spectacularly and they were flooded with support calls, at which point they could tell their users to create the directory or they could ship a new set of floppy disks with the new source code, amended to log to a directory guaranteed to exist. Or something like that, anyway.

The lesson couldn’t be more obvious. If they had just thought ahead, they would have realized their choice for the path of the log file, which isn’t even critical anyway, was a poor one. It would have been good if they had chosen better, but it would have been almost as good if they’d just made this configurable somehow so that it needn’t be a disaster. They could have made the path configurable or they could have just made it a configurable option to create C:\temp if it didn’t exist. Next time, they’d do better by building flexibility into the application. They’d create a scheme where the application was flexible and thus the cost of not getting configuration settings right up-front was dramatically reduced.

This approach made sense and it became the norm. User settings and preferences would be treated as data, which would make it easy to create a default experience but to allow users to customize it if they were sufficiently sophisticated. And the predecessor to the “Advanced” menu tab was born. But the other thing that was born was subtle complexity, both for the users and for the programmers. Application configurability is second nature to us now (e.g. the .NET .config file), but make no mistake — it is a source of complexity even if you’re completely used to it. Think of paying $300 per month for all of your different telco concerns — the fact that you’ve been doing this for years doesn’t mean you’re not shelling out a ton of money.

What’s even more insidious is how this mentality has crept into application development in other ways. Notice that I called this practice “preferences as data” rather than “future-proofing,” but “future-proofing” is the lesson that many took away. If you design your application to be flexible enough, you can insulate yourself against bad initial guesses about user preferences or usage scenarios and you can ensure that the right set of tweaks, configuration alterations, and hacks will allow users to achieve what they want without you needing to re-deploy.

So, what’s the problem, apart from a huge growth in the number of available settings in a config file? I’d argue that the problem is the subtle one of striving for configurability as a first-class goal. Rather than express this in general, definition-oriented terms, consider an example that may be the logical conclusion of taking this thinking as far as it will go. You have some method that you expose publicly, called ProcessOrder, and it takes a parameter. Contrary to what you might think, the parameter isn’t an order ID and it isn’t an order: it’s an object. Why? Because this API is endlessly flexible. The method signature will suffice even if the entire order processing mechanism changes and even if the structure of the order itself changes. Heck, it won’t need to be altered if you decide to alter ProcessOrder(object order) to send emails. Just pass in an “Email” object and add a check for typeof(Email) to ProcessOrder. Awesome, right?
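
To make that concrete, here’s roughly what the “endlessly flexible” method ends up looking like (the Order and Email types, and the helpers they get routed to, are hypothetical stand-ins):

public void ProcessOrder(object order)
{
    // "Endlessly flexible": check the runtime type and bolt on a new branch for each new idea.
    if (order.GetType() == typeof(Email))
    {
        SendEmail((Email)order);   // hypothetical helper
        return;
    }

    SaveOrder((Order)order);       // hypothetical helper; blows up at runtime if handed the "wrong" object
}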


Yeah, ugh. Flexibility run amok. It’d be easy to interpret my point thus far as “you need to find the balance between inflexibility/simplicity on one end and flexibility/complexity on the other.” But that’s a consultant answer at best, and a non-point at worst. It’s no great revelation that these tradeoffs exist or that it’d be ideal if you could understand which trait was more valuable in a given moment.

The interesting thing here is to consider the original problem — the one we’ve long since filed away as settled. We shipped a piece of software with a setting that turned out to be a mistake, so what lesson do we take away from that? The lesson we did take away was that we should make mistakes less costly by adding configurability as an out. But what if we made the mistake less costly by making rollouts of the software trivial and inexpensive? Imagine a hypothetical world where rollout didn’t mean shipping a bunch of shrink-wrapped boxes with floppy disks in them but rather a single mouse click and high confidence that everything would go well. If this were a reality, hard-coding a log file path wouldn’t really be a big deal because if that path turned out to be a problem, you could just alter that source code file, click a button, and correct the mistake. By introducing and adjusting a previously unconsidered variable, you’ve managed to achieve both simplicity and flexibility without having to trade one for the other.

The danger for software decision makers comes from creating designs with the goal of satisfying principles or interim goals rather than the goal of solving immediate business problems. For instance, the problem of hard-coding tends to arise from (generally inexperienced) software developers optimizing for their own understanding and making code as procedurally simple as possible — “hardcoding is good because I can see right where the file is going when I look at my code.” That’s not a reasonable business goal for the software. But the same problem occurs with developers automatically creating a config file for application settings — they’re following the principle of “flexibility” rather than thinking of what might make the most sense for their customers or their situation. And, of course, this also applies to the designer of the aforementioned “ProcessOrder(object)” API. Here the goal is “flexibility” rather than something tangible like “our users have expressed an interest in changing the structure of the Order object and we think this is a good idea and want to support them.”

Getting caught up in making your code conform to principles will not only result in potentially suboptimal design decisions — it will also stop you from considering previously unconsidered variables or situations. If you abide by the principle “hard-coding is bad” without ever revisiting it, you’re not likely to consider “what if we just made it not matter by making deployments trivial?” There is nothing wrong with principles; they make it easy to communicate concepts and lay the groundwork for making good decisions. But use them as tools to help you achieve your goals and not as your actual goals. Your goals should always be expressible as humans interacting with your software — not characteristics of the software.


Dependency Injection or Inversion?

The hardest thing about being a software developer, for me, is coming up with names for things. I’ve worked out a system with which I’m sort of comfortable where, when coding, I pay attention to every namespace, type, method and variable name that I create, but in a time-box (subject to later revisiting, of course). So I think about naming things a lot and I’m usually in a state of thinking, “that’s a decent name, but I feel like it could be clearer.”

And so we arrive at the titular question. Why is it sometimes called “dependency injection” and at other times, “dependency inversion”? This is a question I’ve heard asked a lot and answered sometimes too, often with responses that make me wince. The answer to the question is that I’m playing a trick on you and repeating a question that’s flawed.


Dependency Injection and Dependency Inversion are two distinct concepts. The reason that I led into the post with the story about naming is that these two names seem fine in a vacuum but, used together, they seem to create a ‘collision,’ if you will. If I were wiping the slate clean, I’d probably give “dependency inversion” a slightly different name, though I hesitate to say it since a far more accomplished mind than my own gave it the name in the first place.

My aim here isn’t to publish the Nth post exhaustively explaining the difference between these two concepts, but rather to supply you with (hopefully) a memorable mnemonic. So, here goes. Dependency Injection == “Gimme it” and Dependency Inversion == “Someone take care of this for me, somehow.” I’ll explain a bit further.

Dependency Injection is a generally localized pattern of writing code (though it may be used extensively in a code base). In any given method or class (or module, if you want), rather than going out and finding or making the things you need yourself, you simply order your collaborators to “gimme it.”

So instead of this:

public Time getTheTime() {
	ThingThatTellsTime atom = new CesiumAtom();
	return atom.getMeTheTimeSomehow();
}

You say, “nah, gimme it,” and do this instead:

public Time getTheTime(ThingThatTellsTime whatever) {
	return whatever.getMeTheTimeSomehow();
}

You’re not the one responsible for figuring out that time comes from atomic clocks which, in turn, come from atoms somehow. Not your problem. You say to your collaborators, “you want the time, Buddy? I’m gonna need a ThingThatTellsTime, and then it’s all yours.” (Usually you wouldn’t write this rather pointless method, but I wanted to keep the example as simple as humanly possible).

Dependency Inversion is a different kind of tradeoff. To visualize it, don’t think of code just yet. Think of a boss yelling at a developer. Before the ‘inversion’ this would have been straightforward. “Developer! You, Bill! Write me a program that tells time!” and Bill scurries off to do it.

But that’s so pre-Agile. Let’s do some dependency inversion and look at how it changes. Now, boss says, “Help, someone, I need a program that tells time! I’m going to put a story in the product backlog” and, at some point later, the team says, “oh, there’s something in the backlog. Don’t know how it got there, exactly, but it’s top priority, so we’ll figure out the details and get it done.” The boss and the team don’t really need to know about each other directly, per se. They both depend on the abstraction of the software development process; boss has no idea which person writes the code or how, and the team doesn’t necessarily know or care who plopped the story in the backlog. And, furthermore, the backlog abstraction doesn’t depend on knowing who the boss is or the developers are or exactly what they’re doing, but those details do depend on the backlog.

Okay, so first of all, why did I do one example in code and the other in anecdote, when I could have also done a code example? I did it this way to drive home the subtle scope difference in the concepts. Dependency injection is a discrete, code-level tactic. Dependency inversion is more of an architectural strategy and way of structuring (decoupling) code bases.

And finally, what’s my (mild) beef with the naming? Well, dependency inversion seems a little misleading. Returning to the boss ordering Bill around, one would think a strict inversion of the relationship would be the stuff of inane sitcom fodder where, “aha! The boss has become the bossed! Bill is now in charge!” Boss and Bill’s relationship is inverted, right? Well, no, not so much — boss and Bill just have an interface slapped in between them and don’t deal with one another directly anymore. That’s more of an abstraction or some kind of go-between than an inversion.

There was certainly a reason for that name, though, in terms of historical context. What was being inverted wasn’t the relationship between the dependencies themselves, but the thinking (of the time) about object oriented programming. At the time, OOP was very much biased toward having objects construct their dependencies and those dependencies construct their dependencies, and so forth. These days, however, the name lives on even as that style of OOP is more or less dead this side of some aging and brutal legacy code bases.

Unfortunately, I don’t have a better name to propose for either one of these things — only my colloquial mnemonics that are pretty silly. So, if you’re ever at a user group or conference or something and you hear someone talking about the “gimme it” pattern or the “someone take care of this for me, somehow” approach to architecture, come over and introduce yourself to me, because there will be little doubt as to who is talking.


Rapid Fire Craftsmanship Tips

The last month has been something of a transitional time for me. I had been working out of my house for a handful of clients pretty much all summer, but now I’ve signed on for a longer term engagement out of state where I’m doing “craftsmanship coaching.” Basically, this involves the lesser-attended side of an agile transformation. There is no shortage of outfits that say, “hey, sign up with us, get a certification and learn how to have meetings differently,” but there does seem to be a shortage of outfits that say, “we’ll actually teach you how to write code in a way that makes delivering every couple of weeks more than a pipe dream.” I believe this state of affairs leads to what has been described as “flaccid scrum.” So my gig now is going to be working with a bunch of developers on things like writing modular code, dependency inversion, test driven development, etc.

This background is relevant for two reasons. First of all, it’s my excuse for why my posting cadence has dipped. Sorry ‘bout that. Secondly, it explains and segues into this post. What is software craftsmanship, anyway? I’m apparently teaching it, but I’m not really sure I can answer this question other than to say that I share a lot of opinions about what it means to write code effectively with people who identify this way. I think that TDD, factored methods, iterative, high-communication approaches, failing early, and testable code are efficient approaches to writing software, and I’m happy to help people who want to improve at these things as best I can.

In that vein of thought, I’d like to offer some suggestions for tangible and easy-to-remember/easy-to-do things that you can do that are likely to improve your code. Personally, more than anything else, I think my programming was improved via random suggestions like this that were small by themselves, but in aggregate added up to a huge improvement. So, here is a series of things to tuck into your toolbelt as a programmer.

Make your variable names conversational

ComboBox cb = new ComboBox();

Ugh. The only thing worse than naming the variable after its type is then abbreviating that bad name. Assuming you’re not concerned with shaving a few bytes off your hard disk storage, this name signifies to maintainers, “I don’t really know what to call this because I haven’t given it any thought.”

ComboBox dayPicker = new ComboBox();

Better. Now when this thing is referenced elsewhere, I’ll know that it probably contains days of some sort or another. They may be calendar days or days of the week, but at least I know that it’s talking about days, which is more than “cb” told me. But what about this?

ComboBox mondayThroughFridayPicker = new ComboBox();

Any doubt in your mind as to what’s in this combo box? Yeah, me neither. And that’s pretty handy when you’re reading code, especially if you’re in some code-behind or any kind of MVC model-binding scheme. And, of the objections you might have, modern IDEs cover a lot of them. What if you later want to add Saturday and Sunday and the name becomes out of date? Easy to change now that just about all major IDEs have “rename all” support at your fingertips. Isn’t the name a little over-descriptive? Sure, but who cares — it’s not like you need to conserve valuable bytes of disk space. But with the name cb, at least you know it’s a combo box! Your IDE should give you that information easily and quickly and, if it doesn’t, get a plugin that tells you (at least for a statically typed language).

Try to avoid booleans as method parameters

This might seem a little weird at first, but, on the whole, your code will tend to be more readable and expressive if you don’t do this. The reason for this is that boolean parameters are rarely data. Rather, they’re generally control parameters. Consider this method signature:

void LogOutputToFile(string output, bool useConsole = false)

This is a reasonably readable method signature and what you can infer from it is that the method is going to log output to a file. Well, unless you pass it “true”, in which case it will log to the console. And this tends to run afoul of the Single Responsibility Principle. This method is really two different methods kind of bolted together and it’s up to a caller to figure that out. I mean, you can probably tell exactly what this method looks like:

void LogOutputToFile(string output, bool useConsole = false)
{
     if(useConsole)
          Console.WriteLine(output);
     else
          _someFileHandle.WriteLine(output);
}

This method has two very distinct reasons to change: if you want to change the scheme for console logging and if you want to change the scheme for file logging. You’ve also established a design anti-pattern here, which is that you’re going to need to update this method (and possibly callers) every time a new logging strategy is needed.

Are there exceptions to this? Sure, obviously. But my goal here isn’t to convince you never to use a boolean parameter. I’m just trying to get you to think twice or three times about doing so. It’s a code smell.
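
If nothing else, splitting along that boolean tends to fall right out, and then callers say what they mean. A possible sketch, reusing the hypothetical file handle from above:

void LogOutputToFile(string output)
{
     // File logging has its own reason to change...
     _someFileHandle.WriteLine(output);
}

void LogOutputToConsole(string output)
{
     // ...and console logging has its own, separately.
     Console.WriteLine(output);
}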

If you type //, stop and extract a method

How many times do you see something like this:

_customerOrder.IsValid = true;
_logfile.WriteLine("Finished processing customer order");

//Now, save the customer order
if(_repository != null)
{
     try
     {
          _repository.Add(_customerOrder);
     }
     catch
     {
          _logfile.WriteLine("Oops!");
          _showErrorStateToUser = true;
          _errorMessage = "Oops!";
     }
}

Would it kill you to do this:

_customerOrder.IsValid = true;
_logfile.WriteLine("Finished processing customer order");

SaveCustomerOrder();

and put the rest in its own method? Now you’ve got smaller, more factored, and descriptive methods, and you don’t need the comment. As a rule of thumb, if you find yourself creating “comment bookmarks” in your method like a table of contents with chapters, the method is too big and should be factored. And what better way to divide things up than to stop typing a comment and instead add a method with a descriptive name? So, when you find you’ve typed that “//”, hit backspace twice, type the comment without spaces, and then slap a parenthesis on it and, voila, new method signature you can add.
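
And for completeness, the extracted method is nothing more than the code that used to live under the comment:

private void SaveCustomerOrder()
{
     if(_repository != null)
     {
          try
          {
               _repository.Add(_customerOrder);
          }
          catch
          {
               _logfile.WriteLine("Oops!");
               _showErrorStateToUser = true;
               _errorMessage = "Oops!";
          }
     }
}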

Make variable name length vary with scope size

This seems like an odd thing to think about, but it will lead to clearer code that’s easier to read. Consider the following:

Globals.NonThreadSafeSumOfMostRecentlyProcessedCustomerCharges = 0;
for(int i = 0; i < _processedCustomers.Count(); i++)
     Globals.NonThreadSafeSumOfMostRecentlyProcessedCustomerCharges += _processedCustomers[i].TotalCharge;

Notice that there are three scopes in question: method level scope (i), class level scope (_processedCustomers) and global scope (that gigantic public static property). The method level scope variable, i, has a really tiny name. And, why not? It's repeated 4 times in 2 lines, but it's only in scope for 2 lines. Giving it a long name would clog up those two lines with redundancy, and it wouldn't really add anything. I mean, it's not hard to keep track of, since it goes out of scope one line after being defined.

The class level scope variable has a more descriptive name because there's a pretty good chance that its declaration will be off of your screen when you are using it. The extra context helps. But there's no need to go nuts, especially if you're following the Single Responsibility Principle, because the class will probably be cohesive. For instance, if the class is called CustomerProcessor, it won't be too hard to figure out what a variable named "_processedCustomers" is for. If you have some kind of meandering, 2000 line legacy class that contains 40 fields, you might want to make your class level fields more descriptive.

The globally scoped variable is gigantic. The reason for this is twofold. First and most obviously, it's in scope from absolutely anywhere with a reference to its containing assembly, so it better be very descriptive for context. And secondly, global state is icky, so it's good to give it a name that discourages people from using it as much as possible.

In general, the broader the usage scope for a variable/property, the more context you'll want to bake into its name.

Try to conform to the Principle of Least Surprise

This last one is rather subjective, but it's good practice to consider. The Principle of Least Surprise says that you should aim to minimize the learning curve or inscrutability of code that you write, bearing in mind a target audience of your fellow developers (probably your team, unless you're writing a more public API). As a slight caveat to this, I'd say it's fair to assume a reasonable level of language proficiency -- it might not make sense to write horribly non-idiomatic code when your team is likely to become more proficient later. But the point remains -- it's best to avoid doing weird or "clever" things.

Imagine stumbling across this bad boy that compares two integers... sort of:

public bool Compare(int x, int y)
{
    Console.WriteLine("You shouldn't see this in production.");
    _customerCount = x; 
    return x.ToString() == y.ToString();
}

What pops into your head? Something along the lines of, "why is that line about production in there?" Or maybe, "why does a comparison function set some count equal to one of the parameters?" Or is it, "why compare two ints by converting them to strings?" All of those are perfectly valid questions because all of those things violate the Principle of Least Surprise. They're surprising, and if you ask the original author about them, the answers will probably be some weird, "clever" solutions to problems that came up somewhere at some point. "Oh, that line about production is to remind me to go back and change that method. And, I set customer count equal to x because the only time this is used is to compare customer count to something and I'm caching it for later and saving a database write."
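
For contrast, a version nobody would have to ask about is almost aggressively boring:

public bool Compare(int x, int y)
{
    return x == y;
}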

One might say the best way to avoid this is to take a break and revisit your code as if you're someone else, but that's pretty hard to do and I would argue that it's an acquired skill. Instead, I'd suggest playing a game where you pretend you're about to show this code to someone and make mental note of what you start preparing yourself to explain. "Oh, yeah, I had to add 39 parameters to that method -- it's an interesting story, actually..." If you find yourself preparing to explain something, it probably violates the Principle of Least Surprise. So, rather than surprising someone, maybe you should reconsider the code.

...

Anyway, that's all for the tips. Feel free to chime in if you have any you'd like to share. I'd be interested to hear them, and this list was certainly not intended to be exhaustive -- just quick tips.


Be Idiomatic

I have two or three drafts now that start with the meta of me apologizing for being sparse in my posting of late. TL;DR is that I’ve started a new, out of town engagement and between ramp-up, travel and transitioning off of prior commitments, I’ve been pretty bad at being a regular blogger lately.

Also, special apologies to followers of the Chess TDD series, but my in-room wifi connection has just been brutal for the desktop (using the hotel’s little plugin converter), so it’s kind of hard to get one of those posts going. The good news is that what I’m going to be doing next involves a good bit of mentoring and coaching, which lends itself particularly well to vaguely instructional posts, so hopefully the drought won’t last. (For anyone interested in details of what I’m doing, check LinkedIn or hit me up via email/twitter).


What It Is to Be Idiomatic in Natural Language

Anywho, onto a brief post that I’ve had in drafts for a bit, waiting on completion. The subject is, as the title indicates, being “idiomatic.”

In general parlance, idiomatic refers to the characteristic of speaking a language the way a native would. To best illustrate the difference between language correctness and being idiomatic, consider the expression, “go fly a kite.” If you were to say this to someone who had learned to speak flawless English in a classroom, that person would probably ask, “what kite, and why do you want me to fly it?”

If you said this to an idiomatic USA English speaker (flawless or not), they’d understand that you were using a rather bland imprecation and essentially telling them to go away and leave you alone. And so we make a distinction between technically accurate language (syntax and semantics) and colloquially communicative language. The idiomatic speaker understands, well, the idioms of the local speech.

Being Idiomatic in Programming Language

Applied to programming languages, the metaphor holds pretty well. It’s possible to write syntactically and semantically valid code (in the sense that the code compiles and does what the programmer intends at runtime) that isn’t idiomatic at all. I could offer all manner of examples, but I’ll offer the ones probably most broadly approachable to my reader base. Non-idiomatic Java would look like this:

public void DoSomething()
{
     System.out.print("Hello!");
}

And non-idiomatic C# would look like this:

public void doSomething() {
     Console.WriteLine("Hello!");
}

In both cases, the code will compile, run, and print what you want to print, but in neither case are you speaking it the way the natives would likely speak it. Do this in either case, and you’ll be lucky if you’re just laughed at.
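
For what it’s worth, the idiomatic C# rendering of that last snippet would look more like this (PascalCase method name, opening brace on its own line):

public void DoSomething()
{
     Console.WriteLine("Hello!");
}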



Recall, Retrieval, and the Scientific Method

Improving Readability with Small Things

In my series on building a Chess game using TDD, I’ve introduced a value type called BoardCoordinate instead of passing around X and Y coordinate integer primitives everywhere. It’s a simple enough construct:

public struct BoardCoordinate
{
    private readonly int _x;
    public int X { get { return _x; } }

    private readonly int _y;
    public int Y { get { return _y; } }

    public BoardCoordinate(int x, int y)
    {
        _x = x;
        _y = y;
    }

    public bool IsCoordinateValidForBoardSize(int boardSize)
    {
        return IsDimensionValidForBoardSize(X, boardSize) && IsDimensionValidForBoardSize(Y, boardSize);
    }

    private static bool IsDimensionValidForBoardSize(int dimensionValue, int boardSize)
    {
        return dimensionValue > 0 && dimensionValue <= boardSize;
    }
}

This was a win early in the series to get me away from a trend toward Primitive Obsession, and I haven't really revisited it since. However, I've found myself in the series starting to think that I want a semantically intuitive way to express equality among BoardCoordinates. Here's why:

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Returns_1_2_For_1_1()
{
    Assert.IsTrue(MovesFrom11.Any(bc => bc.X == 1 && bc.Y == 2));
}

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Returns_2_2_For_1_1()
{
    Assert.IsTrue(MovesFrom11.Any(bc => bc.X == 2 && bc.Y == 2));
}

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Returns_3_3_For_1_1()
{
    Assert.IsTrue(MovesFrom11.Any(bc => bc.X == 3 && bc.Y == 3));
}

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Does_Not_Return_0_0_From_1_1()
{
    Assert.IsFalse(MovesFrom11.Any(bc => bc.X == 0 || bc.Y == 0));
}

This is a series of unit tests of the "Queen" class that represents, not surprisingly, the Queen piece in chess. The definition of "MovesFrom11" is elided, but it's a collection of BoardCoordinate that represents the possible moves a queen has from square (1, 1) on the chess board.

This series of tests was my TDD footprint for driving the functionality of determining the queen's moves. So, I started out saying that she should be able to move from (1,1) to (1,2), then had her also able to move to (2,2), etc. If you read the test, what I'm doing is saying that this collection of BoardCoordinates to which she can move should have in it one that has X coordinate of 1 and Y coordinate of 2, for instance.

What I don't like here and am making mental note to change is this "and". That's not as clear as it could be. I don't want to say, "there should be a coordinate in this collection with X property of such and such and Y property of such and such." I want to say, "the collection should contain this coordinate." This may seem like a small semantic difference, but I value readability to the utmost. And readability is a journey, not a destination -- the more you practice it, the more naturally you'll write readable code. So, I never let my foot off the gas.
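
In other words, I'd rather the assertion read something like this (a sketch of where I'm headed, assuming BoardCoordinate equality behaves the way I want it to):

Assert.IsTrue(MovesFrom11.Contains(new BoardCoordinate(1, 2)));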

During the course of the series, this nagging readability hiccup has caused me to note and refer to a TODO of implementing some kind of concept of equals. In the latest post, Sten asks in the comments, referring to my desire to implement equals, "isn't that unnecessary since structs that doesn't contain reference type members does a byte-by-byte in-memory comparison as default Equals implementation?" It is this question I'd like to address here in this post.

Not directly, mind you, because the assessment is absolutely spot on. According to MSDN:

If none of the fields of the current instance and obj are reference types, the Equals method performs a byte-by-byte comparison of the two objects in memory. Otherwise, it uses reflection to compare the corresponding fields of obj and this instance.

So, the actual answer to that question is simply, "yes," with nothing more to say about it. But I want to provide my answer to that question as it occurred to me off the cuff. I'm a TDD practitioner and a C# veteran, for context.

Answering Questions on Language Knowledge

My answer, when I read the question, was, "I don't remember what the default behavior of Equals is for value types -- I have to look that up." What surprised me wasn't my lack of knowledge on this subject (I don't find myself using value types very often), but rather my lack of any feeling that I should have known that. I mean, C# has been my main language for the last 4 years, and I've worked with it for more years than that besides. Surely, I just failed some hypothetical job interview somewhere, with a cabal of senior developers reviewing my quiz answers and saying, "for shame, he doesn't even know the default Equals behavior for value types." I'd be laughed off of Stack Overflow's C# section, to be certain.

And yet, I don't really care that I don't know that (of course, now I do know the answer, but you get what I'm saying). I find myself having an attitude of "I'll figure things out when I need to know them, and hopefully I'll remember them." Pursuing encyclopedic knowledge of a language's behavior doesn't much interest me, particularly since those goalposts may move, or I may wind up coding in an entirely different language next month. But there's something deeper going on here because I don't care now, but that wasn't always true -- I used to.

The Scientific Method

When I began to think back on this, I think the drop off in valuing this type of knowledge correlated with my adoption of TDD. It then became obvious to me why my attitude had changed. One of the more subtle value propositions of TDD is that it basically turns your programming into an exercise in the Scientific Method with extremely rapid feedback. Think of what TDD has you doing. You look at the code and think something along the lines of, "I want it to do X, but it doesn't -- why not?" You then write a test that fails. Next, you look at the code and hypothesize about what would make it pass. You then do that (experimentation) and see if your test goes green (testing). Afterward, you conduct analysis (do other tests pass, do you want to refactor, etc).


Now you're probably thinking (and correctly) that this isn't unique to TDD. I mean, if you write no unit tests ever, you still presumably write code for a while and then fire up the application to see if it's doing what you hypothesized that it would while writing it. Same thing, right?

Well, no, I'd argue. With TDD, the feedback loop is tight and the experiments are more controlled and, more importantly, isolated. When you fire up the GUI to check things out after 10 minutes of coding, you've doubtless economized by making a number of changes. When you see a test go green in TDD, you've made only one specific, focused change. The modify-then-verify-application-behavior approach has too many simultaneous variables to be scientific.

Okay, fine, but what does this have to do with whether or not I value encyclopedic language knowledge? That's a question with a slightly more nuanced answer. After years of programming according to this mini-scientific method, what's happened is that I've devalued anything but "proof is in the pudding" without even realizing it. In other words, I sort of think to myself, "none of us really knows the answer until there's a green test proving it to all of us." So, my proud answer to questions like, "wouldn't it work to use the default equals method for value types" has become, "dunno for certain, let's write a test and see."
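
And that test costs almost nothing to write. A throwaway sketch (reusing the BoardCoordinate struct from earlier) settles the question in seconds:

[TestMethod]
public void Default_Equals_Returns_True_For_Identical_Coordinates()
{
    // Written just to verify the default value-type Equals behavior, then deleted.
    Assert.IsTrue(new BoardCoordinate(1, 2).Equals(new BoardCoordinate(1, 2)));
}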

False Certainty

Why proud? Well, I'll tell you a brief story about a user group I attended a while back. The presenter was doing a demonstration on Linq, closures, and deferred execution, and he made the presentation interactive. He'd show us methods that exposed subtle, lesser-known behaviors of the language in this context, and the (well-made) point was that these things were complex and trying to get the answers right was humbling.

It's generally knowledgeable people that attend user groups and often even more knowledgeable people that brave the crowd to go out on a limb and answer questions. So, pretty smart C# experts were shouting out their answers to "what will this method return" and they were getting it completely wrong because it was hard and it required too much knowledge of too many edge cases in too short a period of time. A friend of mine said something like, "man, I don't know -- slap a unit test on it and see." And... he's absolutely right, in my opinion. We're not language authors, much less compilers and runtimes, and thus the most expedient answer to the question comes not from applying amassed language knowledge but from experimentation.

Think now of the world of programming over the last 50 years. In times where compiles and executions were extremely costly or lengthy, you needed to be quite sure that you got everything right ahead of time. And doing so required careful analysis that could only be done well with a lot of knowledge. Without prodigious knowledge of the libraries and languages you were using, you would struggle mightily. But that's really no longer true. We're living in an age of abundant hardware power and lightning fast feedback where knowing where to get the answers quickly and accurately is more valuable than knowing them. It's like we've been given the math textbook with the answers in the back and the only thing that matters is coming up with the answers. Yeah, it's great that you're enough of a hotshot to get 95% of the answers right by hand, but guess what -- I can get 100% of them right and much, much faster than you can. And if the need to solve new problems arises, it's still entirely possible for me to work out a good way to do it by using the answer to deduce how the calculation process works.

Caveats

In the course of writing this, I can think of two valid objections/comments that people might have critiquing what I'm saying, so I'd like to address them. First of all, I'm not saying that you should write production unit tests to answer questions about how the framework/language works. Unit testing the libraries and languages that you use is an anti-pattern. I'm talking about writing tests to see how your code will behave as it uses the frameworks and languages. (Although, a written and then deleted unit test is a great, fast-feedback way to clarify language behavior to yourself.)

Secondly, I'm not devaluing knowledge of the language/framework nor am I taking pride in my ignorance of it. I didn't know how the default Equals behavior worked for value types yesterday and today I do. That's an improvement. The reason it's an improvement is that the knowledge is now stored in a more responsive cache. I maintain that having the knowledge is trumped by knowing how to acquire it: reaching into my own memory is like hitting the CPU cache, writing a quick test to see is like going to main memory, and looking it up on the internet or asking a friend is like going out to disk.

The more knowledge you have of the way the languages and frameworks you use work, the less time you'll have to sink into proving behaviors to yourself, so that's clearly a win. To continue the metaphor, what I'm saying is that there's no value or sense in going out preemptively and loading as much as you can from disk into the CPU cache so that you can show others that it's there. In our world, memory and disk lookups are just no longer expensive enough to make that desirable.