DaedTech

Stories about Software

By

TDD For Breaking Problems Apart

If I have to write some code that’s going to do something rather complex, it’s easy for my mind to start swimming with possibilities, as mentioned in the magic boxes post. In that post, I talked about thinking in abstractions and breaking problems apart and alluded to TDD being a natural fit for this paradigm without going into much detail. Today and in my next post about this I’m going to go into that detail.

Let’s say I’m tasked with coding some class or classes that keep track of bowling scores. I don’t think “okay, what is my score at the start of a bowling game?” I think things like “I’ll need to have some iteration logic that remembers back at least two frames, since that’s how far back we can go, and I’ll probably need something to look ahead to handle the 10th frame, and maybe there’s a design pattern for that, and…” Like I said, swimming. These are the insane ramblings of an over-caffeinated maniac more than they’re a calm and methodical solution to a somewhat, but not terribly complex problem. In a way, this is like premature optimization. I’m solving problems that I anticipate having rather than problems that I actually have right at this moment.

TDD is thus like a deep breath, a relaxing cup of some hot beverage, and perhaps a bit of meditation or zoning out. All of this noise fades out of my head and I can actually start figuring out how things work. The reason for this is that the essence of TDD is breaking the task into small, manageable problems and solving those while simultaneously ensuring that they stay solved. It’s like a check-list. Forget all of this crap about iterators and design patterns — let’s establish a base case: “Score at the beginning of a game is zero.” That’s pretty easy to solve and pretty hard to get frantic over.

With TDD, the first that I’m going to do is write a test and I’m going to write only enough of a test to fail (which includes writing something that doesn’t compile. So, let’s do that:

[TestClass]
public class Constructor
{
    [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
    public void Initializes_Score_To_Zero()
    {
        var scoreCalculator = new BowlingScoreCalculator();
    }
}

Alright, since there is no such type as “BowlingScoreCalculator”, this test fails by virtue of non-compiling. I then define the type and the test goes green (I’m using NCrunch, so I’m never actually building or running anything — I just see dots change colors). Next, I want to assert that the score property is equal to zero after the constructor executes:

[TestClass]
public class BowlingTest
{
    [TestClass]
    public class Constructor
    {
        [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
        public void Initializes_Score_To_Zero()
        {
            var scoreCalculator = new BowlingScoreCalculator();

            Assert.AreEqual(0, scoreCalculator.Score);
        }
    }
}

public class BowlingScoreCalculator
{
    public BowlingScoreCalculator()
    {

    }
}

This also fails, since that property doesn’t exist, and I solve this problem by declaring an auto-implemented property, which actually makes the whole test pass. And just like that, I’ve notched my first passing test and step toward a functional bowling calculator. The game starts with a score of zero.

What’s next? Well, I suppose the simplest thing that we should say is that if I bowl a frame with a score of zero for the first throw and one on the second throw, my score is now 1. Okay, so what is a frame, and what kind of data structure should I use to represent it, and what would be the ideal API for that, and is score really best represented as a property, and — BZZT!! Stop it. We’re solving small, simple problems one at a time. And the only problem I have at the moment is “lack of failing test”. I’m going to quickly decide that the mechanism for bowling a frame is going to be BowlFrame(int, int) so that I know the name of my next sub-test class for naming purposes. Writing code until something fails, I get:

[TestClass]
public class BowlingTest
{
    [TestClass]
    public class Constructor
    {
        [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
        public void Initializes_Score_To_Zero()
        {
            var scoreCalculator = new BowlingScoreCalculator();

            Assert.AreEqual(0, scoreCalculator.Score);
        }
    }

    [TestClass]
    public class BowlFrame
    {
        [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
        public void With_Throws_0_And_1_Results_In_Score_1()
        {
            var scoreCalculator = new BowlingScoreCalculator();
            scoreCalculator.BowlFrame(0, 1);
        }
    }
}

public class BowlingScoreCalculator
{
    public int Score { get; set; }

    public BowlingScoreCalculator()
    {    

    }
}

This fails because there is no “bowl frame” method, so I add it. CodeRush’s “declare method” refactoring does the trick nicely, populating the new method with a thrown NotImplementedException, which gives me a red test instead of a non-compile. I have to make the test pass before moving on, so I delete it by deleting the throw, and then I add an assert that score is equal to 1. This fails, and I make it pass:

[TestClass]
public class Constructor
{
    [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
    public void Initializes_Score_To_Zero()
    {
        var scoreCalculator = new BowlingScoreCalculator();

        Assert.AreEqual(0, scoreCalculator.Score);
    }
}

[TestClass]
public class BowlFrame
{
    [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
    public void With_Throws_0_And_1_Results_In_Score_1()
    {
        var scoreCalculator = new BowlingScoreCalculator();
        scoreCalculator.BowlFrame(0, 1);

        Assert.AreEqual(1, scoreCalculator.Score);
    }
}

public class BowlingScoreCalculator
{
    public int Score { get; set; }

    public BowlingScoreCalculator()
    {

    }

    public void BowlFrame(int throw1, int throw2)
    {
        Score = 1;
    }
}

Now, with a green test, I have the option to solve problems in existing code (i.e. “refactor”). For instance, I don’t like that I have a useless, empty constructor, so I delete it and note that my tests still pass. This is another easy problem that I can solve with confidence.

From here, I think I’d like a non-obtuse implementation of scoring that first frame. I think I’ll test that if I bowl a 2 and then a 3, the score is equal to five. For the first time, I’m not going to have any non-compiling failings, so I write some code and see red, obviously, which I get rid of by making the whole thing look like this:

[TestClass]
public class BowlingTest
{
    [TestClass]
    public class Constructor
    {
        [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
        public void Initializes_Score_To_Zero()
        {
            var scoreCalculator = new BowlingScoreCalculator();

            Assert.AreEqual(0, scoreCalculator.Score);
        }
    }

    [TestClass]
    public class BowlFrame
    {
        [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
        public void With_Throws_0_And_1_Results_In_Score_1()
        {
            var scoreCalculator = new BowlingScoreCalculator();
            scoreCalculator.BowlFrame(0, 1);

            Assert.AreEqual(1, scoreCalculator.Score);
        }

        [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
        public void With_Throws_2_And_3_Results_In_Score_5()
        {
            var scoreCalculator = new BowlingScoreCalculator();
            scoreCalculator.BowlFrame(2, 3);

            Assert.AreEqual(5, scoreCalculator.Score);
        }
    }
}

public class BowlingScoreCalculator
{
    public int Score { get; set; }

    public void BowlFrame(int throw1, int throw2)
    {
        Score = throw1 + throw2;
    }
}

All right, the calculator is now doing a good job of handling a low scoring first frame. So, let’s correct a little thing or two with the refactor cycle, making sure that whatever we do is a problem that’s easily solvable, isolated, and small in scope. I’m thinking I don’t like the magic numbers 2, 3 and 5 in that last test. Let’s try making it look like this:

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void With_Throws_2_And_3_Results_In_Score_5()
{
    var scoreCalculator = new BowlingScoreCalculator();
    const int firstThrow = 2;
    const int secondThrow = 3;
    scoreCalculator.BowlFrame(firstThrow, secondThrow);

    Assert.AreEqual(firstThrow + secondThrow, scoreCalculator.Score);
}

Yes, it’s perfectly acceptable to refactor your unit tests during the “refactor” phase. Treat these guys as first class code and keep them clean or nobody will want them around. I have another bone to pick with this code, which is the duplication of the constructor logic in each test. Let’s fix that too:

[TestClass]
public class BowlFrame
{
    private static BowlingScoreCalculator Target { get; set; }

    [TestInitialize()]
    public static void BeforeEachTest()
    {
        Target = new BowlingScoreCalculator();
    }

    [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
    public void With_Throws_0_And_1_Results_In_Score_1()
    {
        Target.BowlFrame(0, 1);

        Assert.AreEqual(1, Target.Score);
    }

    [TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
    public void With_Throws_2_And_3_Results_In_Score_5()
    {
        const int firstThrow = 2;
        const int secondThrow = 3;
        Target.BowlFrame(firstThrow, secondThrow);

        Assert.AreEqual(firstThrow + secondThrow, Target.Score);
    }
}

There, that looks better to me now. Now, what’s wrong with the production code? What problem can we solve? I can think of a few things: we can bowl more than 10 per frame, we can bowl negative numbers, and this thing will only keep track of the most recent frame. There are many other issues as well, but we’re already trending toward overload here, so let’s get back on Easy Street and figure out what to do if the user gives us goofy input.

I write a test that fails:

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Throws_Exception_On_Negative_Argument()
{
    ExtendedAssert.Throws(() => Target.BowlFrame(-1, 0));
}

And then make it pass (first by declaring the nested exception and then by throwing it when the first frame is negative). I also take a shortcut in terms of obtuseness of my TDD here and make it throw the exception if either parameter is negative. I consider this acceptable, personally, since I believe that within the red-green-refactor discipline it’s a matter of some discretion how simple “the simplest thing” is. I’m not going to be accepting negative parameters for one and not the other and I know it. At any rate, after getting this to pass, I now refactor my method parameter names to be more descriptive and set the Score setter to private (which I should have done at first anyway) and the production class looks like this:

public class BowlingScoreCalculator
{
    public class FrameUnderflowException : Exception { }

    public int Score { get; private set; }

    public void BowlFrame(int firstThrow, int secondThrow)
    {
        if (firstThrow < 0 || secondThrow < 0)
            throw new FrameUnderflowException();
        Score = firstThrow + secondThrow;
    }
}

Now I’m going to tackle the similar problem of allowing too much scoring for a frame. I want to solve the problem that scores of greater than 10 are currently allowed and I’m going to do this with the test case for 11. TDD is not a smoke testing approach nor is it an exhaustive approach, so I’m not going to write tests for 11, 12, 13… 200 or anything like that. As such, I try to hit edge cases whenever possible and that’s what I’ll do here:

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Throws_Exception_On_Score_Of_11()
{
    ExtendedAssert.Throws(() => Target.BowlFrame(10, 1));
}

I make this pass, and then I go back and then decide I’m not a huge fan of that “10” in both my test and in the production class. Zero I can let slide because of it’s sort of universal value in scoring of competitions, but 10 deserves a descriptive name. I’m going to call it “Mark” since that’s what a frame of 10 is known as in bowling.

Now that we’ve sufficiently guarded against bad inputs, I’d say it’s time to start thinking about aggregation. The easiest way to do that is to have a frame of 1 and then another frame of one, and make sure that we have a score of 2:

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Sets_Score_To_2_After_2_Frames_With_Score_Of_1_Each()
{
    Target.BowlFrame(1, 0);
    Target.BowlFrame(1, 0);

    Assert.AreEqual(2, Target.Score);
}

How cool is it that I fix this by changing “Score = firstThrow + secondThrow;” to “Score += firstThrow + secondThrow;” Talk about the simplest solution possible!

Now, I don’t like the way that test is written with 5 int literals in there. This is telling me that it might be time to rethink the way I’m handling frame. I don’t really want to think too much about it now, but I know that there’s going to be a hard-fail when I get to the 10th frame, which can have three throws so I’m already not married to this implementation. And now it’s already starting to feel awkward.

So what if I created a simple frame object? How would that look? Well, I’ll cover that next time as I flesh out this exercise. I’m hoping I’ve at least sold you somewhat on the notion that TDD forces you to chunk problems into manageable pieces and solve them that way without getting carried away. Next time I’ll circle back a bit to the idea of “magic boxes” as we decide how to divide up the concept of “frame” and “game” while adhering to the practice of TDD.

By

How to Create Good Abstractions: Oracles and Magic Boxes

Oracles In Math

When I was a freshman in college, I took a class called “Great Theoretical Ideas In Computer Science”, taught by Professor Steven Rudich. It was in this class that I began to understand how fundamentally interwoven math and computer science are and to think of programming as a logical extension of math. We learned about concepts related to Game Theory, Cryptography, Logic Gates and P, NP, NP-Hard and NP-Complete problems. It is this last set that inspires today’s post.

This is not one of my “Practical Math” posts, so I will leave a more detailed description of these problems for another time, but I was introduced to a fascinating concept here: that you can make powerful statements about solutions to problems without actually having a solution to the problem. The mechanism for this was an incredibly cool-sounding concept called “The Problem Solving Oracle”. In the world of P and NP, we could make statements like “If I had a problem solving oracle that could solve problem X, then I know I could solve problem Y in polynomial (order n squared, n cubed, etc) time.” Don’t worry if you don’t understand the timing and particulars of the oracle — that doesn’t matter here. The important concept is “I don’t know how to solve X, but if I had a machine that could solve it, then I know some things about the solutions to these other problems.”

It was a powerful concept that intrigued me at the time, but more with grand visions of fast factoring and solving other famous math problems and making some kind of name for myself in the field of computer science related mathematics. Obviously, you’re not reading the blog of Fields Medal winning mathematician, Erik Dietrich, so my reach might have exceeded my grasp there. However, concepts like this don’t really fall out of my head so much as kind of marinate and lie dormant, to be re-appropriated for some other use.

Magic Boxes: Oracles For Programmers

One of the most important things that we do as developers and especially as designers/architects is to manage and isolate complexity. This is done by a variety of means, many of which have impressive-sounding terms like “loosely coupled”, “inverted control”, and “layered architecture”. But at their core, all of these approaches and architectural concepts have a basic underlying purpose: breaking problems apart into isolated and tackle-able smaller problems. To think of this another way, good architecture is about saying “assume that everything else in the world does its job perfectly — what’s necessary for your little corner of the world to do the same?”

That is why the Single Responsibility Principle is one of the most oft-cited principles in software design. This principle, in a way, says “divide up your problems until they’re as small as they can be without being non-trivial”. But it also implies “and just assume that everyone else is handling their business.”

Consider this recent post by John Sonmez, where he discusses deconstructing code into algorithms and coordinators (things responsible for getting the algorithms their inputs, more or less). As an example, he takes a “Calculator” class where the algorithms of calculation are mixed up with the particulars of storage and separates these concerns. This separates the very independent calculation algorithm from the coordinating storage details (which are of no consequence to calculations anyway) in support of his point, but it also provides the more general service of dividing up the problem at hand into smaller segments that take one another granted.

Another way to think of this is that his “Calculator_Mockless” class has two (easily testable) dependencies that it can trust to do their jobs. Going back to my undergrad days, Calculator_Mockless has two Oracles: one that performs calculations and the other that stores stuff. How do these things do their work? Calculator_Mockless doesn’t know or care about that; it just provides useful progress and feedback under the assumption that they do. This is certainly reminiscent of the criteria for “Oracle”, an assumption of functionality that allows further reasoning. However, “Oracle” has sort of a theoretical connotation in the math sense that I don’t intend to convey, so I’m going to adopt a new term for this concept in programming: “Magic Box”. John’s Calculator_Mockless says “I have two magic boxes — one for performing calculations and one for storing histories — and given these boxes that do those things, here’s how I’m going to proceed.”

How Spaghetti Code Is Born

It’s one thing to recognize the construction of Magic Boxes in the code, but how do you go about building and using them? Or, better yet, go about thinking in terms of them from the get-go? It’s a fairly sophisticated and not always intuitive method for deconstructing programming problems.

nullTo see what I mean, think of being assigned a hypothetical task to read an XML file full of names (right), remove any entries missing information, alphabetize the list by last name, and print out “First Last” with “President” pre-pended onto the string. So, for the picture here, the first line of the output should be “President Grover Cleveland”. You’ve got your assignment, now, quick, go – start picturing the code in your head!

What did you picture? What did you say to yourself? Was it something like “Well, I’d read the file in using the XDoc API and I’d probably use an IList<> implementer instead of IEnumerable<> to store these things since that makes sorting easier, and I’d probably do a foreach loop for the person in people in the document and while I was doing that write to the list I’ve created, and then it’d probably be better to check for the name attributes in advance than taking exceptions because that’d be more efficient and speaking of efficiency, it’d probably be best to append the president as I was reading them in rather than iterating through the whole loop again afterward, but then again we have to iterate through again to write them to the console since we don’t know where in the list it will fall in the first pass, but that’s fine since it’s still linear time in the file size, and…”

And slow down there, Sparky! You’re making my head hurt. Let’s back up a minute and count how many magic boxes we’ve built so far. I’m thinking zero — what do you think? Zero too? Good. So, what did we do there instead? Well, we embarked on kind of a willy-nilly, scattershot, and most importantly, procedural approach to this problem. We thought only in terms of the basic sequence of runtime operations and thus the concepts kind of all got jumbled together. We were thinking about exception handling while thinking about console output. We were thinking about file reading while thinking about sorting strings. We were thinking about runtime optimization while we were thinking about the XDocument API. We were thinking about the problem as a monolithic mass of all of its smaller components and thus about to get started laying the bedrock for some seriously weirdly coupled spaghetti code.

Cut that Spaghetti and Hide It in a Magic Box

Instead of doing that, let’s take a deep breath and consider what mini-problems exist in this little assignment. There’s the issue of reading files to find people. There’s the issue of sorting people. There’s the issue of pre-pending text onto the names of people. There’s the issue of writing people to console output. There’s the issue of modeling the people (a series of string tuples, a series of dynamic types, a series of anonymous types, a series of structs, etc?). Notice that we’re not solving anything — just stating problems. Also notice that we’re not talking at all about exception handling, O-notation or runtime optimization. We already have some problems to solve that are actually in the direct path of solving our main problem without inventing solutions to problems that no one yet has.

So what to tackle first? Well, since every problem we’ve mentioned has the word “people” in it and the “people” problem makes no mention of anything else, we probably ought to consider defining that concept first (reasoning this way will tell you what the most abstract concepts in your code base are — the ones that other things depend on while depending on nothing themselves). Let’s do that (TDD process and artifacts that I would normally use elided):

public struct Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public override string ToString()
    {
        return string.Format("{0} {1}", FirstName, LastName);
    }
}

Well, that was pretty easy. So, what’s next? Well, the context of our mini-application involves getting people objects, doing things to them, and then outputting them. Chronologically, it would make the most sense to figure out how to do the file reads at this point. But “chronologically” is another word for “procedurally” and we want to build abstractions that we assemble like building blocks rather than steps in a recipe. Another, perhaps more advantageous way would be to tackle the next simplest task or to let the business decide which functionality should come next (please note, I’m not saying that there would be anything wrong with implementing the file I/O at this point — only that the rationale “it’s what happens first sequentially in the running code” is not a good rationale).

Let’s go with the more “agile” approach and assume that the users want a stripped down, minimal set of functionality as quickly as possible. This means that you’re going to create a full skeleton of the application and do your best to avoid throw-away code. The thing of paramount importance is having something to deliver to your users, so you’re going to want to focus mainly on displaying something to the user’s medium of choice: the console. So what to do? Well, imagine that you have a magic box that generates people for you and another one that somehow gets you those people. What would you do with them? How ’bout this:

public class PersonConsoleWriter
{
    public void WritePeople(IEnumerable people)
    {
        foreach (var person in people)
            Console.Write(person);
    }
}

Where does that enumeration come from? Well, that’s a problem to deal with once we’ve handled the console writing problem, which is our top priority. If we had a demo in 30 seconds, we could simply write a main that instantiated a “PersonConsoleWriter” and fed it some dummy data. Here’s a dead simple application that prints, well, nothing, but it’s functional from an output perspective:

public class PersonProcessor
{
    public static void Main(string[] args)
    {
        var writer = new PersonConsoleWriter();
        writer.WritePeople(Enumerable.Empty());
    }
}

What to do next? Well, we probably want to give this thing some actual people to shove through the pipeline or our customers won’t be very impressed. We could new up some people inline, but that’s not very impressive or helpful. Instead, let’s assume that we have a magic box that will fetch people out of the ether for us. Where do they come from? Who knows, and who cares — not main’s problem. Take a look:

public class PersonStorageReader
{
    public IList GetPeople()
    {
        throw new NotImplementedException();
    }
}

Alright — that’s pretty magic box-ish. The only way it could be more so is if we just defined an interface, but that’s overkill for the moment and I’m following YAGNI. We can add an interface if thing later needs to pull customers out of a web service or something. At this point, if we had to do a demo, we could simply return the empty enumeration instead of throwing an exception or we could dummy up some people for the demo here. And the important thing to note is that now the thing that’s supposed to be generating people is the thing that’s generating the people — we just have to sort out the correct internal implementation later. Let’s take a look at main:

public static void Main(string[] args)
{
    var reader = new PersonStorageReader();
    var people = reader.GetPeople();

    var writer = new PersonConsoleWriter();
    writer.WritePeople(people);
}

Well, that’s starting to look pretty good. We get people from the people reader and feed them to the people console writer. At this point, it becomes pretty trivial to add sorting:

public static void Main(string[] args)
{
    var reader = new PersonStorageReader();
    var people = reader.GetPeople().OrderBy(p => p.LastName);

    var writer = new PersonConsoleWriter();
    writer.WritePeople(people);
}

But, if we were so inclined, we could easily look at main and say “I want a magic box that I hand a person collection to and it gives me back a sorted person collection, and that could be stubbed out as follows:

public static void Main(string[] args)
{
    var reader = new PersonStorageReader();
    var people = reader.GetPeople();

    var sorter = new PersonSorter();
    var sortedPeople = sorter.SortList(people);

    var writer = new PersonConsoleWriter();
    writer.WritePeople(people);
}

The same kind of logic could also be applied for the “pre-pend the word ‘President'” requirement. That could pretty trivially go into the console writer or you could abstract it out. So, what about the file logic? I’m not going to bother with it in this post, and do you know why? You’ve created enough magic boxes here — decoupled the program enough — that it practically writes itself. You use an XDocument to pop all Person nodes and read their attributes into First and Last name, skipping any nodes that don’t have both. With Linq to XML, how many lines of code do you think you need? Even without shortcuts, the jump from our stubbed implementation to the real one is small. And that’s the power of this approach — deconstruct problems using magic boxes until they’re all pretty simple.

Also notice another interesting benefit which is that the problems of runtime optimization and exception handling now become easier to sort out. The exception handling and expensive file read operations can be isolated to a single class and console writes, sorting, and other business processing need not be affected. You’re able to isolate difficult problems to have a smaller surface area in the code and to be handled as requirements and situation dictate rather than polluting the entire codebase with them.

Pulling Back to the General Case

I have studiously avoided discussion of how tests or TDD would factor into all of this, but if you’re at all familiar with testable code, it should be quite apparent that this methodology will result in testable components (or endpoints of the system, like file readers and console writers). There is a deep parallel between building magic boxes and writing testsable code — so much so that “build magic boxes” is great advice for how to make your code testable. The only leap from simply building boxes to writing testable classes is to remember to demand your dependencies in constructors, methods and setters, rather than instantiating them or going out and finding them. So if PersonStorageReader uses an XDocument to do its thing, pass that XDocument into its constructor.

But this concept of magic boxes runs deeper than maintainable code, TDD, or any of the other specific clean coding/design principles. It’s really about chunking problems into manageable bites. If you’re even writing code in a method and you find yourself thinking “ok, first I’ll do X, and then-” STOP! Don’t do anything yet! Instead, first ask yourself “if I had a magic box that did X so I didn’t have to worry about it here and now, what would that look like?” You don’t necessarily need to abstract every possible detail out of every possible place, but the exercise of simply considering it is beneficial. I promise. It will help you start viewing code elements as solvers of different problems and collaborators in a broader solution, instead of methodical, plodding steps in a gigantic recipe. It will also help you practice writing tighter, more discoverable and usable APIs since your first conception of them would be “what would I most like to be a client of right here?”

So do the lazy and idealistic thing – imagine that you have a magic box that will solve all of your problems for you. The leap from “magic box” to “collaborating interface” to “implemented functionality” is much simpler and faster than it seems when you’re isolating your problems. The alternative is to have a system that is one gigantic, procedural “magic box” of spaghetti and the “magic” part is that it works at all.

By

ASP Webforms Validation 101

Today I’m going to delve into a topic I don’t know a ton about in the hopes that someone who knows less than me will stumble onto it and find it helpful. As I’ve alluded to here and there, I’ve spent the last couple of months doing ASP Webforms development, which is something I’d never done before. I’ve picked up a handful of tips and tricks along the way. Today I’m going to walk through the basics of validation.

Start out by creating a new ASP.NET Webforms project:

This gives you an incredibly simple Webforms project that we can use for example purposes. If you open up the login form, you’ll find a boilerplate login form laid out in markup. Let’s take a look at the User Name controls:

  • User name
  • What we’ve got here is the most basic form control validator, the required field validator. The “ControlToValidate” attribute tells us that it’s going to validate the “UserName” text box and the “ErrorMessage” attribute contains what will be displayed if the validation fails (i.e. there is nothing in the field when you submit the form):

    Pretty standard stuff. Let’s switch things up a little, though. Let’s add another label after the text box with something goofy like “is logging in.” This is something I bumped into early on as a layout issue and had to poke and google around for. Here’s the new look for the markup:

  • User name is logging in.
  • And here’s what it looks like:

    But that’s no good. We want this on the same line as the text box itself or it looks even goofier. Well, counterintuitive as it seemed to me, the validator field defaults to taking up the space that it would occupy if it were always visible. You can alter this behavior, however, by adding the attribute Display=”Dynamic” to the validator tag. Once you do this, the new label will appear on the same line–unless you mess up. Then the validator will resume taking up space, bumping the new label to the next line. Okay, okay, I’m no UX guru, but the important thing here is that you can set the space occupation behavior of your validators.

    The next lesson I learned was that I could use validators for comparing values as well as doing typechecks. This tripped me up a bit too because I would have assumed that there was some kind of TypeCheckValidator, but this isn’t the case. Instead, you have to use CompareValidator. Let’s say that we want users to have to log in with a decimal representing a valid currency. (“Why,” you ask? Well, because we’re insane 🙂 .) This is what it would look like:

  • User name is logging in.
  • The new validator shares some commonality with the existing one, but take special note of the “Operator” and “Type” attributes. Both of these fields are necessary. For anyone who has read my various rants about abstractions, would you care to guess why I found this completely unintuitive? Well, I don’t know about you, but I personally don’t tend to think of “DataTypeCheck” as an “Operator” (perhaps an “operation,” but even that seems like a stretch). I would have expected either a DataType validator or else the Compare validator simply to need the type specified, at which time it would do a type check. But, I digress.

    The next sticking point that I encountered was that I had a particular form where I wanted to validate something that wasn’t part of a text box. I thought I was dead in the water and would have to do something sort of kludgy, but CustomValidator was exactly what I needed. Let’s say that I wanted to verify that the weird label I’d created does not, in fact, contain the word “is.” (This isn’t necessarily as silly as it sounds if there’s code that alters this label dynamically based on other inputs.) If I point any validator at this control as the “ControlToValidate”, I’ll get an exception saying that it cannot be validated. But I can omit that property and specify an event handler.

    Add the custom validator to your markup and you get this:

  • User name is logging in.
  • And add the following to your code behind:

    public void PointlessLabelValidator_ServerValidate(object sender, ServerValidateEventArgs e)
    {
        var labelText =((Label)LoginControl.FindControl("PointlessVerb")).Text;
        e.IsValid = !labelText.Contains("is");
    }
    

    Now launch the web app again and type a number for the user name and something for the password and observe the new error message. It’s important to note here that you need to satisfy the other validation constraints because the custom validator operates a little differently. The validators we’ve added up until now work their magic by sending validation java script over the wire to the client and so validation is client-side and immediate. Here, we’re performing a server-side validation. This server-side custom validation will be short-circuited by client-side failures, which is why you need to fix those before seeing the new one.

    And that’s my brief primer on validators. This is neither exhaustive nor the equivalent of a nice book written by a Webforms guru, but hopefully if you’re here it’s helped you figure out a few basics of Webforms validation.

    By

    FluentSqlGenerator

    A while back, I made a post about using string.Join() to construct SQL where clauses from collections of individual clauses. In that post, I alluded to playing with a “more sophisticated” where clause builder. I did just that here and there and decided to post the results to github. You can find at here if you want to check it out.

    My implementation makes use of the Composite design pattern and the idea of well formed formula semantics in formal systems such as propositional and first order logic. The latter probably sounds a little stuffy and exposes my inner math geek, but that’s just a rigorous way of expressing the concept of building a statement from literals and basic operations on those literals. To pull it back yet another level of stuffiness, consider simple arithmetic, in which all of the following are valid expressions: “6”, “12 + 6”, “12 + (9 – 3)”. The first expression is an atomic literal and the second expression a binary operation. The third expression is interesting in that it shows that functions can have arguments that are literals or other expressions (if this still seems strange, think of these examples as “6”, “Add(12, 6)” and “Add(12, Subtract(9, 3))”.

    Think of how this applies to the propositional semantics that make up SQL query where clauses. I can have “Column1 = 12” or I can have “Column1 = 12 AND Column2 = 13” or I can have “Column1 = 12 AND (Column2 = 13 OR Column3 = 4)”. When I want to model this concept in an object oriented sense, I need to represent the operators “AND” and “OR” as objects with two properties: left expression and right expression. I also need it to be possible that either of these properties is a “literal” of the form “col = val” or that it could be another expression as with the last example. Composite is thus a natural fit when you consider that these clauses are really expression trees, in a very real sense. So there is a “Component” base that’s abstract and then “Clause” and “Operation” objects that inherit from them and are fungible when constructing expressions.

    This was the core of the implementation, but I also dressed it up a bit with some extension methods to support a discoverable, fluent interface, but optionally (I’m still very leery of this construct, but this seems like an appropriate and judicious use). Another nice feature, in my opinion is that it supports generic parameters so you don’t have massive overloads — you can set your columns equal to objects, strings, decimals, ints, etc. It makes heavy use of ToString() with these generic parameters, so use any type you please so long as what you want out of it is well represented by ToString().

    A sample API is as follows:

    var clause = Column.Named("Column1").IsEqualTo(123);
    Console.WriteLine(clause);
    
    clause = Column.Named("Column1").IsEqualTo(123).And(Column.Named("Column2").IsEqualTo("123456"));
    Console.WriteLine(clause);
    
    clause = Column.Named("Column1").IsOneOf(1, 2, 3).Or(Column.Named("Column2").IsGreaterThan(12));
    Console.WriteLine(clause);
    
    clause = Are.AnyOfTheseTrue(Column.Named("Column1").IsEqualTo(832), Column.Named("Column2").IsLessThan(25.30));
    Console.WriteLine(clause);
    
    clause = Are.AreAllOfTheseTrue(Column.Named("Column1").IsEqualTo(832), Column.Named("Column2").IsLessThan(25.30), Column.Named("Column3").IsOneOf("Current", "Valid"));
    Console.WriteLine(clause);
    
    Console.ReadLine();
    

    Currently supported SQL operations include various comparison (equal, not equal, greater, less, etc) as well as “like” and “in()”. Expression operators “AND”, “OR” and “NOT” are supported. The utility is well covered by unit tests and a handful of integration tests too if you want to poke around but preserve functionality.

    Feel free to download, use, fork, enhance, make fun of, etc, whatever. I’m not pretending this is a problem never before solved nor that this is the most elegant solution imaginable, but it was fun to write, code-kata style, and if someone can get some use out of it, great. If I wind up making significant modifications to it or extending it, I’ll post updates here as well as checking changes into github.

    By

    Casting is a Polymorphism Fail

    Have you ever seen code that looked like the snippet here?

    public class Menagerie
    {
        private List _animals = new List();
    
        public void AddAnimal(Animal animal)
        {
            _animals.Add(animal);
        }
    
        public void MakeNoise()
        {
            foreach (var animal in _animals)
            {
                if (animal is Cat)
                    ((Cat)animal).Meow();
                else if (animal is Dog)
                    ((Dog)animal).Bark();
            }
        }
    }
    

    You probably have seen code like this, and I hope that it makes you sad. I know it makes me sad. It makes me sad because it’s clearly the result of a fundamental failure to understand (or at least implement) polymorphism. Code written like this follows an inheritance structure, but it completely misses the point of that structure, which is the ability to do this instead:

    public class Menagerie
    {
        private List _animals = new List();
    
        public void AddAnimal(Animal animal)
        {
            _animals.Add(animal);
        }
    
        public void MakeNoise()
        {
            foreach (var animal in _animals)
                animal.MakeNoise();
        }
    }
    

    What’s so great about this? Well, consider what happens if I want to add “Bird” or “Bear” to the mix. In the first example with casting, I have to add a class for my new animal, and then I have to crack open the menagerie class and add code to the MakeNoise() method that figures out how to tell my new animal to make noise. In the second example, I simply have to add the class and override the base class’s MakeNoise() method and Menagerie will ‘magically’ work without any source code changes. This is a powerful step toward the open/closed principle and the real spirit of polymorphism — the ability to add functionality to a system with a minimum amount of upheaval.

    But what about more subtle instances of casting? Take the iconic:

    public void HandleButtonClicked(object sender, EventArgs e)
    {
        var button = (Button)sender;
        button.Content = "I was clicked!";
    }
    

    Is this a polymorphism failure? It can’t be, can it? I mean, this is the pattern for event subscription/handling laid out by Microsoft in the C# programming guide. Surely those guys know what they’re doing.

    As a matter of fact, I firmly believe that they do know what they’re doing, but I also believe that this pattern was conceived of and created many moons ago, before the language had some of the constructs that it currently does (like generics and various frameworks) and followed some of the patterns that it currently does. I can’t claim with any authority that the designers of this pattern would ask for a mulligan knowing what they do now, but I can say that patterns like this, especially ones that become near-universal conventions, tend to build up quite a head of steam. That is to say, if we suddenly started writing even handlers with strongly typed senders, a lot of event producing code simply wouldn’t work with what we were doing.

    So I contend that it is a polymorphism failure and that casting, in general, should be avoided as much as possible. However, I feel odd going against a Microsoft standard in a language designed by Microsoft. Let’s bring in an expert on the matter. Eric Lippert, principal developer on the C# compiler team, had this to say in a stack overflow post:

    Both kinds of casts are red flags. The first kind of cast raises the question “why exactly is it that the developer knows something that the compiler doesn’t?” If you are in that situation then the better thing to do is usually to change the program so that the compiler does have a handle on reality. Then you don’t need the cast; the analysis is done at compile time.

    The “first kind” of cast he’s referring to is one he defines earlier in his post as one where the developer “[knows] the runtime type of this expression but the compiler does not know it.” That is the kind that I’m discussing here, which is why I chose that specific portion of his post. In our case, the developer knows that “sender” is a button but the compiler does not know that. Eric’s point, and one with which I wholeheartedly agree, is “why doesn’t the compiler know it and why don’t we do our best to make that happen?” It just seems like a bad idea to run a reality deficit between yourself and the compiler as you go. I mean, I know that the sender is a button. You know the sender is a button. The method knows the sender is a button (if we take its name, containing “ButtonClicked” at face value). Maintainers know the sender is a button. Why does everyone know sender is a button except for the compiler, who has to be explicitly and awkwardly informed in spite of being the most knowledgeable and important party in this whole situation?

    But I roll all of this into a broader point about a polymorphic approach in general. If we think of types as hierarchical (inheritance) or composed (interface implementation), then there’s some exact type that suits my needs. There may be more than one, but there will be a best one. When writing a method and accepting parameters, I should accept as general a type as possible without needing to cast so that I can be of the most service. When returning something, I should be as specific as possible to give clients the most options. But when I talk about “possible” I’m talking about not casting.

    If I start casting, I introduce error possibilities, but I also necessarily introduce a situation where I’m treating an object as two different things in the same scope. This isn’t just jarring from a readability perspective — it’s a maintenance problem. Polymorphism allows me to care only about some public interface specification and not implementation details — as long as the thing I get has the public API I need, I don’t really care about any details. But as soon as I have to understand enough about an object to understand that it’s actually a different object masquerading as the one I want, polymorphism is right out the window and I suddenly depend on knowing the intricate relationship details of the class in question. Now I break not only if my direct collaborators change, but also if some inheritance hierarchy or interface hierarchy I’m not even aware of changes.

    The reason I’m posting all of this isn’t to suggest that casting should never happen. Clearly sometimes it’s necessary, particularly if it’s forced on you by some API or framework. My hope though is that you’ll look at it with more suspicion — as a “red flag”, in the words of Eric Lippert. Are you casting because it’s forced on you by external factors, or are you casting to communicate with the compiler? Because if it’s the latter, there are other, better ways to achieve the desired effect that will leave your code more elegant, understandable, and maintainable.

    By the way, if you liked this post and you're new here, check out this page as a good place to start for more content that you might enjoy.