DaedTech

Stories about Software

The Death of the Obligatory Comment

The Ben Franklin Effect

Have you ever started looking for something, let’s say your wallet, and been fairly certain that it wasn’t in your house? You started looking in all of the most likely places and progressed with increasing glumness to less and less likely landing spots. Perhaps you eventually wound up in the crawl space that you hadn’t entered since six months prior to losing your wallet. If you’ve had an experience like that, do you recall a point at which you thought to yourself, “this is completely pointless” and yet you kept doing it anyway?

Even if you can’t recall an experience like this, I imagine that you’ve seen similar stories unfold in countless other places. In politics and government, for example, can you count the number of times some matter of policy has proven to be an obvious mistake and yet kept going anyway like an unstoppable juggernaut, with resolute phrases like “stay the course” uttered as sober non sequiturs? In the end it all seems silly; it amounts to saying “we know this is a mistake and we’re going to keep doing it anyway.”

I believe that this mystifying behavior persists as a mainstay of the human condition for two main reasons. The first is that, without careful consideration, we tend toward logical fallacy, and this is an example of the fallacy “appeal to consequences” (aka “wishful thinking”). In other words, “I’m going to look for my wallet in the crawlspace because I believe it’s in there, since it would be a real bummer if it weren’t.” We tend to double down on our mistakes because we really wish we hadn’t made them.

The second source of this is a truly fascinating study of our motivations as humans for our actions and interactions. A blog I really like called “You Are Not So Smart” defines this as “the Ben Franklin Effect”. I highly recommend a start-to-finish read of this post, and will warn you here that my next paragraph will be a spoiler, so go read it, please.

Oh, hi there. You’re back? Cool! As you now know, the premise of the post is that Ben Franklin figured out a way to turn someone who didn’t like him (a “hater”) into an ally. He asked that person to do him a favor… and it worked. As it turns out, we tend to like people because we do them favors, rather than doing them favors because we like them. The reason for this is that we construct narratives about our motivations and preferences that rationalize our past decisions: “I must like that guy, since otherwise me doing him a favor would have been stupid, and since I’m obviously not stupid…” (For you rhetoric buffs, this is another logical fallacy, called “begging the question.”)

David McRaney puts this nicely in his post:

If you live in the Deep South you might buy a high-rise pickup and a set of truck nuts. If you live in San Francisco you might buy a Prius and a bike rack. Whatever are the easiest to obtain, loudest forms of the ideals you aspire to portray become the things you own, like bumper stickers signaling to the world you are in one group and not another. Those things then influence you to become the sort of person who owns them.

What Does this Have to Do With Code Comments?

I’m glad you asked. The answer is that for me, and I suspect for a lot of you, putting comment headers above all methods, classes, fields, etc., is a doubling down on a behavior that probably makes no sense, a staying of the course, and a thing that we like doing because we did it.

When I was in college, I don’t think I ever commented my code. At that time, as I recall, we were young and naive and wore code unreadability on our sleeves as a badge of honor; anyone who could execute an entire complex looping sequence in the condition and increment statements of a for loop was a total ninja, after all. When I left school and went to the professional world, I was confronted with people who wrote all kinds of code, but one of them was a Linux kernel hacker whom I truly respected, and I noticed that he dutifully put comment blocks above all of his methods. Included in these were descriptions of what the method did, what it returned, what the arguments were, what invariants it preserved, and a journal log of edits made to it.
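
The style looked something like this sketch, reconstructed purely for illustration (the method and its contents are invented here, and the original code was C rather than C#):

/*
 * NormalizeReading: scales a raw sensor reading into the range [0, 1].
 *
 * reading: the raw value, in device units.
 * returns: the scaled value.
 * invariant: never modifies shared state.
 *
 * Edit log:
 *   ebd 4/15/04: changed a char* to an int*.
 */
public double NormalizeReading(int reading)
{
    return reading / 1024.0;
}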

This was an epiphany for me. All of those guys with their non-commented, zany for loops were amateurs. Real programmers did that kind of ninja stuff and then put a nice, clear comment on top that explained to mere mortals how awesome they were. I was hooked. I wanted to be that guy, and so I became the kind of guy who wrote those kinds of comments above methods and everywhere else. I didn’t do this halfway — I was the kind of guy that went the extra mile to thoroughly document everything.

Years, jobs and programming languages later, this practice persisted, long outlasting my opinion that it was a badge of honor to cram multiple execution statements into loop iterators and furrow people’s brows with inscrutable pointer arithmetic (by then I had gravitated toward code that was clear, readable and maintainable). A funny thing started happening to me, though. I started reading blog posts like this one, where people said things like:

What makes comments so smelly in general? In some ways, they represent a violation of the DRY (Don’t Repeat Yourself) principle, espoused by the Pragmatic Programmers. You’ve written the code, now you have to write about the code.

First, where are comments indeed useful (and less smelly)? If you are writing an API, you need some level of generated documentation so that people can use your API. JavaDoc style comments do this job well because they are generated from code and have a fighting chance of staying in sync with the actual code. However, tests make much better documentation than comments. Comments always lie (maybe not now, but on a long enough timeline, all comments will become outdated). Tests can’t lie or they fail. When I’m looking at work in progress on projects, I always go to the tests first. The comments may or may not be there, but the tests define what’s now done and working.

The first few times I read or heard arguments like this, I dismissed them as the work of people too lazy to write comments. As I started reading more such posts… I simply stopped thinking about them. It was a silly argument — after all, I had spent countless hours throughout my career writing comments and I wouldn’t spend all that time doing something that might not be a good idea. A friend of mine likes to satirize this type of reasoning by saying “if the facts don’t fit the dogma, the facts must be discarded,” and while the validity of comments in code is truly a matter of opinion rather than fact, the same logic applies to my reasoning. I discarded an argument because not discarding it would have made me feel bad about prior decisions that I had made.

Nagging Doubts and Real Conclusions

In spite of my best efforts simply to ignore arguments that I didn’t like, I was plagued by cognitive dissonance on the matter. Why would otherwise rational and obviously quite intelligent people say such ridiculous things about my beloved exhaustive commenting? Determined to reconcile these divergent beliefs, I decided one day while documenting code smells on a wiki for a former employer that they didn’t really mean that comments were a code smell in general – they must only be talking about stupid comments like putting “//this is a for loop” above a for loop. Yeah, of course. That’s what they meant.

But, it wasn’t. In spite of myself, I had read and processed the arguments. I knew that comments weren’t DRY and that they represented duplication of the logic of the code. I knew that heavily commented code tended to mask poor naming choices and unintuitive abstractions. I knew that comments, even XML/Javadoc comments for public-facing APIs, tended to start out helpful and accurate and then to rot over time as different people took over maintenance and didn’t think to change them along with the code. Heck, I’d experienced that one firsthand, with inaccuracies and inconsistencies piling up as surely as they do in a non-normalized database.

So, eventually I was forced to see the light and admit that it was possible and even probable that my efforts had been a complete waste of time over the years. Sound harsh? Well, here’s the thing. How many of those comments that I wrote are still floating around in some code base somewhere, not deleted? And of that percentage, how many are accurate? I’m guessing the percentage is tiny — the comments are dust in the wind. And before they blew away, what are the odds that anyone cared enough to read them? What are the odds that anyone cares that “ebd changed a char* to an int* on 4/15/04”?

So, I’ve stopped viewing comments as obligatory. I write them now with a sort of “when in Rome” sentiment if others do it on the code I’m working on. After all, I have plenty of practice with it. But my real preference, in light of my focus on abstractions, is now to have the attitude that I am forbidden from writing comments. If I want to make my code usable and understandable, I have to do it with tight abstractions, self-documenting code, and designs that make bad decisions impossible. Is that a lofty goal? Sure. But I think it’s a good one. I’ve gone from viewing comments as obligatory to viewing them as indicative of small design failures, and I’m content with that. I think that all along part of me viewed commenting my methods as redundant, and I’ve finally rid myself of the cognitive dissonance.
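
To put a code-shaped point on it, here is a contrived sketch of what I mean (the Customer type and its rules are invented for illustration). Instead of a comment explaining a conditional, the conditional gets a name:

public class Customer
{
    public int OrderCount { get; set; }
    public int YearsActive { get; set; }

    // The method name now says what a header comment used to say.
    public bool IsEligibleForDiscount()
    {
        return OrderCount > 10 && YearsActive >= 2;
    }
}

Calling code then reads as if (customer.IsEligibleForDiscount()) rather than a raw boolean expression with a comment propped above it, and when the rule changes, the pressure is to rename the method rather than to remember a comment somewhere else.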

TDD Even with DateTime.Now

Recently, I posted my incredulity at the idea that someone would go to the effort of writing unit tests and not source control them as a matter of course. I find that claim as absurd today as I did then, but I did happen on an interesting edge case where I purposely discarded a unit test I wrote during TDD that I would otherwise have kept.

I was writing a method that would take a year in integer form and populate a drop down list with all of the years starting with that one up through the current year. In this project, I don’t have access to an isolation framework like Moles or Typemock, so I have no way of making DateTime.Now return some canned value (check out this SO post for an involved discussion of the sensitivity of unit tests involving DateTime.Now).

So, as I thought about what I wanted this method to do, and how to get there, I did something interesting. I wrote the following test:

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Adds_Item_When_Passed_2012()
{
    var myFiller = new CalendarDropdownFiller(new DateTimeFormatInfo());
    var myList = new DropDownList();
    myFiller.SeedWithYearsSince(myList, 2012);

    Assert.AreEqual(1, myList.Items.Count);
}

To get this to pass, I changed the method SeedWithYearsSince() to add an arbitrary item to the list. That intermediate version is long gone, but it amounted to something like this reconstruction:
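
public virtual void SeedWithYearsSince(DropDownList list, int year)
{
    // Deliberately naive: just enough to make the first test pass.
    list.Items.Add(new ListItem("placeholder"));
}

Next test I wrote was: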

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Adds_Item_With_Text_2012_When_Passed_2012()
{
    var myFiller = new CalendarDropdownFiller(new DateTimeFormatInfo());
    var myList = new DropDownList();
    myFiller.SeedWithYearsSince(myList, 2012);

    Assert.AreEqual("2012", myList.Items[0].Value);
}

Now, I had to actually add “2012” in the method, but it was still pretty obtuse. That step, again reconstructed, would have looked something like this:
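
public virtual void SeedWithYearsSince(DropDownList list, int year)
{
    // Still hard-coded: nothing in the tests so far forces anything smarter.
    list.Items.Add(new ListItem("2012"));
}

To get serious, I wrote the following test: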

[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Adds_Two_Items_When_Passed_2011()
{
    var myFiller = new CalendarDropdownFiller(new DateTimeFormatInfo());
    var myList = new DropDownList();
    myFiller.SeedWithYearsSince(myList, 2011);

    Assert.AreEqual(2, myList.Items.Count);
}

Now the method had to do something smart, so I wrote:

public virtual void SeedWithYearsSince(DropDownList list, int year)
{
    for (int index = year; index <= DateTime.Now.Year; index++)
        list.Items.Add(new ListItem(index.ToString()));
}

And, via TDD, I got to the gist of my method correctly. (I would later write tests that passed in a null list and a negative year and assert that descriptive exceptions were thrown, but this is more or less the finished product.) One of those later guard tests, reconstructed here for illustration, might look like this:
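
[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
[ExpectedException(typeof(ArgumentNullException))]
public void Throws_Exception_When_Passed_Null_List()
{
    // Hypothetical guard test: assumes the method null-checks its list argument.
    var myFiller = new CalendarDropdownFiller(new DateTimeFormatInfo());

    myFiller.SeedWithYearsSince(null, 2012);
}

But now, let’s think about the unit tests vis-à-vis source control.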

Of the three tests I’ve written, the first two should always pass unless I get around to finishing the time machine that I started building a few years back. We might consolidate those into a single test that’s a little more meaningful, perhaps by dropping the first one. We might also tease out a few more cases here to guard against regressions, say proving that calling it with 2010 adds 2010, 2011 and 2012. While I don’t generally feel good about checking in tests that exercise code dependent on external state (like “Now”), we can feel pretty good about these, given that “Now” only ever moves in one direction.

But that last test about 2 items when passed 2011 is only good for the remainder of 2012. When you wake up bright and early on New Year’s morning, run to the office, and kick off a test run, this test will fail. Clearly we don’t want to check that test in, so all things being equal, we’ll discard it. That’s a bummer, but it’s okay. The point of the unit tests written here was a design strategy — test driven development. If we can’t keep the artifacts of that because, say, we don’t have access to an isolation framework or permission to use one, it’s unfortunate, but c’est la vie. We’ll check in the tests that we can and call it a day.

This same reasoning applies within the context of whatever restrictions are placed on you. Say you are assigned to a legacy codebase (using the Michael Feathers definition of “legacy” as code without unit tests) and do not have rights to add a test project, for whatever reason. Well, then write tests to help you work, keep them around as best you can for as long as you can, and discard them when you have to. If you have a test project but not Moles or Typemock, you do what we did here. If you have code that you have to use that lacks seams, contains things like singletons/static methods, or otherwise presents testability problems, take the same approach. Better to test during TDD and discard than not to test at all, since you can at least guard against regressions and get fast feedback during initial development.

I’ve often heard people emphasize that TDD is a development methodology first and that the unit tests for source control are a nice ancillary benefit. But I think the example of DateTime.Now really drives home that point. The fact that DateTime.Now (or legacy code, GUI code, threaded code, etc.) is fickle and hard to test need not be a blocker to doing TDD. Clearly I think we should strive to write only meaningful tests and to keep them all around, but this isn’t an all or nothing proposition. Make sure you’re verifying your code first and foremost, preserve what you can, and seek to improve through increasingly decoupled code, better tooling, and more practice writing good tests.
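
As a footnote, if you do control the class’s design, one tooling-free way to decouple from DateTime.Now is to inject the clock. This is a variation I’m sketching under assumptions, not the actual code from the project above (the DateTimeFormatInfo parameter is dropped for brevity):

public class CalendarDropdownFiller
{
    private readonly Func<DateTime> _now;

    // Production code passes () => DateTime.Now; a test can pass
    // () => new DateTime(2012, 6, 1) to pin "now" to a known value.
    public CalendarDropdownFiller(Func<DateTime> now)
    {
        _now = now;
    }

    public virtual void SeedWithYearsSince(DropDownList list, int year)
    {
        for (int index = year; index <= _now().Year; index++)
            list.Items.Add(new ListItem(index.ToString()));
    }
}

With a seam like that, the “2 items when passed 2011” test could be pinned to a fake 2012 and checked in after all.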

The Perils of Free Time at Work

Profitable Free Time

If you’ve ever been invited to interview at Google or you simply keep up with the way it does things, you’re probably familiar with Google’s “20 percent time”. According to HowStuffWorks (of all places):

Another famous benefit of working at Google is the 20 percent time program. Google allows its employees to use up to 20 percent of their work week at Google to pursue special projects. That means for every standard work week, employees can take a full day to work on a project unrelated to their normal workload. Google claims that many of their products in Google Labs started out as pet projects in the 20 percent time program.

In other words, you can spend 4 days per week helping the company’s bottom line and one day a week chasing rainbows and implementing unicorns. The big picture philosophy is that the unbridled freedom to pursue one’s interests will actually result in profitable things for the company in the long run. This is a good example of something that will tend to keep programmers around by promoting mastery, autonomy, and purpose.

We tend to think of this perk as characteristic of progressive employer startups where you imagine beer in the fridge, air hockey tables and Playstations in the break room, and various other perks designed to make it comfortable to work 70+ hour weeks in the “work hard, play hard” culture. But interestingly, this idea dates all the way back to 3M in 1948:

3M launched the 15 percent program in 1948. If it seems radical now, think of how it played as post-war America was suiting up and going to the office, with rigid hierarchies and increasingly defined work and home roles. But it was also a logical next step. All those early years in the red taught 3M a key lesson: Innovate or die, an ethos the company has carried dutifully into the 21st century.

So, for over half a century, successful companies (or at least a narrow subset of them) have succeeded while allowing their employees a portion of their paid time to do whatever they please, within reason. That seems to make a good case for this practice, and a good reason for a developer who finds himself in this position to be happy.

We Interrupt this Post to Bring You A Lesson From Donald Rumsfeld

Every so often politicians or other public figures say things that actually make sense but go down in sound-bite lore as debacles. John Kerry’s “I voted for the bill before I voted against it” comes to mind, but a very unfortunately misunderstood one, in my opinion, is this one from Donald Rumsfeld:

[T]here are known knowns; there are things we know that we know. There are known unknowns; that is to say there are things that, we now know we don’t know. But there are also unknown unknowns – there are things we do not know, we don’t know.

Admittedly, this is somewhat of a rhetorical mind-twister, but when you think about what he’s saying, not only does it make sense, but it is fairly profound. For example, consider your commute to work. When you go to your car, you know what kind of car you drive. That is a “known-known”. As you get in your car to go to work, it may take you 30-50 minutes depending on traffic. The traffic is a “known-unknown” in that you know it will exist but you don’t know what it will be like, and you are able to plan accordingly (i.e. you allow 50 minutes even though it may take you less). But what if, while getting in your car, someone told you that you wouldn’t be at work for about 4 hours? Maybe a fender-bender? Maybe a family member calls you with an emergency? Who knows… these are the “unknown-unknowns” — the random occurrences in life for which one simply can’t plan.

The reason that I mention this is that I’d like to introduce a similar taxonomy for free time as it relates to our professional lives, and I’ll ask you to bear with me the way I think you should bear with Rummy.

Structured-Unstructured Time or Unstructured-Unstructured Time

Let’s adopt these designations for time at the office. The simplest way to talk about time is what I’ll call “structured-structured time”. This is time where your boss has told you to work on X and you are dutifully working X. Google/3M’s “20/15 percent time”, respectively, is an example of what I’ll call “structured-unstructured time.” This is time that the organization has specifically set aside to let employees pursue whims and fancies while not accountable to normal scheduling consideration. It is unstructured time with a purpose. The final type(**) of time is what I’ll call “unstructured-unstructured” time, and this is time that you spend doing nothing directly productive for the company but without the company having planned on you doing that. The most obvious example I can think of is if your boss tells you to go make 10,000 copies, the copy machine is broken, and you have no other assignments. At this point you have unstructured-unstructured time where you might do anything from taking it upon yourself to fix the copy machine to taking a nap in the break room.

Making this distinction may seem like semantic nitpicking, but I believe that it’s important. I further believe that unstructured-unstructured time chips away at morale even as the right amount of structured-unstructured makes it soar. With structured-unstructured time, the company is essentially saying “we believe in you and, in fact, we believe so much in you that we’re willing to take a 20 percent hit to your productivity on the gamble that you doing what interests you will prove to be profitable.” Having someone place that kind of confidence in you is both flattering and highly motivating. The allure of doing things that interest you combined with living up to the expectations will make you work just as hard during structured-unstructured time as you would at any other time, if not harder. I’ve never had this perk, but I can say without a doubt that this is the day I’d be most likely to put in 12 or 13 hours at the office.

Contrast this with unstructured-unstructured time and the message conveyed to you by the organization. Here the company is basically saying the opposite: “we value your work so little that we’d rather pay you to do nothing than distract someone much more important than you with figuring out what you should do.” Have you ever been a new hire and twiddled your thumbs for a week or even a month, perusing stuff on the local intranet while harried employees shuffled busily around you? Ever needed input or approval from a more senior team member and spent the whole day clicking through reddit or slashdot? How about telling a manager that you have nothing you can work on and hearing, “that’s okay — don’t worry about it — you deserve some downtime.”

The difference in how this time is perceived is plain: “we’re going to carve out time because we are excited to see just how much you can do” versus “we don’t really give a crap what you do.”

But Don’t People Like Free Time and Freedom From Pressure?

You would think that everyone would appreciate the unstructured-unstructured time when it came their way, but I believe that this largely isn’t the case. Most people start to get nervous that they’re being judged if they have a lot of this time. Many (myself included) start to invent side-projects or things that they think will help the company as ways to fill the vacuum and hopefully prove themselves, but this plan often backfires, as organizations that can’t keep employees busy and challenged are unlikely to be the kind of organizations that value this “self-starter” behavior and the projects it produces. So the natural tendency to flourish during unstructured time becomes somewhat corrupted as it is accompanied by subtle feelings of purposelessness and paranoia. About the only people immune to this effect are Dead Sea, checked-out types who are generally aware, consciously or subconsciously, that being productive or not doesn’t affect promotions or status anyway, so they might as well catch up on their reading.

I believe there is a natural progression: first, trying to be industrious and opportunistically carve structured-unstructured time out of unstructured-unstructured time; then giving up and embarking on self-improvement/training missions during that time; and finally sending out resumes during that time. So I would caution you that if you’re a manager or lead, make sure that you’re not letting your team flail around without structure this way. If you think they’re out of tasks, assign them some. Make time for them. Or, if nothing else, at least tell them what Google/3M tell them — we value your ingenuity, so this is actually planned (not that I’m advocating lying, but come on – you can figure out a good way to spin this). Just don’t think that you’re doing team members a favor or taking it easy on them by not giving them anything to do.

If you’re a developer with this kind of time on your hands, talk to your manager and tell him or her how you feel. Formulate a plan for how to minimize or productively use this time early on, before it gets out of hand and makes you restless for Career Builder. Other people do get busy, and understandably so, so it may require gentle prodding to get yourself out of this position.

But regardless of who you are, I advocate that you do your best to provide some structure to unstructured time, both for yourself and for those around you. Working with a purpose is something that will make just about anyone happier than the alternative.

** As an aside, both Rummy and I left out a fourth option, not closing the set. In his case, it would be the “unknown-known” and in my case, it would be “unstructured-structured time”. In both cases, this assumes utter incompetence on the part of the first person in the story – an “unknown-known” would mean some obvious fact that was being missed or ignored. In the case of an organization, “unstructured-structured” time would mean that team members had effectively mutinied under incompetent management and created some sort of ad-hoc structure to get work done in spite of no official direction. This is a potentially fascinating topic that I may later come back to, although it’s a little far afield for the usual subject matter of this blog.

SQL Queries and String.Join()

Over the last few weeks, I’ve found myself working within a framework where I’m writing a lot of SQL. More specifically, I’m writing a lot of WHERE clauses related to optional user search parameters. As a simple example, consider a search over “customer” where a user could filter by part of a customer name, filter by a selectable customer type, or simply list all customers. This creates a situation where I can have a WHERE clause with 0, 1, or 2 entries in it depending on what the user wants to do.

The consequence of this is that my WHERE clause may be blank, it may have one clause, or it may have two clauses with an AND joining them. The most basic (naive) way to handle this is to check, for each control/clause, whether the user has entered something and, if so, to append “{Clause} AND ” to a string builder. Then you snip off the last 5 characters to take care of the spurious “ AND ” that got appended. I think we’ve all seen this sort of thing before in some form or another.
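
For illustration, that naive approach looks something like this sketch (customerName and customerTypeId are hypothetical search parameters, and query parameterization is elided for brevity):

var builder = new StringBuilder();

// Append each optional filter, each followed by the " AND " glue.
if (!string.IsNullOrEmpty(customerName))
    builder.Append("Name LIKE '%" + customerName + "%' AND ");
if (customerTypeId > 0)
    builder.Append("CustomerTypeId = " + customerTypeId + " AND ");

// Snip off the spurious trailing " AND " (five characters).
string whereClause = builder.ToString();
if (whereClause.Length > 0)
    whereClause = whereClause.Substring(0, whereClause.Length - 5);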

But then, I got to thinking a bit, and realized that the problem I was facing here was really that I would have n clauses and would want n – 1 ANDs (except the special case of zero, where I would want zero ANDs). A clause is just a string and the ” AND ” is essentially a delimiter, so this is really a problem of having a collection of strings and a delimiter and wanting to jam them together with it. What I want is the opposite of String.Split().

And, as it turns out, the opposite of Split() is a static method on the string class called String.Join(), which takes a delimiter and an array of strings and does exactly what I need. In this fashion I can add clauses to an object as strings and then query the object for a well-formed WHERE clause. In its simplest incarnation, it would look like this:

public class WhereBuilder
{
    private readonly List<string> _clauses = new List<string>();

    public void Add(string clause)
    {
        _clauses.Add(clause);
    }

    public string GetFullWhereText()
    {
        return String.Join(" AND ", _clauses.ToArray());
    }
}
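
Consumer code, with the same hypothetical search parameters as before, might then look like this:

var whereBuilder = new WhereBuilder();

if (!string.IsNullOrEmpty(customerName))
    whereBuilder.Add("Name LIKE @name");
if (customerTypeId > 0)
    whereBuilder.Add("CustomerTypeId = @typeId");

// Zero clauses yields an empty string, one clause yields itself, and
// two yields "Name LIKE @name AND CustomerTypeId = @typeId".
string whereText = whereBuilder.GetFullWhereText();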

You keep track of your various sub-clauses of the where in a list, and then join them together on the fly, when requested by consumer code. If you wanted to allow OR instead of AND, that’s pretty simple to support simultaneously:

public class WhereBuilder
{
    private readonly List<string> _clauses = new List<string>();

    public void Add(string clause)
    {
        _clauses.Add(clause);
    }

    public string GetConjunctionClause()
    {
        return Join(" AND ");
    }

    public string GetDisjunctionClause()
    {
        return Join(" OR ");
    }

    private string Join(string separator)
    {
        return String.Join(separator, _clauses.ToArray());
    }
}

Of course, this is handy only for constructing clauses that all share the same operator, and it doesn’t do anything about the annoyance of monotonously specifying operators inside the various clauses, but my purpose here was to highlight the handiness of String.Join() for those who hadn’t seen it before.

Stay tuned if you’re interested in a more sophisticated where clause builder — I’ve been playing with that a bit in my spare time and will post back here with it if it gets interesting.

Multitasking that Actually Works – Suggestions Requested

There are a lot of ways in which you can do two or more things at once and succeed only in doing them very poorly, so I’m always in search of a way to do multiple things at once but to actually get value out of both things. As a programmer with an ever-longer commute, a tendency to work well over 40 hours per week, and a wide range of interests, my time is at a premium.

Two things that have come to be indispensable for me are listening to developer podcasts (.NET Rocks, Deep Fried Bytes, etc.) while I drive and watching Pluralsight videos while I run on machines at the gym. Podcasts are made for audio consumption and were probably invented more or less with commutes in mind, so this is kind of a no-brainer fit that infuses some learning and professional interest into otherwise dead time (although I might also start listening to fiction audio books, potentially including classic literature).

Watching Pluralsight while jogging is made possible by the advent of the smartphone and/or tablet, but it is a bit interesting nonetheless. I find it works best on elliptical machines, where I’m not bouncing much or making a ton of noise and can lay my phone sideways, facing me. I don’t think this is a workable strategy for jogging outdoors or even on a treadmill, so having a gym at my office with ellipticals is kind of essential for this.

These two productive examples of multi-tasking have inspired me to try to think of other ways to maximize my time. There are some important criteria, however. The multi-tasking must not detract significantly from either task, so “catching up on sleep at work” and “watching TV while listening to the radio” don’t make the cut. Additionally, at least one task must be non-trivial, so “avoiding bad music while sleeping” also does not make the cut. And, finally, I’m not interested in tasks that depend on something being inefficient, so “catching up on my RSS reader while waiting for code to compile” is no good, since what I ought to be doing is figuring out a way not to be blocked by excessive compile time (I realize one could make a philosophical argument about my commute being inefficient, but I’m asking for some non-rigorous leeway for case-by-case application here).

This actually isn’t trivial. Most tasks that are worth doing require the lion’s share of your attention, and juggling two or more often ensures that you do a haphazard job at all of them, and my life already seems sort of hyper-optimized. I work through lunch, I don’t sleep all that much, and I double up wherever I can. So additional ways to make this happen are real gems.

Anyone have additional suggestions or things that they do to make themselves more efficient? Your feedback is definitely welcome and solicited in comment form!