Stories about Software


Practical Math for Programmers: O Notation

It’s been a while since I’ve done one of these, and I didn’t want to let the series die, so here goes.

The “Why This Should Matter to You” Story

You’ve probably heard people you work with say thing like, “this is going to run in O of N squared, but mine is going to run in O of N, so we should use mine.” You understand that they’re talking about performance and, more specifically time, and that whatever N is, squaring it means more, and more is bad when it comes to time. From a realpolitik perspective, you’re aware that this O of N thing is something that lends programmers some cred if they know how to talk about it and from a career perspective, you understand that the Googles and Microsofts of the world seem interested in it.

But when your coworker talks about using his “O of N” solution, you don’t actually know what that means. And because of this, you tend to lose arguments. But beyond that, you’re also losing out on the ability to understand and reason about your code in a way that will let you see more pieces of the puzzle when weighing design tradeoffs. You could have a more intuitive understanding of how your code will perform in real world circumstances than you currently do.

Math Background

If you look at this method, do you know how long it takes to run? My guess is that you’d correctly and astutely say, “it depends on what you pass in for n.” In this case, assuming that the method called inside the loop does what’s advertised, the method will take n seconds: pass in 10 and it takes 10 seconds, but pass in 100 and it takes 100 seconds. The runtime varies in linearly with n. This said to run in O(N).

Let’s look at another example.

This one may be a little less intuitive, but you’ll definitely notice it’s going to be slower. How much? Well, it varies. If you pass in 2 for n, the outer loop will execute twice, and each time it does, the inner loop will execute twice, for a total of 2*2 = 4 executions. If you pass in 10, the outer loop will execute 10 times, with the inner loop executing ten times for each of those, so the total becomes 10*10 = 100 executions of the method. For each n, the value is n^2, so this is said to be O(n^2) runtime.

Let’s take a look at yet another example.

In this (poorly written because of the unused parameter) method, it doesn’t matter what you pass in for n. The algorithm will always execute in two seconds. So, this means that it’s O(2), right? Well, no, actually. As it turns out, a constant time algorithm is denoted in O notation by O(1). The reason for this is mainly convention, but it underscores an important point. A “constant time” algorithm may not be constant, per se, and it may not be one or any other specific number, but a “constant time” operation is one that is executed in an amount of time that is bounded by something independent of the problem at hand.

To understand what I mean here, consider the first two examples. Each of those runtimes depended on the input “n”. The third example’s runtime depends on something (the method called, for instance), but not n. Simplifying the runtime isn’t unique to constant time. Consider the following:

This code is identical to the second example, except that instead of innerIndex varying up to n, it varies only up to index. This is going to run in half the time that example two ran in, but we don’t say that it’s O(n^2/2). We simply say this is also O(n^2). Generally speaking, we’re only interested in the factor of n and not any coefficients of n. So we don’t talk about O(2n), O(n/12) or anything else, just n. Same goes for when n is squared or cubed, and the same thing goes for constant time — there’s no O(1/2) or O(26), just O(1).

One last thing to note about O notation (or Big O Notation, more formally) is that it constitutes a complexity upper bound. So, for example, if I stated that all of the examples were O(n^3), I’d technically be accurate, since all of them will run in O(n^3) or better. However, if you cite a higher order of complexity when discussing with others and then pedantically point out that you’re technically right, they’ll most likely not grasp what you’re talking about (and, if they do, they’ll probably want to stop talking to you quickly).

How It Helps You

In the same way that knowing about design patterns in software help you quickly understand a complex technique, O notation helps you understand and discuss how complex and resource intensive a solution is a function of its inputs. You can use this form of notation to talk about both algorithm runtime as well as memory consumed, and it gives you concrete methods of comparison, even in cases where you don’t know the size of the input.

For instance, if you’re doing some test runs on small set of data before pushing live to a web server and you have no way of simulating the high volume of users that it will have in the wild, this comes in very handy. Simple time trials of what you’re doing aren’t going to cut it because they’re going to be on way too small a scale. You’re going to have to reason about how your algorithm will behave, and O notation helps you do just that. On a small sample size, O(n^3) may be fine, but in production, it may grind your site to a halt. Better to know that going in.

O notation can also help you avoid writing terrible algorithms or solutions that perform wildly worse than others. For instance, consider the case of Fibonacci numbers. These are a sequence of numbers where Nth number is the last two numbers in the sequence added together: Fn = Fn-1 + Fn-2. The sequence itself is thus: 1, 1, 2, 3, 5, 8, 13, 21, etc.

Here is an elegant-seeming and crystal clear implementation of Fibonacci:

It certainly demonstrates recursion and it’s very understandable. But it also turns out to run in O(1.6^n), which is exponential runtime. We haven’t covered that one yet, but exponential runtime is catastrophic in your algorithms. Ask this thing for the 1000th Fibonacci number and come back at the end of time when it’s done completing your request. As it turns out, there is an iterative solution that runs in linear, O(n) time. You probably want that one.

This is the value of understanding O notation. Sure, without it you get that not all implementations are equally well performing and you may understand what makes some better than others. But understanding O notation gives you an easy way to remember and keep track of different approaches and how efficient they are and to communicate this information with others.

Further Reading

  1. Wikipedia
  2. A nice article for beginners
  3. Thorough explanation from stack overflow ‘wiki’
  4. Academic but approachable explanation from MIT