Understanding the formal definition of Big-O

17 Apr 2018

This is the third in a three post series. The first post explains Big-O from a self-taught programmer's perspective. The second post talks about how to calculate Big-O.

In my original Big-O piece, I’ve made a bunch of simplifying assumptions and told you a bunch of half truths. It might seem a bit disingenuous, but really.. these half-truths are helpful because (unless you’re going after a formal CS degree) they don’t really matter.

But for those who are looking to get a formal CS degree, these sorts of differences matter to professors and the like. For that, let’s actually parse what the formal definition means.

The formal definition via wikipedia:

Let f and g be two functions defined on some subset of the real numbers. One writes

f(x) = O(g(x)) as x -> infinity

if and only if there is a positive constant M such that for all sufficiently large values of x, the absolute value of f(x) is at most M multiplied by the absolute value of g(x). That is, f(x) = O(g(x)) if and only if there exists a positive real number M and real number x0 such that |f(x)| <= M * |g(x)| for all x >= x0.

Ugh. Let’s parse that.

We have two undefined functions called f and g. Our f function is the thing we’re trying to find the Big-O of. It’s a block of code.

The g function is an attempt to classify our f function as O(n) (or some other O, but we’ll use O(n) for example purposes).

Following this definition, we’ll know that our f function is O(n) if and only if we can graph the two of these and (starting at an x value of your choosing) the lines never cross one another.

You get to choose the x value to start at in order to account for the step in calculating Big-O where we drop all of the constants. No one cares if the Big-O is “~O(45n)~”. We drop that because it’s the same shape of a curve, though a bit steeper. Because the definition above says we can choose our own constant (M), we could just choose 46. 46n will always be above 45n and never cross once n=1.

The same can’t be true for smaller Big-O functions. No matter what real-number (e.g. not infinity) constant we choose to multiply log(n) by, O(n) will always cross it at some point.

As always, graphing makes this a bit clearer. Here is an example of our 45n vs 46n example above.

You’ll see that once n=1, they never cross each other again. If we try the 1,000,000 * log(n) function versus a simple O(n), we’ll see they cross each other around the x=16,000,000 mark.

Because the red line is crossed, it means the green line doesn’t “bound” the red line (i.e. it is not a boundary).

You might notice that this definition means that the O(45n) function is also O(n) because it will also never cross. This is true, but not particularly useful.