# Understanding the formal definition of Big-O

This is the third in a three post series. The first post explains Big-O from a self-taught programmer's perspective. The second post talks about how to calculate Big-O.

In my original Big-O piece, I’ve made a bunch of simplifying assumptions and told you a bunch of half truths. It might seem a bit disingenuous, but really.. these half-truths are helpful because (unless you’re going after a formal CS degree) they don’t really matter.

But for those who are looking to get a formal CS degree, these sorts of differences matter to professors and the like. For that, let’s actually parse what the formal definition means.

The formal definition via wikipedia:

Let f and g be two functions defined on some subset of the real numbers. One writes

f(x) = O(g(x)) as x -> infinity

if and only if there is a positive constant M such that for all sufficiently large values of x, the absolute value of f(x) is at most M multiplied by the absolute value of g(x). That is, f(x) = O(g(x)) if and only if there exists a positive real number M and real number x0 such that |f(x)| <= M * |g(x)| for all x >= x0.

Ugh. Let’s parse that.

We have two undefined functions called f and g. Our f function is the thing we’re trying to find the Big-O of. It’s a block of code.

The g function is an attempt to classify our `f`

function as `O(n)`

(or
some other `O`

, but we’ll use `O(n)`

for example purposes).

Following this definition, we’ll know that our f function is `O(n)`

if
and only if we can graph the two of these and (starting at an x value
of your choosing) the lines never cross one another.

You get to choose the x value to start at in order to account for the
step in calculating Big-O where we drop all of the constants. No one
cares if the Big-O is “~O(45n)~”. We drop that because it’s the same
shape of a curve, though a bit steeper. Because the definition above
says we can choose our own constant (`M`

), we could just choose 46. 46n
will always be above 45n and never cross once n=1.

The same can’t be true for smaller Big-O functions. No matter what
real-number (e.g. not infinity) constant we choose to multiply `log(n)`

by, `O(n)`

will always cross it at some point.

As always, graphing makes this a bit clearer. Here is an example of our 45n vs 46n example above.

set title "45n vs 46n" set xrange[0:10] set yrange[0:500] plot 45*x title "45n", 46*x title "46n"

You’ll see that once n=1, they never cross each other again. If we try
the 1,000,000 * `log(n)`

function versus a simple `O(n)`

, we’ll see they
cross each other around the x=16,000,000 mark.

set title "log n vs n" set xrange[10000000:20000000] set yrange[10000000:20000000] set label "n approx 16.5 million" at 16500000,16000000 plot 1000000 * log(x) title "O(1,000,000 * log(n))", x title "O(n)"

Because the red line is crossed, it means the green line doesn’t “bound” the red line (i.e. it is not a boundary).

You might notice that this definition means that the `O(45n)`

function
is also `O(n³)`

because it will also never cross. This is true, but not
particularly useful.