Understanding the formal definition of Big-O
This is the third in a three post series. The first post explains Big-O from a self-taught programmer's perspective. The second post talks about how to calculate Big-O.
In my original Big-O piece, I’ve made a bunch of simplifying assumptions and told you a bunch of half truths. It might seem a bit disingenuous, but really.. these half-truths are helpful because (unless you’re going after a formal CS degree) they don’t really matter.
But for those who are looking to get a formal CS degree, these sorts of differences matter to professors and the like. For that, let’s actually parse what the formal definition means.
The formal definition via wikipedia:
Let f and g be two functions defined on some subset of the real numbers. One writes
f(x) = O(g(x)) as x -> infinity
if and only if there is a positive constant M such that for all sufficiently large values of x, the absolute value of f(x) is at most M multiplied by the absolute value of g(x). That is, f(x) = O(g(x)) if and only if there exists a positive real number M and real number x0 such that |f(x)| <= M * |g(x)| for all x >= x0.
Ugh. Let’s parse that.
We have two undefined functions called f and g. Our f function is the thing we’re trying to find the Big-O of. It’s a block of code.
The g function is an attempt to classify our f
function as O(n)
(or
some other O
, but we’ll use O(n)
for example purposes).
Following this definition, we’ll know that our f function is O(n)
if
and only if we can graph the two of these and (starting at an x value
of your choosing) the lines never cross one another.
You get to choose the x value to start at in order to account for the
step in calculating Big-O where we drop all of the constants. No one
cares if the Big-O is “~O(45n)~”. We drop that because it’s the same
shape of a curve, though a bit steeper. Because the definition above
says we can choose our own constant (M
), we could just choose 46. 46n
will always be above 45n and never cross once n=1.
The same can’t be true for smaller Big-O functions. No matter what
real-number (e.g. not infinity) constant we choose to multiply log(n)
by, O(n)
will always cross it at some point.
As always, graphing makes this a bit clearer. Here is an example of our 45n vs 46n example above.
You’ll see that once n=1, they never cross each other again. If we try
the 1,000,000 * log(n)
function versus a simple O(n)
, we’ll see they
cross each other around the x=16,000,000 mark.
Because the red line is crossed, it means the green line doesn’t “bound” the red line (i.e. it is not a boundary).
You might notice that this definition means that the O(45n)
function
is also O(n)
because it will also never cross. This is true, but not
particularly useful.