## Big O(H)! Time and Space Complexity!

June 25, 2015

A fundamental concept in computer science is to be able to understand the time and space complexities of algorithms. Usually, in the second year of University or the first chapter of any introductory Data Structures and Algorithms textbook, the first major concept is to be able to recognize and interpret how "expensive" in time an algorithm executes (think milliseconds, seconds, minutes, hours, etc), or how much memory that input will cause a particular algorithm to use (think bytes, kilobytes, megabytes, etc).

Do we have an idea if the algorithm will take 1 second to run? 10 seconds? 10 days? Does the order of magnitude of time taken to execute an algorithm increase dramatically as our input increases? We call this “runtime analysis”, or more commonly, "Big-O analysis".

### What is Big-O?

Big-O runtime analysis helps us understand the magnitude typically, in time an algorithm can take to finish at
its worst case. In programming, we usually assume the "worst" and will usually consider
the largest cost when determining whether a solution is worth implementing. Keep in mind,
a Big-O number is *not an exact* measurement to determine the amount of time an algorithm
will run to completion, but rather, the relative order of magnitude in time.
We can also think of this as a *general approximation*.

This is similar to let us say, determining the amount of salary you make per year. We usually have terms such as "five figures", "six figures", "seven figures" and so on. So if I was a professional athlete with a multi-million dollar contract, I would tell others that my salary is within the "seven figure" range. This could be 10,000,000 or even 50,000,000 dollars. People will understand the amount of money I make a year. As such, a similar concept applies to Big-O analysis. We're interested in finding out whether our algorithm is going to complete in a matter of seconds, or if we should just get some lunch before expecting results after a test run. :)

This is important because it allows us to compare algorithms, and analyze which implementation is better efficiently in realm of time and space. This becomes particularly important when either of these two resources are limited — (i.e, on embedded devices, or real time software.)

### Terminology

In Big-O the determinant of time is dependent on our growth function. Below are several growth functions which are used quite common when performing algorithmic analysis:

- O(1) - "constant"
- O(log N) - "logarithmic"
- O(N) - "linear"
- O(N log N) - "super-linear"
- O(N^c) - "exponential"
- O(N!) - "factorial"

Of course keep in mind we can also have growth functions such as N^(3/2) and wacky things such as Sqrt(N). O(N^(3/2)) and O(Sqrt(N)) respectively.

A **growth function** will help us figure out that given an N, how how much time
and space would the result consume? In Big-O analysis, the N is usually the size of our input,
while the growth function here is defined as g(x), gives us the amount of time our algorithm will
need to complete its task.

Two functions are of the same function class if the largest term is within the same degree for
some arbitrary large number of N. Function classes tell us the relative growth rate of the runtime
complexity. If one function produces a result that shows a linear growth, while another
shows exponential growth, then they are in *different* function classes.

For example the following two are of the same function class:

f(n) = 1.23 * N g(n) = 141588 * N

Though, we can see that **g(n)** will give a much larger result than **f(n)**,
because their degrees are the same, both will generally output the same order of magnitude.
At some point, g(n) will dominate over f(n) in run time if both exist in our algorithm.

So let's try some examples in quantifying our run times:

N = 30

O(1) = 1

O(log 30) = 1.4771

O(30 log 30) => 4.4314

O(30^2) = 900

O(30^3) = 27,000

O(2^30) = 1,073,741,824

O(30!) = 2.6525 * 10^32

As you can see here, with a small input of 30, our growth functions give us magnitudes of time that are relatively small at first, but as the function increases to exponential, the values quickly grow at a very high rate.

We can see that when N grows larger, certain growth functions will grow VERY, VERY FAST to the point where it is uncontrollable as seen with the 2-base exponential and factorial functions.

If we were to arbitrarily say that the growth function unit was in SECONDS given N input, some of us would be very old by the time an O(2^N) algorithm completes.

Also noted here is that having an N of 1,000,000 is entirely possible, and is quite common. Lots of applications have use cases where millions of input is thrown at it. After all, we would not be programming a routine to perform an operation if our sample size is usually counted on our fingers would we? I would imagine such small inputs would be easier to just be calculated by hand!

### Finding th Big-O Complexity of an Algorithm

Now that we are familiar with the Big-O concept, how do we actually determine the growth function of an algorithm?

The Big-O complexity correlates with the amount of operations a routine runs. If we had an input size of N, and our routine processes N items, then our Big-O complexity would be O(N). Similarly, if we had an input of N, and our routine made N passes for each item in our sample, that is 0, 1, 2, ..., N, our routine would be Big-O of O(N^2).

To throw a curve ball, if we had 2 data sets and wanted to enumerate through one data set of M items, and for each item make a comparison with all the items in the data set with N items, what growth function would we have? That would be O(M*N).

So what have we noticed here? The first situation reminds us of a linear loop, while the second and third situation pretty much defines a nested loop! Let's look at some code examples.

Note, a lot of these code samples don't really do anything useful, they are written in ways to illustrate my point.

### O(1)

void SetSalary(int[] salaryList) { personSalary = salaryList[0]; }

The above code snippet is O(1) because no matter how big the input of N is
(i.e. an array of N items), we will always just perform one constant operation in the routine.
In this case, we just set the `personSalary`

variable to the first item of
the `salaryList`

array.

So when we say an algorithm runs in O(1), it means that we should always expect our algorithm to complete in some sort of constant time no matter how large the input is. Larger inputs do not make our run time grow.

### O(log N)

int BinarySearch(int[] data, int start, int end, key) { if(start > end) { return -1; } else { int mid = start + ((end - start) / 2); if(key < data[mid]) { return BinarySearch(data, start, mid - 1); } else if(key > data[mid]) { return BinarySearch(data, mid + 1, end); } else { return mid; } } }

Binary search is a popular O(log N) algorithm. When run time of an algorithm is said to be O(log N),
we are saying that for each call, we are going through roughly *half* the amount of data in total
as compared to the previous call.

With binary search, we are given a key to search for along with a start and end index with a sorted data set. This is usually an array. Based on the start and end indices, we can make a determination of the mid point and perform comparison against the mid point and the key in a sorted array to let us know which direction to take on the next iteration to find the key.

For example, if our key was less than the mid point, then the next iteration would only search against the start index up until the mid point - 1. Conversely, if the key was greater than the mid point, we would start our search from the index mid point + 1 to the end index. Every subsequent iteration of the search will then use the new start and end indices to calculate the new mid point and make a comparison — thus halving the amount of data to be considered. When the mid point is equal to the key value, then we have found our key — otherwise, it does not exist.

*
For O(log N) we are iterating through half the amount of the data in each iteration.
*

### O(N)

void PrintSalaries(int[] salaryList) { int i; for(i = 0; i < TOTAL_SALARIES; i++) { printf("%d\n", salaryList[i]); } }

As we can see here, given an array with N items, we have a loop that goes through each item in the passed array. The algorithm above is O(N) because given N items, we go through N items in the data set in our routine.

*
We can simply say that we enumerate each item once in the data set N times.
*

### O(N^2)

void SortSalaries(int[] salaryList) { int i, j; for(i = 0; i < TOTAL_SALARIES; i++) { for(j = 0; j < (TOTAL_SALARIES - i); j++) { if(salaryList[j] > salaryList[j + 1]) { int temp = salaryList[j]; salaryList[j] = salaryList[j + 1]; salaryList[j + 1] = temp; } } } PrintSalaries(salaryList); }

The above here is a good example of an O(N^2). We perform a sort operation on an array. Though inefficient, this sorting routine is O(N^2) at worst in that we perform N^2 operations sorting each element in an array. Don't worry too much about the implementation here. Just know that we go through the array N^2 times during the sort. If we want to be really specific here, the Big-O runtime is actually O(N^2 + N) due to the extra printSalaries call at the end of the function! We will discuss more about dropping constants and insignificant terms in the next section.

*
In summary an O(N^2) run time means for each item in our data set we are enumerating over
N more values in the same or another data set. Giving us N total operations multiplied by
N total operations.
*

Suppose we want to find all the combinations of 5 pairs of socks. We will take each sock and pair it up with another sock giving us a total of 25 different combinations. The growth function of something like O(N^3) works similarly: we are enumerating a data set and going through 2 others to find every possible permutation.

Almost always, we try to avoid O(N^2) when we can, as it can be deceptively slow once the amount of input increases to a very large amount.

### O(2^N)

int fib(int n) { if(n == 0 || n == 1) { return n; } else { return fib(n - 2) + fib(n - 1); } }

I have just included a O(2^N) algorithm here for fun. The above function calculates a fibonacci number recursively. One call to this function is already causes us to call the function again (n-2) and then (n-1) times. Repeat this multiple times and for N times, and we will have something like this:

```
FIB(N-1) + FIB(N-2) => FIB(N-2) + FIB(N-3) => FIB(N-3) + FIB(N – 4) ...
```

We make each call to the function TWICE, and each of the function call we process things N-X times. So eventually each child call will cause the subsequent recursive calls to the amount of 2^N.

### Dropping Constants and Terms

When coming up with a growth function, it is usually more practical to give a generalized growth function rather than something very specific. Since we are interested in the order of magnitude of the worst case running time of an algorithm, we do not need to concern ourselves with constants and smaller terms within the growth function if we assume that N is some large number.

In practicality, you would never tell your friend that the going from Los Angeles to Phoenix is about 372.1 miles. You’d be more likely to say Los Angeles to Phoenix is "about 350 miles or so". That is the same train of thought in determining Big-O complexity.

For example, suppose we have a growth function of O((3/2)*N^3 + 2N^2 + N) and N = 1,000,000. If we want to be specific, let’s say this example in dropping insignificant terms looks like this:

O((3/2)*N^3 + 2N^2 + N) => ~ O(N^3 + N^2 + N) => ~ O(N^3) O(N^3)

To prove that there is not much of an order of magnitude difference between O((3/2)*N^3 + 2N^2 + N) and O(N^2) when N = 1,000,000, let’s evaluate:

O((3/2)*N^3 + 2N^2 + N) when N = 1000000 = O((3/2)*(1000000)^3) + 2*(1000000)^2 + 1000000) = 1.5 x 10^18 O(N^3) = O(1000000^3) = 1 x 10^18

We can now see that the order of magnitude for both growth functions are both 10^18. Therefore, both growth functions perform relatively the same amount of time in the bigger picture!

### Factoring Space

Big-O isn't just limited to run time. We can also use it to refer to space consumption an algorithm uses for its operations.

For example, if a bracket validator is implemented using a stack to keep track of well-formed brackets in a text file, we are essentially using a stack to keep track of N brackets at any given moment in time. This means we are using O(N) memory.

Single variables use O(1), while things such as matrices (2D arrays) would use in the order of O(N^2) of space.

It is just good to know that when referring to Big-O complexity, we are not just speaking in terms of run time, but we can make the notion of space too!

### Big-Theta, Big-Omega

We are not only limited to Big-O analysis, but we also have concepts of Big-Theta and Big-Omega. Though, I am not going to cover these in much detail here as I will probably save these concepts for a later discussion. Just know that Big-Omega is the lower bound, of a growth function and Big-Theta is essentially the intersection, or in this case "in-between" of Big-Omega and Big-O. In laymen's terms, the best case of the worst and the average case of the worst respectively.

### Realistic Expectations and Conclusion

When optimizing an algorithm, we tend to get caught up with trying to make our algorithm faster and faster, we tend to forget if it is even technically feasible an algorithm can go faster, or if rewriting an algorithm to be faster would result in a negative impact in code maintainability. If the factors above are true, then in my opinion, it is best an algorithm is left alone. Especially when it is technical complexity is non-trivial.

The rule of thumb I follow for the above is if the algorithm works in a real world scenario, and is sufficient for the end user and is non-mission critical --- focus on the bigger picture and write better software overall or stabilize code elsewhere with your time.

That's all for today! If you made it through this article, I hope you enjoyed reading it!