Analysis of algorithms

A good algorithm makes efficient use of time, space or both. There are two main ways to assess the efficiency of an algorithm - experimental analysis and theoretical analysis.

Experimental analysis involves implementing and running an algorithm to determine how much time and space it takes to run. This type of analysis requires the algorithm to be implemented, all types of input to be tested and system performance to be consistent. In reality it might not be possible to implement the algorithm, it is almost impossible to consider every type of input, and different systems, even those with the same hardware and software, will not necessarily have consistent run times.
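As a rough illustration, the sketch below (Python, using an illustrative `find_max` linear scan as the algorithm under test - the function name and input sizes are assumptions, not from these notes) times one implementation on inputs of increasing size. The measurements it produces depend on the machine, the interpreter and the inputs chosen, which is exactly the limitation described above.

```python
import random
import time

def find_max(values):
    """The algorithm under test: a simple linear scan for the maximum."""
    best = values[0]
    for v in values:
        if v > best:
            best = v
    return best

# Time the same implementation on inputs of increasing size.
for n in (1_000, 10_000, 100_000, 1_000_000):
    data = [random.random() for _ in range(n)]
    start = time.perf_counter()
    find_max(data)
    elapsed = time.perf_counter() - start
    print(f"n = {n:>9}: {elapsed:.6f} s")
```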

Theoretical analysis uses mathematical methods to determine the asymptotic run time of an algorithm. It uses an abstract description of the algorithm as opposed to a concrete implementation and characterises the time or space complexity as a function of the input size. This allows analysis of an algorithm independent of hardware and software.
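One way to make this concrete is to count primitive operations rather than measure wall-clock time. The sketch below (a hypothetical `find_max_with_count` helper, introduced here only for illustration) counts the comparisons performed by a linear scan; the count depends only on the input size $n$, not on the hardware or software it runs on.

```python
def find_max_with_count(values):
    """Linear scan that also reports how many comparisons it performs."""
    comparisons = 0
    best = values[0]
    for v in values:
        comparisons += 1
        if v > best:
            best = v
    return best, comparisons

# The comparison count depends only on the input size, not on the machine,
# so it characterises the running time as a function of n.
_, ops = find_max_with_count(list(range(1_000)))
print(ops)  # 1000
```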

Asymptotic analysis

The running time of an algorithm can be split into three cases - best, worst and average. The best case is the minimum time an algorithm takes to run and is not very useful on its own. The worst case is the maximum time an algorithm takes to run for any input. The average case describes how long the algorithm typically takes, somewhere between the best and worst cases, and can be difficult to determine.
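Linear search makes the three cases concrete. In the sketch below (a `linear_search` helper written for illustration), the best case is one comparison when the target is the first element, the worst case is a full scan when the target is absent, and the average case for a target at a uniformly random position is roughly half the list.

```python
def linear_search(values, target):
    """Return the index of target in values, or -1 if it is not present."""
    for i, v in enumerate(values):
        if v == target:
            return i   # best case: the target is first, one comparison
    return -1          # worst case: the target is absent, len(values) comparisons

data = [4, 8, 15, 16, 23, 42]
print(linear_search(data, 4))    # best case: found immediately
print(linear_search(data, 99))   # worst case: every element inspected
```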

The best, worst and average cases can be expressed using asymptotic notation. This ignores constant factors and lower-order terms in a function to focus on the dominant term, which determines its rate of growth.

Big-O notation

Big-O notation is used to characterise the worst case complexity of an algorithm. For two functions $f(n)$ and $g(n)$, $f(n)$ is $O(g(n))$ if there exist constants $c>0$ and $N\geq1$ such that $$ f(n) \leq cg(n)\quad\forall n \geq N $$ Big-O notation gives an upper bound on the growth rate of a function as $n$ tends towards infinity: $f(n)$ is $O(g(n))$ means the growth rate of $f(n)$ is no more than that of $g(n)$.
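The definition can be sanity checked numerically. The sketch below (with an assumed helper name `holds_for_sample`, not from these notes) tests whether chosen witnesses $c$ and $N$ satisfy $f(n) \leq cg(n)$ over a finite range of $n$; this does not prove the bound, but it quickly exposes a bad choice of constants.

```python
def holds_for_sample(f, g, c, N, n_max=10_000):
    """Return True if f(n) <= c * g(n) for every n in [N, n_max]."""
    return all(f(n) <= c * g(n) for n in range(N, n_max + 1))

# Example 1 below: 2n + 10 is O(n) with witnesses c = 3, N = 10.
print(holds_for_sample(lambda n: 2 * n + 10, lambda n: n, c=3, N=10))  # True

# Example 2 below: n^2 is not O(n) - any fixed c fails once n > c.
print(holds_for_sample(lambda n: n * n, lambda n: n, c=100, N=1))      # False
```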

Big-Omega

Big-Omega notation can be used to characterise the best case complexity of an algorithm. Formally, $f(n)$ is $\Omega(g(n))$ if there exist constants $c>0$ and $N\geq 1$ such that $$ f(n) \geq cg(n) \quad \forall n \geq N $$ This is the opposite of big-O and gives a lower bound on the growth rate of $f(n)$.

Big-Theta

Big-Theta notation gives a tight bound on the complexity of a function and is often used to describe the average case. Formally, $f(n)$ is $\Theta(g(n))$ if there exist constants $c',c'' > 0$ and $N \geq 1$ such that $$ c'g(n) \leq f(n) \leq c''g(n) \quad \forall n \geq N $$ Big-Theta gives both an upper and a lower bound on the growth rate of $f(n)$ and is effectively a combination of big-O and big-Omega.
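The same numerical sanity check extends to the two-sided bound. The sketch below (an assumed `theta_witnesses_hold` helper, mirroring the Big-O check above) tests whether $c'g(n) \leq f(n) \leq c''g(n)$ over a sampled range of $n$.

```python
def theta_witnesses_hold(f, g, c_lo, c_hi, N, n_max=10_000):
    """Check c_lo * g(n) <= f(n) <= c_hi * g(n) for every n in [N, n_max]."""
    return all(c_lo * g(n) <= f(n) <= c_hi * g(n) for n in range(N, n_max + 1))

# Example 6 below: 5n^2 is Theta(n^2), e.g. with c' = 1, c'' = 5, N = 1.
print(theta_witnesses_hold(lambda n: 5 * n * n, lambda n: n * n, 1, 5, N=1))  # True

# n^2 is not Theta(n): the upper bound n^2 <= c''n fails once n > c''.
print(theta_witnesses_hold(lambda n: n * n, lambda n: n, 1, 100, N=1))        # False
```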

Examples

  1. $2n+10$ is $O(n)$ because $$ \begin{aligned} 2n + 10 &\leq cn\\ (c-2)n &\geq 10\\ n &\geq \frac{10}{c-2} \end{aligned} $$ for any $c > 2$. So the constants could be $c=3$ and $N = 10$, or any other values for which the inequality holds.

  2. $n^2$ is not $O(n)$ because $$ \begin{aligned} n^2 &\leq cn\\ n &\leq c \end{aligned} $$ and this inequality can't be satisfied because $c$ is a constant.

  3. $3n^3 + 20n^2 + 5$ is $O(n^3)$ because $$ \begin{aligned} 3n^3 + 20n^2 + 5 &\leq cn^3\\ 3 + \frac{20}{n} + \frac{5}{n^3} &\leq c \end{aligned} $$ which holds when $c = 4$ and $n \geq 21$ as $$ \frac{20}{n} + \frac{5}{n^3} \leq 1 $$ for all $n \geq 21$.

  4. $5n^2$ is $\Omega(n^2)$ because $$ 5n^2 \geq cn^2 $$ for $c = 5$, $n \geq 1$.

  5. $5n^2$ is $\Omega(n)$ because $$ 5n^2 \geq cn $$ for $c = 1$ and $n \geq 1$.

  6. $5n^2$ is $\Theta(n^2)$ as it is both $O(n^2)$ and $\Omega(n^2)$.

  7. $5n^2 + 3n\log n + 2n + 5$ is $O(n^2)$ because $$ 5n^2 + 3n\log n + 2n + 5 \leq (5+3+2+5)n^2 = cn^2 $$ when $c = 15$ and $n \geq 1$.

  8. $3\log n + 2$ is $O(\log n)$ because $$ 3\log n + 2 \leq c\log n $$ when $c = 5$ and $n \geq 2$. Note that $n = 1$ will give $\log n = 0$ so $n \geq 2$ is necessary.

  9. $2^{n+1}$ is $O(2^n)$ because $$ 2^{n+1} = 2 \cdot 2^n \leq c \cdot 2^n $$ when $c = 2$ and $n \geq 1$.

  10. $3n\log n -2n$ is $\Omega(n\log n)$ because $$ 3n \log n - 2n = n\log n +2n(\log n - 1) \geq cn \log n $$ when $c = 1$ and $n \geq 2$.

  11. $(n+1)^5$ is $O(n^5)$ because using the binomial expansion $$ (n+1)^5 = \binom{5}{5}n^5 + \binom{5}{4}n^4 + \dots + 1 \leq cn^5 $$ when $c = \sum_{k=0}^5 \binom{5}{k} = 32$ and $n \geq 1$.

  12. $n$ is $O(n\log n)$ because $$ \begin{aligned} n &\leq cn\log n\\ 1 &\leq c\log n \end{aligned} $$ when $c = 4$ and $n \geq 2$.

  13. $n^2$ is $\Omega(n\log n)$ because $$ \begin{aligned} n^2 &\geq cn \log n\\ n &\geq c\log n \end{aligned} $$ when $n \geq 1$ and $c = 1$.

  14. $n \log n$ is $\Omega(n)$ because $$ \begin{aligned} n\log n &\geq cn\\ \log n &\geq c \end{aligned} $$ when $c=1$ and $n \geq 10$.

Conclusion

  • $f(n)$ is $O(g(n))$ if $f(n)$ is asymptotically less than or equal to $g(n)$

  • $f(n)$ is $\Omega(g(n))$ if $f(n)$ is asymptotically greater than or equal to $g(n)$

  • $f(n)$ is $\Theta(g(n))$ if $f(n)$ is asymptotically equal to $g(n)$