Matrices
A matrix is a rectangular array of objects. The order of a matrix is
$m \times n$ where $m$ is the number of rows and $n$ is the number of
columns. The sum of two matrices of the same order is found by adding
the corresponding elements.
Matrices can only be multiplied if the number of columns in the first is
equal to the number of rows in the second. The product
$AB = [c_{ij}]_{m \times n}$ of $A=[a_{ij}]_{m \times p}$ and
$B = [b_{ij}]_{p \times n}$ is the $m \times n$ matrix where the
$ij^\text{th}$ element of $AB$ is the scalar product of the
$i^\text{th}$ row vector of $A$ with the $j^\text{th}$ column vector of
$B$.
$c_{ij} = \sum_{r=1}^{p} a_{ir}b_{rj} = (a_{i1}, a_{i2}, \dots, a_{ip}) \cdot (b_{1j}, b_{2j}, \dots, b_{pj})$.
A square matrix has the same number of rows and columns. The products $AB$ and $BA$ both exist only when $A$ is $m \times p$ and $B$ is $p \times m$; both products are then square (of orders $m$ and $p$ respectively). Matrix multiplication is not commutative: in general, $AB \neq BA$.
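As a quick check, here is a minimal sketch of the shape rule and non-commutativity (assuming NumPy; the notes themselves are language-agnostic):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])      # a 2 x 3 matrix
B = np.array([[1, 0],
              [0, 1],
              [2, 2]])         # a 3 x 2 matrix

print(A + A)   # sum of matrices of the same order: corresponding elements added
print(A @ B)   # 2 x 2 product: columns of A (3) match rows of B (3)
print(B @ A)   # 3 x 3 product: both AB and BA exist since A is m x p, B is p x m
```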
A diagonal matrix is a square matrix in which all off-diagonal elements are zero. The identity matrix $I$ is the diagonal matrix whose diagonal elements are all 1.
Properties of matrix multiplication
- $(AB)C = A(BC)$
- $A(B+C) = AB+AC$
- $(A+B)C = AC+BC$
- $IA = A = AI$
- $OA = O = AO$
- $A^pA^q = A^{p+q} = A^qA^p$
- $(A^p)^q = A^{pq}$
Transpose
The transpose, $A^T$, of a matrix $A$ is obtained by interchanging the
rows and columns.
Transpose properties:
- $(A^T)^T = A$
- $(A+B)^T = A^T + B^T$
- $(\lambda A)^T = \lambda A^T$
- $(AB)^T = B^TA^T$
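A small sketch illustrating the reversal rule $(AB)^T = B^TA^T$ (again assuming NumPy):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [5, 2]])

print(np.array_equal((A @ B).T, B.T @ A.T))  # True: note the reversed order
print(np.array_equal((A @ B).T, A.T @ B.T))  # False in general
```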
Matrix inverse
Matrix $B$ is the inverse of matrix $A$ if $A$ and $B$ are square matrices of the same order and $AB=I=BA$. Not all square matrices have an inverse: if the determinant of a matrix is 0, it has no inverse.
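A sketch of both cases, assuming NumPy's `linalg` module:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])   # determinant -2, so A is invertible
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # determinant 0, so S has no inverse

B = np.linalg.inv(A)
print(np.allclose(A @ B, np.eye(2)))  # True: AB = I
print(np.allclose(B @ A, np.eye(2)))  # True: BA = I
# np.linalg.inv(S) would raise LinAlgError: the matrix is singular
```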
Linear equations
Linear simultaneous equations can be solved with matrices.
Consider $ax+by=p$ and $cx+dy=q$. This can be expressed as matrices:
$$ \begin{bmatrix} a & b\\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} p\\ q \end{bmatrix} $$ If the coefficient matrix is invertible, the solution is obtained by multiplying both sides by its inverse: $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a & b\\ c & d \end{bmatrix}^{-1} \begin{bmatrix} p\\ q \end{bmatrix}$.
Matrices $A$ and $B$ are row equivalent ($A \sim B$) if $A$ can be transformed to $B$ using a finite number of elementary row operations. Systems of equations can also be solved by writing them as an augmented matrix $[A \mid b]$ and row reducing.
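For example, the system $x + 2y = 5$, $3x + 4y = 6$ can be solved like this (a sketch assuming NumPy; `np.linalg.solve` is preferable to forming the inverse explicitly):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])   # coefficient matrix
b = np.array([5.0, 6.0])     # right-hand side

v = np.linalg.solve(A, b)
print(v)                                      # [-4.   4.5], i.e. x = -4, y = 4.5
print(np.allclose(np.linalg.inv(A) @ b, v))   # True: same answer via the inverse
```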
Row echelon form
A matrix is in row echelon form if the first nonzero entry in each row is further to the right than the first nonzero entry in the previous row (and any all-zero rows come last).
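For example, the matrix below is in row echelon form: each leading entry sits strictly to the right of the leading entry in the row above.
$$ \begin{bmatrix} 2 & 1 & 3\\ 0 & 5 & 4\\ 0 & 0 & 1 \end{bmatrix} $$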
Elementary matrices
Elementary row operations can be performed by multiplying a matrix on the left by a suitable elementary matrix.
$$ \begin{bmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} a & b & c\\ d & e & f\\ g & h & i \end{bmatrix} = \begin{bmatrix} d & e & f\\ a & b & c\\ g & h & i \end{bmatrix} $$ The first matrix causes the first row to become the second and the second to become the first.
$$ \begin{bmatrix} \lambda & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} a & b & c\\ d & e & f\\ g & h & i \end{bmatrix} = \begin{bmatrix} \lambda a & \lambda b & \lambda c\\ d & e & f\\ g & h & i \end{bmatrix} $$ This matrix causes the first row to be multiplied by $\lambda$. $$ \begin{bmatrix} 1 & \mu & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} a & b & c\\ d & e & f\\ g & h & i \end{bmatrix} = \begin{bmatrix} a + \mu d & b + \mu e & c + \mu f\\ d & e & f\\ g & h & i \end{bmatrix} $$ This elementary matrix adds $\mu$ lots of row 2 to row 1.
Each elementary matrix is obtained from the identity matrix by performing a single elementary row operation on it.
Generally, the elementary matrices are defined as follows:
- $E_{ij}$ is obtained from the identity matrix by swapping rows $i$ and $j$
- $E_i(\lambda)$ is obtained from $I$ by multiplying the entries in row $i$ by $\lambda$
- $E_{ij}(\mu)$ is obtained from $I$ by adding $\mu$ times row $j$ to row $i$
These correspond to the examples above.
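The same three operations as NumPy code (a sketch; each elementary matrix is built directly from the identity):

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3).astype(float)  # [[1,2,3],[4,5,6],[7,8,9]]

E_swap = np.eye(3)[[1, 0, 2]]        # E_12: identity with rows 1 and 2 swapped
E_scale = np.diag([2.0, 1.0, 1.0])   # E_1(2): row 1 of I multiplied by 2
E_add = np.eye(3)
E_add[0, 1] = 3.0                    # E_12(3): adds 3 times row 2 to row 1

print(E_swap @ A)    # rows 1 and 2 of A interchanged
print(E_scale @ A)   # row 1 of A doubled
print(E_add @ A)     # row 1 of A replaced by row 1 + 3 * row 2
```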
Every elementary matrix has an inverse, which is itself an elementary
matrix: $E_{ij}^{-1} = E_{ij}$, $E_i(\lambda)^{-1} = E_i(1/\lambda)$ for
$\lambda \neq 0$, and $E_{ij}(\mu)^{-1} = E_{ij}(-\mu)$. This is proven in
the lecture slides.
Determinants and inverse matrices
The determinant of a $2 \times 2$ matrix $\begin{bmatrix}
a & b\\
c & d
\end{bmatrix}$ is $ad-bc$.
$\operatorname{det}(AB) = \operatorname{det}(A)\operatorname{det}(B)$: the
determinant of a product is the product of the determinants, and this
extends to the product of any number of matrices.
The determinant of a $3 \times 3$ matrix $\begin{bmatrix}
a & b & c\\
d & e & f\\
g & h & i
\end{bmatrix} = a \begin{vmatrix}
e & f\\
h & i
\end{vmatrix} - b \begin{vmatrix}
d & f\\
g & i
\end{vmatrix} + c \begin{vmatrix}
d & e\\
g & h
\end{vmatrix}$
It is possible to rearrange the $3 \times 3$ determinant to get
coefficients from the second and third rows.
Second row: $-d \begin{vmatrix}
b & c\\
h & i
\end{vmatrix} + e \begin{vmatrix}
a & c\\
g & i
\end{vmatrix} - f \begin{vmatrix}
a & b\\
g & h
\end{vmatrix}$
Third row: $g \begin{vmatrix}
b & c\\
e & f
\end{vmatrix} -h \begin{vmatrix}
a & c\\
d & f
\end{vmatrix} + i \begin{vmatrix}
a & b\\
d & e
\end{vmatrix}$
The signs of the determinants follow the pattern
$$
\begin{bmatrix}
+ & - & +\\
- & + & -\\
+ & - & +
\end{bmatrix}
$$
The determinant of the transpose is the same as the determinant of the matrix: $|A^T| = |A|$.
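A sketch checking $|A^T| = |A|$ and $|AB| = |A||B|$ numerically (assuming NumPy):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])
B = np.diag([1.0, 2.0, 3.0])   # determinant 6

print(np.linalg.det(A))                                   # approximately -3
print(np.isclose(np.linalg.det(A), np.linalg.det(A.T)))   # True: |A^T| = |A|
print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))    # True: |AB| = |A||B|
```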
Minors and cofactors of $n \times n$ matrices
The $ij$th minor $M_{ij}$ of an $n \times n$ matrix $A = [a_{ij}]$ is the determinant of the $(n-1)\times(n-1)$ matrix obtained by deleting the $i$th row and $j$th column.
The $ij$th cofactor $A_{ij}$ of $A$ is defined by $A_{ij} = (-1)^{i+j}M_{ij}$.
The determinant of an $n \times n$ matrix $A = [a_{ij}]$ can be computed by expanding along any fixed row $i$: $|A| = \sum_{j=1}^{n} a_{ij} A_{ij}$, where $A_{ij}$ is the $ij$th cofactor of $A$. Expanding along a column works in the same way.
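The definition translates directly into a (deliberately naive) recursive implementation. This sketch expands along the first row and assumes NumPy only for array slicing:

```python
import numpy as np

def det_by_cofactors(A: np.ndarray) -> float:
    """Laplace expansion along the first row: |A| = sum_j a_1j * A_1j."""
    n = A.shape[0]
    if n == 1:
        return float(A[0, 0])
    total = 0.0
    for j in range(n):
        # Minor M_1j: delete row 1 and column j (0-indexed here)
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        cofactor = (-1) ** j * det_by_cofactors(minor)  # sign (-1)^{i+j}, i fixed
        total += A[0, j] * cofactor
    return total

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])
print(det_by_cofactors(A))   # -3.0, matching np.linalg.det(A)
```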
Adjoint and inverse matrices
The matrix of cofactors is the matrix obtained by replacing each element
of the matrix with its cofactor.
The adjoint of a matrix, $\operatorname{adj}(A)$, is the transpose of its
matrix of cofactors.
A matrix can be inverted if its determinant is non-zero.
The inverse of a matrix is
$A^{-1} = \frac{1}{|A|} \operatorname{adj}(A)$.
The inverse matrix can be used to solve simultaneous equations: if $Ax=b$ and $A$ is invertible, then $x = A^{-1}b$.
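A sketch of the adjoint construction and of $A^{-1} = \operatorname{adj}(A)/|A|$ in code (assuming NumPy; `adjoint` is a hypothetical helper, not a NumPy function):

```python
import numpy as np

def adjoint(A: np.ndarray) -> np.ndarray:
    """Transpose of the matrix of cofactors."""
    n = A.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

A = np.array([[2.0, 1.0],
              [5.0, 3.0]])                   # determinant 1
A_inv = adjoint(A) / np.linalg.det(A)        # A^{-1} = adj(A) / |A|
print(np.allclose(A_inv, np.linalg.inv(A)))  # True
```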
Determinants continued
If $B$ is the matrix obtained from $A$ by
- multiplying a row of $A$ by a number $\lambda$, then $|B| = \lambda |A|$
- interchanging two rows of $A$, then $|B| = -|A|$
- adding a multiple of one row to another, then $|B| = |A|$.
Since $|A|=|A^T|$, performing elementary row operations on $A^T$ is equivalent to performing the corresponding column operations on $A$.
A set of $n$ vectors in $\mathbb{R}^n$ is linearly independent (and hence a basis) if and only if it is the set of column vectors of a matrix with nonzero determinant.
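A quick numerical test of this criterion (assuming NumPy):

```python
import numpy as np

# Place the vectors as columns of a matrix and test the determinant.
M = np.column_stack([np.array([1.0, 0.0, 1.0]),
                     np.array([2.0, 1.0, 0.0]),
                     np.array([0.0, 1.0, 1.0])])
print(np.linalg.det(M))   # 3.0: nonzero, so the vectors form a basis of R^3
```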
Linear transformations
A function $T:\mathbb{R}^m \rightarrow \mathbb{R}^n$ is a linear
transformation if $T(u+v)=T(u)+T(v)$ and $T(\lambda u) = \lambda T(u)$
for all $u,v \in \mathbb{R}^m$ and all $\lambda \in \mathbb{R}$.
If $T$ is a linear transformation then $T(0)=0$, so a function with $T(0) \neq 0$ cannot be linear (the converse does not hold: $T(0)=0$ alone does not guarantee linearity).
Consider $T(x,y) = (x+y, x-y)$. Let $a=(a_1,a_2)$ and $b = (b_1,b_2)$.
$T(a) = (a_1+a_2, a_1-a_2)$ and $T(b) = (b_1+b_2, b_1-b_2)$.
Hence
$T(a+b) = T(a_1+b_1, a_2+b_2) = (a_1+b_1+a_2+b_2, a_1+b_1-a_2-b_2) = (a_1+a_2, a_1-a_2) + (b_1+b_2, b_1-b_2) = T(a) + T(b)$.
It is also true that $T(\lambda a) = \lambda T(a)$ so $T$ is a linear
transformation.
Consider $T(x,y) = (x+1, y-1)$. $T((0,0)) = (1,-1) \neq (0,0)$ so $T$ is not a linear transformation.
Consider $T(x,y) = (x^2,y^2)$. This is not a linear transformation because $T(\lambda a) \neq \lambda T(a)$ in general: for example, $T(2,2) = (4,4)$ but $2\,T(1,1) = (2,2)$.
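These checks are easy to run numerically; a sketch assuming NumPy, with `T` linear and `S` not:

```python
import numpy as np

def T(v: np.ndarray) -> np.ndarray:
    return np.array([v[0] + v[1], v[0] - v[1]])   # T(x, y) = (x + y, x - y)

def S(v: np.ndarray) -> np.ndarray:
    return v ** 2                                 # S(x, y) = (x^2, y^2)

a = np.array([1.0, 2.0])
b = np.array([3.0, -1.0])
lam = 5.0

print(np.allclose(T(a + b), T(a) + T(b)))   # True: additivity holds
print(np.allclose(T(lam * a), lam * T(a)))  # True: homogeneity holds
print(np.allclose(S(lam * a), lam * S(a)))  # False: S is not linear
```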
Projection
We define the projection of $x \in \mathbb{R}^2$ onto a nonzero vector $u \in \mathbb{R}^2$ to be the vector $P_u(x)$ with the properties that $P_u(x)$ is a multiple of $u$ and $x-P_u(x)$ is perpendicular to $u$. These two properties give $P_u(x) = \frac{x \cdot u}{|u|^2}\, u$.
The projection function is a linear transformation. Proof in lecture slides.
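A sketch of the projection formula (assuming NumPy; `project` is a hypothetical helper name):

```python
import numpy as np

def project(x: np.ndarray, u: np.ndarray) -> np.ndarray:
    """P_u(x) = ((x . u) / |u|^2) u, for nonzero u."""
    return (np.dot(x, u) / np.dot(u, u)) * u

x = np.array([3.0, 4.0])
u = np.array([1.0, 0.0])
p = project(x, u)
print(p)                                   # [3. 0.]: the component of x along u
print(np.isclose(np.dot(x - p, u), 0.0))   # True: x - P_u(x) is perpendicular to u
```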
Rotation about the origin
The linear transformation
$R_\theta: \mathbb{R}^2 \rightarrow \mathbb{R}^2$ describes the
anticlockwise rotation of a point through an angle $\theta$.
$R_\theta = \begin{bmatrix}
\cos \theta & -\sin \theta\\
\sin \theta & \cos \theta
\end{bmatrix}$.
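A sketch applying the rotation matrix (assuming NumPy):

```python
import numpy as np

def rotation(theta: float) -> np.ndarray:
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

R = rotation(np.pi / 2)           # anticlockwise quarter turn
print(R @ np.array([1.0, 0.0]))   # approximately [0, 1]: (1, 0) rotates onto the y-axis
```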
Linear transformations cont.
Every matrix defines a linear transformation.
Given an $n \times m$ matrix $M$, the function
$T: \mathbb{R}^m \rightarrow \mathbb{R}^n$ defined by $T(x) = Mx$ for
all $x \in \mathbb{R}^m$ is a linear transformation.
Eigenvalues and eigenvectors
An eigenvector of an $n \times n$ matrix $M$ is a nonzero vector $r \in \mathbb{R}^n$ whose direction does not change when multiplied by $M$. Equivalently, $Mr = \lambda r$ for some $\lambda \in \mathbb{R}$; $\lambda$ is the eigenvalue corresponding to the eigenvector $r$.
For example, the $2 \times 2$ matrix $\begin{bmatrix}5 & 3 \\ 3 & 5 \end{bmatrix}$ has eigenvectors $\begin{bmatrix}1\\1\end{bmatrix}$ and $\begin{bmatrix}1\\-1\end{bmatrix}$ with eigenvalues 8 and 2 respectively.
A number $\lambda \in \mathbb{R}$ is an eigenvalue of the matrix $M$ if
$|M-\lambda I| = 0$. This is called the characteristic equation of $M$.
It is a polynomial of degree $n$ in $\lambda$.
Proof: $\lambda$ is an eigenvalue of $A$ if and only if $Av = \lambda v$
for some nonzero $v$, i.e. $(A - \lambda I)v = 0$ for some nonzero $v$. If
$A-\lambda I$ were invertible, multiplying both sides by
$(A-\lambda I)^{-1}$ would give $v=0$, a contradiction. Hence
$A-\lambda I$ is not invertible, which is equivalent to
$|A-\lambda I| = 0$.
The eigenvectors and eigenvalues of $A = \begin{bmatrix} -5 & 3 \\ 6 & -2 \end{bmatrix}$ can be found using $|A-\lambda I| = 0$: $$ \begin{aligned} &|A-\lambda I| = 0\\ &\begin{vmatrix} -5 - \lambda & 3\\ 6 & -2-\lambda \end{vmatrix} = (-5-\lambda)(-2-\lambda)-18 = 0\\ &\lambda^2 + 7\lambda - 8 = 0\\ & \lambda = 1 \text{ or } -8 \text{ (eigenvalues)}\\ & (A- \lambda I)v = 0\\ & \text{When } \lambda = 1 \text{:}\\ & A - \lambda I = \begin{bmatrix} -6 & 3 \\ 6 & -3 \end{bmatrix}\\ &\begin{bmatrix} -6 & 3 \\ 6 & -3 \end{bmatrix} \begin{bmatrix}v_1 \\ v_2\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix}\\ &-6v_1 + 3v_2 = 0\\ &6v_1-3v_2 = 0\\ &v_2 = 2v_1\\ & \text{When } \lambda = -8 \text{:}\\ & \begin{bmatrix} 3 & 3 \\ 6 & 6\end{bmatrix} \begin{bmatrix}v_1 \\ v_2\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix}\\ & 3v_1 + 3v_2 = 0\\ & v_2 = -v_1\end{aligned} $$ Any non-zero vector $(v_1, 2v_1)$ is an eigenvector when $\lambda = 1$ and any non-zero vector $(v_1, -v_1)$ is an eigenvector when $\lambda = -8$.
The eigenvalues of a real matrix may be complex.
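The worked example above can be confirmed with NumPy's eigenvalue routine (a sketch; note that `np.linalg.eig` returns unit-length eigenvectors in no guaranteed order):

```python
import numpy as np

A = np.array([[-5.0, 3.0],
              [6.0, -2.0]])
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)    # 1 and -8
print(eigenvectors)   # columns proportional to (1, 2) and (1, -1) respectively
```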
Diagonalising a matrix
Let $A$ be an $n \times n$ matrix with $n$ linearly independent
eigenvectors $v_1, v_2, \dots, v_n$ and corresponding eigenvalues
$\lambda_1, \lambda_2, \dots, \lambda_n$.
Let $V$ be the (invertible) matrix whose columns are $v_1, v_2, \dots, v_n$.
Then $V^{-1}AV = D$ where
$D = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$.
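A sketch of the diagonalisation, reusing the matrix from the eigenvalue example (assuming NumPy):

```python
import numpy as np

A = np.array([[-5.0, 3.0],
              [6.0, -2.0]])
eigenvalues, V = np.linalg.eig(A)    # columns of V are the eigenvectors
D = np.linalg.inv(V) @ A @ V
print(np.allclose(D, np.diag(eigenvalues)))  # True: V^{-1} A V is diagonal
```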