Linear Algebra Theorems

Izak, June 2023

The six central theorems of linear algebra come from Gilbert Strang's Introduction to Linear Algebra, 5th Ed, and I have provided an example for each.

Note: some HTML was started with ChatGPT, but LLMs cannot do math or even consistent LaTeX, so much of the math was checked via Wolfram Alpha, and proofs were checked via Math Stack Exchange. The CSS requires Bootstrap, and the LaTeX requires MathJax, so this page is best viewed with an internet connection ♡. Finally, Prof. Strang uses "nullspace", where I use "kernel" — they're synonyms, but "kernel" is often used outside of linear algebra too.


Dimension Theorem

All bases for a vector space have the same number of vectors.

Mathematically: \( |B_1| = |B_2| \) for any two bases \( B_1 \) and \( B_2 \) of the vector space; that common count is the dimension.

Example:

This is sort of a boring example, but the available proofs give an idea of how nuanced the mechanisms underlying this theorem are. See some discussion here: Dimension Theorem discussion

Let's consider a vector space \(V = \mathbb{R}^2\) over the field \(F = \mathbb{R}\) (the set of real numbers). In this case, vectors in \(V\) are ordered pairs \((x, y)\) where \(x\) and \(y\) are real numbers.

Now, let's find two different bases for \(V\) and observe that they have the same number of vectors.

Basis 1:

We can choose the following two vectors as a basis for \(V\):

\(\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\)

\(\mathbf{v}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}\)

These vectors are linearly independent (meaning no non-trivial linear combination of them yields the zero vector) and span the entire vector space \(V\) — that is, \(\forall v \in V \ \exists \ \lambda_1 , \ \lambda_2 \in F \ | \ v= \lambda_1 v_1 + \lambda_2 v_2 \)

Basis 2:

Alternatively, we can choose the following two vectors as another basis for \(V\):

\(\mathbf{u}_1 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}\)

\(\mathbf{u}_2 = \begin{bmatrix} -1 \\ 3 \end{bmatrix}\)

Again, these vectors are linearly independent and span the entire vector space \(V\).

Both Basis 1 and Basis 2 consist of two vectors each. This example demonstrates that all bases (ok, at least two bases) for \(V\) have the same number of vectors, which in this case is 2. This property holds true for any vector space, indicating that the number of vectors in a basis is a fundamental characteristic of the vector space itself.
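If you want to check this with a few lines of code, here is a minimal Python sketch (my addition, assuming NumPy is installed): put each candidate basis into a matrix as columns and confirm that the matrix has full rank, so the two vectors really are linearly independent and span \(\mathbb{R}^2\).

```python
import numpy as np

# Candidate bases for R^2, one basis vector per column
basis_1 = np.array([[1, 0],
                    [0, 1]], dtype=float)   # v1, v2
basis_2 = np.array([[2, -1],
                    [1,  3]], dtype=float)   # u1, u2

for name, B in [("Basis 1", basis_1), ("Basis 2", basis_2)]:
    # Full rank (equivalently, a nonzero determinant) means the columns are
    # linearly independent and span R^2.
    rank = np.linalg.matrix_rank(B)
    print(name, "has", B.shape[1], "vectors and rank", rank)
    assert rank == B.shape[1] == 2
```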


Counting Theorem

Dimension of column space + dimension of kernel = number of columns.

Mathematically: \( \text{dim}(\text{col}(A)) + \text{dim}(\text{ker}(A)) = \text{cols}(A) \)

Example:

Consider the matrix:

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{bmatrix} \]

Let's calculate the dimension of the column space (step 1) and the dimension of the kernel (step 2) of \(A\), and verify the theorem.

Solution:

STEP 1: To find the column space of \(A\), we reduce \(A\) to echelon form:

\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{bmatrix} \xrightarrow{\text{Row operations}} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \\ \end{bmatrix} \]

Do row reduction: \[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{bmatrix} \]

Subtract a multiple of one row from another.
Subtract \(4 \times\) (row 1) from row 2:
\[ \begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \\ 7 & 8 & 9 \\ \end{bmatrix} \]

Subtract a multiple of one row from another.
Subtract \(7 \times\) (row 1) from row 3:
\[ \begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & -6 & -12 \\ \end{bmatrix} \]

Swap two rows.
Swap row 2 with row 3:
\[ \begin{bmatrix} 1 & 2 & 3 \\ 0 & -6 & -12 \\ 0 & -3 & -6 \\ \end{bmatrix} \]

Subtract a multiple of one row from another.
Subtract \(\frac{1}{2} \times\) (row 2) from row 3:
\[ \begin{bmatrix} 1 & 2 & 3 \\ 0 & -6 & -12 \\ 0 & 0 & 0 \\ \end{bmatrix} \]

Divide row 2 by a scalar.
Divide row 2 by -6:
\[ \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \\ \end{bmatrix} \]

Subtract a multiple of one row from another.
Subtract \(2 \times\) (row 2) from row 1:
\[ \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \\ \end{bmatrix} \]

Verify matrix is reduced.
This matrix is now in reduced row echelon form.
All nonzero rows are above rows of all zeros:
\[ \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \\ \end{bmatrix} \]

Verify pivots and their positions.
Each pivot is 1 and is strictly to the right of every pivot above it:
\[ \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \\ \end{bmatrix} \]

Checked with Wolfram ♡

The pivot columns are the first two columns, and they form a basis for the column space of \(A\). So, the dimension of the column space is \(2\).
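If you would rather let a library do the row reduction, here is a minimal sketch (my addition, assuming SymPy is installed) that reproduces \(\text{rref}(A)\) and the pivot columns from the steps above.

```python
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]])

# rref() returns the reduced row echelon form and the indices of the pivot columns
R, pivot_cols = A.rref()
print(R)           # Matrix([[1, 0, -1], [0, 1, 2], [0, 0, 0]])
print(pivot_cols)  # (0, 1) -> two pivot columns, so dim(col(A)) = 2
```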

STEP 2: Now, let's find the kernel of \(A\) by solving the homogeneous equation \(A\mathbf{x} = \mathbf{0}\):

\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \end{bmatrix} \]

The kernel of a matrix \(M\) is the set of solutions \(v\) to the homogeneous equation \(M \cdot v = 0\). The kernel of \(A\) is the same as the kernel of its reduced row echelon form, so we may work with \(\text{rref}(A)\) from step 1.
The kernel of matrix \(M = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}\) is the set of all vectors \(v = (x_1, x_2, x_3)\) such that \(M \cdot v = 0\):
\(\begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}\)

Identify free variables.
Free variables in the kernel \((x_1, x_2, x_3)\) correspond to the columns in \(\begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}\) which have no pivot.
Column 3 is the only column with no pivot, so we may take \(x_3\) to be the only free variable.

Perform matrix multiplication.
Multiply out the reduced matrix \(\begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}\) with the proposed solution vector \((x_1, x_2, x_3)\):
\(\begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_1 - x_3 \\ x_2 + 2x_3 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}\)

Convert to a system and solve in terms of the free variables.
Solve the equations \(x_1 - x_3 = 0\), \(x_2 + 2x_3 = 0\), and \(0 = 0\) for \(x_1\) and \(x_2\):
\(\{x_1 = x_3, \ x_2 = -2x_3\}\)

Replace the pivot variables with free variable expressions.
Rewrite \(v\) in terms of the free variable \(x_3\), and assign it an arbitrary real value of \(x\):
\(v = (x_1, x_2, x_3) = (x_3, -2x_3, x_3) = (x, -2x, x)\) for \(x \in \mathbb{R}\)

Rewrite the solution vector \(v = (x, -2x, x)\) in set notation:
Answer: \(\{(x, -2x, x) : x \in \mathbb{R}\}\)

Checked with Wolfram ♡

Thus, the nullity (the dimension of the kernel) is one, \( \text{dim}(\text{ker}(A)) = 1 \)

Now, by the theorem, the dimension of the column space plus the dimension of the kernel should be equal to the number of columns, \( \text{dim}(\text{col}(A)) + \text{dim}(\text{ker}(A)) = \text{cols}(A) \)

Here, \( \text{dim}(\text{col}(A)) = 2 \) and \( \text{dim}(\text{ker}(A)) = 1 \), so the theorem predicts \( \text{cols}(A) = 2 + 1 = 3 \). The number of columns in \(A\) is indeed \(3\).
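To close the loop in code, here is a short SymPy sketch (again my addition, not Prof. Strang's) that checks rank plus nullity against the number of columns.

```python
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]])

rank = A.rank()               # dim(col(A)) = 2
nullity = len(A.nullspace())  # dim(ker(A)) = 1; the basis vector is (1, -2, 1)
print(rank, nullity, A.cols)  # 2 1 3
assert rank + nullity == A.cols
```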


Rank Theorem

Dimension of column space = dimension of row space.

Mathematically: \( \text{dim}(\text{col}(A)) = \text{dim}(\text{row}(A)) \)

Example:

Consider the matrix:

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{bmatrix} \]

Let's calculate the dimensions of the column space and the row space of \(A\), and verify the theorem.

Solution:

First, consider the column space and then, second, consider the row space.

For both the column and row spaces, we reduce \(A\) to echelon form exactly as in step 1 of the example for the Counting Theorem:

\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{bmatrix} \xrightarrow{\text{Row operations}} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \\ \end{bmatrix} \]

For the column space: the pivot columns of this matrix \(A\) are the first column and the second column — where \(1\) is the only nonzero element of the pivot columns in row-reduced echelon form \(\text{rref}(A)\).

Thus, the first and second columns of the original matrix, \( \begin{bmatrix} 1 \\ 4 \\ 7 \end{bmatrix} \) and \( \begin{bmatrix} 2 \\ 5 \\ 8 \end{bmatrix} \), form a basis for the column space of \(A\). So, the dimension of the column space is \( \text{dim}(\text{col}(A)) = 2 \).

The row space comes from the non-zero rows of the row-reduced echelon matrix. The row space of \(A\) is spanned by \( \begin{bmatrix} 1 & 0 & -1 \\ \end{bmatrix} \) and \( \begin{bmatrix} 0 & 1 & 2 \\ \end{bmatrix} \), so they form a basis for the row space of \(A\), and the dimension of the row space is \( \text{dim}(\text{row}(A)) = 2 \).

We see \( \text{dim}(\text{col}(A)) = \text{dim}(\text{row}(A)) = 2 \).
Q.E.D.

That ends the example, but some rigorous (albeit a tad daunting) proofs are here: Rank Theorem Proofs
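And for the code-inclined, the Rank Theorem is a one-line comparison; a minimal sketch, assuming NumPy: \(\text{dim}(\text{col}(A))\) is the rank of \(A\), and \(\text{dim}(\text{row}(A))\) is the rank of \(A^T\).

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]], dtype=float)

col_dim = np.linalg.matrix_rank(A)    # dim(col(A))
row_dim = np.linalg.matrix_rank(A.T)  # dim(row(A)) = dim(col(A^T))
print(col_dim, row_dim)               # 2 2
assert col_dim == row_dim
```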


Fundamental Theorem

The row space and kernel of \( A \) are orthogonal complements in \( \mathbb{R}^n \).

Mathematically: \( \text{row}(A) \perp \text{ker}(A) \) in \( \mathbb{R}^n \)

Example:

Consider the matrix:

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{bmatrix} \]

Let's calculate the row space (step 1) and the kernel (step 2) of \(A\) and verify that they are orthogonal complements in \(\mathbb{R}^n\).

Solution:

This one requires a bit more thinking outside of the row reduction algorithm.

Recall from The Counting Theorem that the kernel has dimension equal to the number of columns minus the dimension of the column space, and by the Rank Theorem that is \(\text{cols}(A) - \text{dim}(\text{row}(A))\). It follows that the orthogonal complement of the kernel has dimension \(\text{cols}(A) - (\text{cols}(A) - \text{dim}(\text{row}(A)))=\text{dim}(\text{row}(A))\). Next, the row space sits inside the orthogonal complement of the kernel: \(Ax = 0\) says precisely that every row of \(A\) dotted with \(x\) is zero, so every row (and every combination of rows) is orthogonal to every kernel vector. So the row space is an \(r\)-dimensional subspace of the orthogonal complement of the kernel, which itself has dimension \(r\). The only \(r\)-dimensional subspace of an \(r\)-dimensional space is the whole space: if \( A \subseteq B \) and \( \text{dim}(A)=\text{dim}(B) \), then \( A=B \). So the row space is not merely a subspace of the orthogonal complement of the kernel; it is the entire orthogonal complement.

Also recall from The Counting Theorem that the kernel is \(\{(x, -2x, x) : x \in \mathbb{R}\}\).

Similarly, from the row reduction of the matrix, we know that the basis for the row space is \( \{ \begin{bmatrix} 1 & 0 & -1 \end{bmatrix}, \ \begin{bmatrix} 0 & 1 & 2 \end{bmatrix} \} \)

Now, orthogonal vectors have an inner product equal to zero \( x^T y = y^T x = 0\). Spaces are orthogonal when every vector in one is orthogonal to every vector in the other.

How do we check that the inner product is zero for all \( n \in \{(x, -2x, x) : x \in \mathbb{R}\} \) and all \( r \) in the row space, which is in some ways the more abstract of the two spaces?

For a vector space \(V\), a family in \(V\) consists of a set \(I\) together with a function \(e: I \rightarrow V\). A basis of \(V\) is a family \((I, e)\) in \(V\) such that for all \(x \in V\) there exists a unique finitely-supported function \(a: I \rightarrow \mathbb{R}\) satisfying \(x = \sum_{i \in I} a_i e_i \ \) as well as conditions for well ordering (See this definition's source for further context on well ordering and matrices).

And that is not even considering order, which is an essential part of elimination on matrices. All the same, the definition becomes more intuitive if \(x = \sum a e\) is simply taken to mean all the linear combinations of \( \begin{bmatrix} 1 & 0 & -1 \end{bmatrix} \) and \( \begin{bmatrix} 0 & 1 & 2 \end{bmatrix}\).

Even more intuitively, the kernel can be seen as a line (a one-dimensional subspace of \( \mathbb{R}^3 \)) and the row space as a plane (a two-dimensional subspace of \( \mathbb{R}^3 \)).

Thus, the normal vector to \(\text{row}(A)\) is \( \begin{bmatrix} 1 & 0 & -1 \end{bmatrix} \times \begin{bmatrix} 0 & 1 & 2 \end{bmatrix} \)

Compute the following cross product:

\((1, 0, -1) \times (0, 1, 2)\)

Create a matrix out of the vectors \((1, 0, -1)\) and \((0, 1, 2)\) along with the unit vectors \(\hat{i}\), \(\hat{j}\), and \(\hat{k}\).

Construct a matrix where the first row contains unit vectors \(\hat{i}\), \(\hat{j}\), and \(\hat{k}\); and the second and third rows are made of vectors \((1, 0, -1)\) and \((0, 1, 2)\):

\[ \begin{bmatrix} \hat{i} & \hat{j} & \hat{k} \\ 1 & 0 & -1 \\ 0 & 1 & 2 \\ \end{bmatrix} \]

The cross product of the vectors \((1, 0, -1)\) and \((0, 1, 2)\) is the determinant of the matrix:

\[ \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ 1 & 0 & -1 \\ 0 & 1 & 2 \\ \end{vmatrix} \]

Take the determinant of this matrix:

\(\begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ 1 & 0 & -1 \\ 0 & 1 & 2 \end{vmatrix}\)

Find an optimal row or column to use for Laplace's expansion.

Expand with respect to row 1:

The determinant of the matrix \(\begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix}\) is given by \(\sum_{j=1}^{3}(-1)^{1+j}a_{1,j}M_{1,j}\) where \(M_{i,j}\) is the determinant of the matrix obtained by removing row \(i\) and column \(j\).

The determinant of the matrix \(\begin{bmatrix} \hat{i} & \hat{j} & \hat{k} \\ 1 & 0 & -1 \\ 0 & 1 & 2 \end{bmatrix}\) is given by \(\hat{i}\begin{vmatrix} 0 & -1 \\ 1 & 2 \end{vmatrix} + (-\hat{j})\begin{vmatrix} 1 & -1 \\ 0 & 2 \end{vmatrix} + \hat{k}\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix}\):

\(=\hat{i}\begin{vmatrix} 0 & -1 \\ 1 & 2 \end{vmatrix} + (-\hat{j})\begin{vmatrix} 1 & -1 \\ 0 & 2 \end{vmatrix} + \hat{k}\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix}\)

The determinant of the matrix \(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\) is given by \(ad - bc\).

\(\hat{i}\begin{vmatrix} 0 & -1 \\ 1 & 2 \end{vmatrix} = \hat{i}\)

\(=(-\hat{j})\begin{vmatrix} 1 & -1 \\ 0 & 2 \end{vmatrix} + \hat{k}\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix}\)

Compute the determinant of the matrix \(\begin{bmatrix} 1 & -1 \\ 0 & 2 \end{bmatrix}\) and multiply the result by \(-\hat{j}\).

\((-\hat{j})\begin{vmatrix} 1 & -1 \\ 0 & 2 \end{vmatrix} = -2\hat{j}\)

\(=\hat{i} - 2\hat{j} + \hat{k}\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix}\)

Compute the determinant of the matrix \(\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\) and multiply the result by \(\hat{k}\).

\(\hat{k}\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = \hat{k}\)

\(=\hat{i} - 2\hat{j} + \hat{k}\)

Collect the coefficients of \(\hat{i}\), \(\hat{j}\), and \(\hat{k}\) into a vector ordered as \((\hat{i}, \hat{j}, \hat{k})\).

\(\hat{i} - 2\hat{j} + \hat{k} = (1, -2, 1)\)

Checked with Wolfram ♡
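NumPy agrees with the Laplace expansion; a tiny sketch (my addition) of the same cross product:

```python
import numpy as np

r1 = np.array([1, 0, -1])
r2 = np.array([0, 1, 2])

# The cross product of the two row-space basis vectors is normal to row(A)
print(np.cross(r1, r2))   # [ 1 -2  1]
```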

We see the normal of \( \text{row}(A)\) is \( \hat{n} = \begin{bmatrix} 1 & -2 & 1 \end{bmatrix} \). Now, it is simple to see that \( \text{ker}(A) \) has direction vector \( \hat{s} = \begin{bmatrix} 1 & -2 & 1 \end{bmatrix} \) (i.e. factor out \(x\) from \(\{(x, -2x, x)\}\)).

Now, proving \( \text{row}(A) \perp \text{ker}(A) \) in \( \mathbb{R}^n \) is proving that the normal vector \( \hat{n} \) of \( \text{row}(A) \) is parallel to the direction vector \( \hat{s} \) of \( \text{ker}(A) \).

This parallelism, again, is true if \( \hat{n} \times \hat{s} = 0 \). But we do not need to go that far, because our results have yielded the same vector.

That is, \( ( \hat{n} = \hat{s} ) \Rightarrow ( \hat{n} \times \hat{s} = 0 ) \), and \( ( \hat{n} \times \hat{s} = 0 ) \Rightarrow (\hat{n} \parallel \hat{s}) \), so finally, \( \hat{n} \parallel \hat{s} \Rightarrow (\text{row}(A) \perp \text{ker}(A)) \).
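The orthogonality can also be checked directly, without any cross products: every row-space basis vector dotted with the kernel direction is zero. A minimal sketch, assuming NumPy:

```python
import numpy as np

row_basis = [np.array([1, 0, -1]), np.array([0, 1, 2])]
kernel_dir = np.array([1, -2, 1])   # (x, -2x, x) with x = 1

# Every vector in row(A) is a combination of the basis vectors, so checking
# the basis vectors against the kernel direction is enough.
for r in row_basis:
    print(r, ".", kernel_dir, "=", np.dot(r, kernel_dir))   # 0 both times
    assert np.dot(r, kernel_dir) == 0
```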


Singular Value Decomposition

This is where the computations become quite intense.

There are orthonormal bases (\( v \)'s and \( u \)'s) for the row and column spaces so that \( Av_i = \sigma_iu_i \).

Mathematically: \( A = USV^T \) where \( U \) and \( V \) are orthogonal matrices (their columns are orthonormal), and \( S \) is a diagonal matrix of singular values.

A concise explanation (adapted from an MIT bio engineering tutorial):

Singular value decomposition takes a rectangular matrix (defined as \( A \), where \( A \) is an \( n \times p \) matrix) with \( n \) rows and \( p \) columns.

\[ A = U \cdot S\cdot V^T \]

Where:

\[ U^TU = I_{n \times n} \] \[ V^TV = I_{p \times p} \] (i.e., \( U \) and \( V \) have orthonormal columns)

Where the columns of \( U \) are the left singular vectors; \(S\) (the same dimensions as \( A \)) has singular values and is diagonal; and \( V^T \) has rows that are the right singular vectors.

Calculating the SVD consists of finding the eigenvalues and eigenvectors of \( A^TA \) and \( AA^T \). The eigenvectors of \( AA^T \) make up the columns of \( U \), and the eigenvectors of \( A^TA \) make up the columns of \( V \). Also, the singular values in \(S\) are the square roots of eigenvalues from \( AA^T \) or \( A^TA \). The singular values are the diagonal entries of the \(S\) matrix and are arranged in descending order. The singular values are always real, non-negative numbers. If the matrix \( A \) is a real matrix, then \( U \) and \( V \) are also real.
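Here is a small NumPy sketch (my addition; the matrix below is arbitrary, purely for illustration) of exactly that relationship: the columns of \(U\) are eigenvectors of \(AA^T\), the columns of \(V\) are eigenvectors of \(A^TA\), and the eigenvalues are the squared singular values.

```python
import numpy as np

# An arbitrary 2x3 real matrix, just for illustration
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 1.0]])

U, s, Vt = np.linalg.svd(A)

# Columns of U are eigenvectors of A A^T with eigenvalues s**2
print(np.allclose(A @ A.T @ U, U @ np.diag(s**2)))        # True

# Columns of V (rows of Vt) are eigenvectors of A^T A; the extra eigenvalue is 0
eigs = np.append(s**2, 0.0)
print(np.allclose(A.T @ A @ Vt.T, Vt.T @ np.diag(eigs)))  # True
```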

Example:

While I would love to consider this matrix:

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{bmatrix} \]

This yields a horrifying SVD that requires the \tiny command from LaTeX and the overflow-x: auto command for CSS:

Find \(M = U \Sigma V^{\dagger}\) where... \[M = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ \end{bmatrix}\]

Apologies, I had to shrink this to fit it on the page: \( U =\tiny{ \begin{bmatrix} \frac{3 - \frac{2(-1223 - 13\sqrt{8881})}{3(477 + 5\sqrt{8881})} - \frac{(-1015 - 11\sqrt{8881})}{3(477 + 5\sqrt{8881})}}{\sqrt{(9 - \frac{8(-1223 - 13\sqrt{8881})}{3(477 + 5\sqrt{8881})} - \frac{7(-1015 - 11\sqrt{8881})}{3(477 + 5\sqrt{8881})})^2 + (6 - \frac{5(-1223 - 13\sqrt{8881})}{3(477 + 5\sqrt{8881})} - \frac{4(-1015 - 11\sqrt{8881})}{3(477 + 5\sqrt{8881})})^2 + (3 - \frac{2(-1223 - 13\sqrt{8881})}{3(477 + 5\sqrt{8881})} - \frac{(-1015 - 11\sqrt{8881})}{3(477 + 5\sqrt{8881})})^2}} & \frac{3 - \frac{2(1223 - 13\sqrt{8881})}{3(5\sqrt{8881} - 477)} - \frac{(1015 - 11\sqrt{8881})}{3(5\sqrt{8881} - 477)}}{\sqrt{(9 - \frac{8(1223 - 13\sqrt{8881})}{3(5\sqrt{8881} - 477)} - \frac{7(1015 - 11\sqrt{8881})}{3(5\sqrt{8881} - 477)})^2 + (6 - \frac{5(1223 - 13\sqrt{8881})}{3(5\sqrt{8881} - 477)} - \frac{4(1015 - 11\sqrt{8881})}{3(5\sqrt{8881} - 477)})^2 + (3 - \frac{2(1223 - 13\sqrt{8881})}{3(5\sqrt{8881} - 477)} - \frac{(1015 - 11\sqrt{8881})}{3(5\sqrt{8881} - 477)})^2}} & \frac{1}{\sqrt{6}} \\ \\ \frac{6 - \frac{5(-1223 - 13\sqrt{8881})}{3(477 + 5\sqrt{8881})} - \frac{4(-1015 - 11\sqrt{8881})}{3(477 + 5\sqrt{8881})}}{\sqrt{(9 - \frac{8(-1223 - 13\sqrt{8881})}{3(477 + 5\sqrt{8881})} - \frac{7(-1015 - 11\sqrt{8881})}{3(477 + 5\sqrt{8881})})^2 + (6 - \frac{5(-1223 - 13\sqrt{8881})}{3(477 + 5\sqrt{8881})} - \frac{4(-1015 - 11\sqrt{8881})}{3(477 + 5\sqrt{8881})})^2 + (3 - \frac{2(-1223 - 13\sqrt{8881})}{3(477 + 5\sqrt{8881})} - \frac{(-1015 - 11\sqrt{8881})}{3(477 + 5\sqrt{8881})})^2}} & \frac{6 - \frac{5(1223 - 13\sqrt{8881})}{3(5\sqrt{8881} - 477)} - \frac{4(1015 - 11\sqrt{8881})}{3(5\sqrt{8881} - 477)}}{\sqrt{(9 - \frac{8(1223 - 13\sqrt{8881})}{3(5\sqrt{8881} - 477)} - \frac{7(1015 - 11\sqrt{8881})}{3(5\sqrt{8881} - 477)})^2 + (6 - \frac{5(1223 - 13\sqrt{8881})}{3(5\sqrt{8881} - 477)} - \frac{4(1015 - 11\sqrt{8881})}{3(5\sqrt{8881} - 477)})^2 + (3 - \frac{2(1223 - 13\sqrt{8881})}{3(5\sqrt{8881} - 477)} - \frac{(1015 - 11\sqrt{8881})}{3(5\sqrt{8881} - 477)})^2}} & \frac{1}{\sqrt{6}} \\ \\ \frac{9 - \frac{8(-1223 - 13\sqrt{8881})}{3(477 + 5\sqrt{8881})} - \frac{7(-1015 - 11\sqrt{8881})}{3(477 + 5\sqrt{8881})}}{\sqrt{(9 - \frac{8(-1223 - 13\sqrt{8881})}{3(477 + 5\sqrt{8881})} - \frac{7(-1015 - 11\sqrt{8881})}{3(477 + 5\sqrt{8881})})^2 + (6 - \frac{5(-1223 - 13\sqrt{8881})}{3(477 + 5\sqrt{8881})} - \frac{4(-1015 - 11\sqrt{8881})}{3(477 + 5\sqrt{8881})})^2 + (3 - \frac{2(-1223 - 13\sqrt{8881})}{3(477 + 5\sqrt{8881})} - \frac{(-1015 - 11\sqrt{8881})}{3(477 + 5\sqrt{8881})})^2}} & \frac{9 - \frac{8(1223 - 13\sqrt{8881})}{3(5\sqrt{8881} - 477)} - \frac{7(1015 - 11\sqrt{8881})}{3(5\sqrt{8881} - 477)}}{\sqrt{(9 - \frac{8(1223 - 13\sqrt{8881})}{3(5\sqrt{8881} - 477)} - \frac{7(1015 - 11\sqrt{8881})}{3(5\sqrt{8881} - 477)})^2 + (6 - \frac{5(1223 - 13\sqrt{8881})}{3(5\sqrt{8881} - 477)} - \frac{4(1015 - 11\sqrt{8881})}{3(5\sqrt{8881} - 477)})^2 + (3 - \frac{2(1223 - 13\sqrt{8881})}{3(5\sqrt{8881} - 477)} - \frac{(1015 - 11\sqrt{8881})}{3(5\sqrt{8881} - 477)})^2}} & \frac{1}{\sqrt{6}} \\ \end{bmatrix}} \)

\(Σ = \begin{bmatrix} \sqrt{\frac{3}{2}(95 + \sqrt{8881})} & 0 & 0 \\ 0 & \sqrt{\frac{3}{2}(95 - \sqrt{8881})} & 0 \\ 0 & 0 & 0 \\ \end{bmatrix} \) \(V = \begin{bmatrix} -\frac{-1015 - 11\sqrt{8881}}{3(477 + 5\sqrt{8881})\sqrt{1 + \frac{(-1223 - 13\sqrt{8881})^2}{9(477 + 5\sqrt{8881})^2} + \frac{(-1015 - 11\sqrt{8881})^2}{9(477 + 5\sqrt{8881})^2}}} & -\frac{1015 - 11\sqrt{8881}}{3(5\sqrt{8881} - 477)\sqrt{1 + \frac{(1223 - 13\sqrt{8881})^2}{9(5\sqrt{8881} - 477)^2} + \frac{(1015 - 11\sqrt{8881})^2}{9(5\sqrt{8881} - 477)^2}}} & \frac{1}{\sqrt{6}} \\ -\frac{-1223 - 13\sqrt{8881}}{3(477 + 5\sqrt{8881})\sqrt{1 + \frac{(-1223 - 13\sqrt{8881})^2}{9(477 + 5\sqrt{8881})^2} + \frac{(-1015 - 11\sqrt{8881})^2}{9(477 + 5\sqrt{8881})^2}}} & -\frac{1223 - 13\sqrt{8881}}{3(5\sqrt{8881} - 477)\sqrt{1 + \frac{(1223 - 13\sqrt{8881})^2}{9(5\sqrt{8881} - 477)^2} + \frac{(1015 - 11\sqrt{8881})^2}{9(5\sqrt{8881} - 477)^2}}} & -\sqrt{\frac{2}{3}} \\ \frac{1}{\sqrt{1 + \frac{(-1223 - 13\sqrt{8881})^2}{9(477 + 5\sqrt{8881})^2} + \frac{(-1015 - 11\sqrt{8881})^2}{9(477 + 5\sqrt{8881})^2}}} & \frac{1}{\sqrt{1 + \frac{(1223 - 13\sqrt{8881})^2}{9(5\sqrt{8881} - 477)^2} + \frac{(1015 - 11\sqrt{8881})^2}{9(5\sqrt{8881} - 477)^2}}} & 0 \\ \end{bmatrix}^†\)

Note: † denotes the conjugate transpose

I also found this one with Wolfram ♡
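For contrast, here is the same decomposition done numerically; a minimal sketch, assuming NumPy. The decimals are far tamer than the exact radicals above, and the third singular value comes out as numerical zero since \(A\) has rank 2.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]], dtype=float)

U, s, Vt = np.linalg.svd(A)
print(s)   # two positive singular values, then one that is numerically zero

# Rebuild A from the three factors to confirm the decomposition
print(np.allclose(U @ np.diag(s) @ Vt, A))   # True
```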

INSTEAD, below I transcribed and explained a better example from Prof. Marshall Hampton at the University of Minnesota Duluth (I merely transcribed & explained Prof. Hampton's idea — no plagiarism intended)!

Find the SVD \(A = U S V^{T}\), where \[A=\begin{bmatrix}3 & 2 & 2 \\ 2 & 3 & -2\end{bmatrix}\]

First, we compute the singular values \(\sigma_{i}\) by finding the eigenvalues of \(A A^{T}\).

\[ A A^{T}=\begin{bmatrix}17 & 8 \\ 8 & 17\end{bmatrix} \]

Multiply the following matrices:

\[ \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} \]

Determine the dimension of the product. The dimensions of the first matrix are \(2 \times 3\) and the dimensions of the second matrix are \(3 \times 2\). This means the dimensions of the product are \(2 \times 2\):

\[ \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} = \begin{bmatrix} \_ & \_ \\ \_ & \_ \end{bmatrix} \]

Find the entry in the 1\(^{\text{st}}\) row and 1\(^{\text{st}}\) column of the product matrix. First look at the 1\(^{\text{st}}\) row of the first matrix and the 1\(^{\text{st}}\) column of the second matrix.

Highlight the 1\(^{\text{st}}\) row and the 1\(^{\text{st}}\) column:

\[ \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} = \begin{bmatrix} \_ & \_ \\ \_ & \_ \end{bmatrix} \]

Multiply corresponding components of the highlighted row and highlighted column, then add.

Multiply corresponding components and add: \(3 \cdot 3 + 2 \cdot 2 + 2 \cdot 2 = 17\).

Place this number into the 1\(^{\text{st}}\) row and 1\(^{\text{st}}\) column of the product:

\[ \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} = \begin{bmatrix} 17 & \_ \\ \_ & \_ \end{bmatrix} \]

Find the entry in the 1\(^{\text{st}}\) row and 2\(^{\text{nd}}\) column of the product matrix. First look at the 1\(^{\text{st}}\) row of the first matrix and the 2\(^{\text{nd}}\) column of the second matrix.

Highlight the 1\(^{\text{st}}\) row and the 2\(^{\text{nd}}\) column:

\[ \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} = \begin{bmatrix} 17 & \_ \\ \_ & \_ \end{bmatrix} \]

Multiply corresponding components of the highlighted row and highlighted column, then add.

Multiply corresponding components and add: \(3 \cdot 2 + 2 \cdot 3 + 2 \cdot (-2) = 8\).

Place this number into the 1\(^{\text{st}}\) row and 2\(^{\text{nd}}\) column of the product:

\[ \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} = \begin{bmatrix} 17 & 8 \\ \_ & \_ \end{bmatrix} \]

Find the entry in the 2\(^{\text{nd}}\) row and 1\(^{\text{st}}\) column of the product matrix. First look at the 2\(^{\text{nd}}\) row of the first matrix and the 1\(^{\text{st}}\) column of the second matrix.

Highlight the 2\(^{\text{nd}}\) row and the 1\(^{\text{st}}\) column:

\[ \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} = \begin{bmatrix} 17 & 8 \\ \_ & \_ \end{bmatrix} \]

Multiply corresponding components of the highlighted row and highlighted column, then add.

Multiply corresponding components and add: \(2 \cdot 3 + 3 \cdot 2 + (-2) \cdot 2 = 8\).

Place this number into the 2\(^{\text{nd}}\) row and 1\(^{\text{st}}\) column of the product:

\[ \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} = \begin{bmatrix} 17 & 8 \\ 8 & \_ \end{bmatrix} \]

Find the entry in the 2\(^{\text{nd}}\) row and 2\(^{\text{nd}}\) column of the product matrix. First look at the 2\(^{\text{nd}}\) row of the first matrix and the 2\(^{\text{nd}}\) column of the second matrix.

Highlight the 2\(^{\text{nd}}\) row and the 2\(^{\text{nd}}\) column:

\[ \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} = \begin{bmatrix} 17 & 8 \\ 8 & \_ \end{bmatrix} \]

Multiply corresponding components of the highlighted row and highlighted column, then add.

Multiply corresponding components and add: \(2 \cdot 2 + 3 \cdot 3 + (-2) \cdot (-2) = 17\).

Place this number into the 2\(^{\text{nd}}\) row and 2\(^{\text{nd}}\) column of the product:

Answer: \[ \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} = \begin{bmatrix} 17 & 8 \\ 8 & 17 \end{bmatrix} \]

Again, ♡ lovingly ♡ transcribed from Wolfram ♡

The characteristic polynomial is \(\det(A A^{T}-\lambda I)=\lambda^{2}-34 \lambda+225= \) \( (\lambda-25)(\lambda-9)\), so the singular values are \[\sigma_{1}=\sqrt{25}=5\] \[\sigma_{2}=\sqrt{9}=3\] These singular values are important for finding \(U\) and \(V^T\), but let's pause to think about that last step.

For this six theorems page, I do not want to get too much into determinants (i.e. \(\det(A)\)). Besides, the cross product we did for the Fundamental Theorem already demonstrated the \(2\times2\) determinant. Recall, the determinant of the matrix \(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\) is given by \(ad - bc\).

The beauty and horror of determinants aside, quickly recall the identity matrix, which is just like the number \(1\) but for matrices.

The general identity matrix, denoted as \(I_n\), is a square matrix of size \(n \times n\) with ones on the main diagonal (from the top-left to the bottom-right) and zeros elsewhere.

The general identity matrix \(I_n\) can be represented as:

\[ I_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \]

In this matrix, the element at the \(i^\text{th}\) row and \(j^\text{th}\) column is denoted as \((I_n)_{ij}\). It has the value:

\[ (I_n)_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \]

Now, one last thing before getting back to Prof. Marshall Hampton: recall that matrix multiplication is not commutative. Generally, \(AB \ne BA \).

Ok, here's another small explanation, but this isn't a mere note. It is important! The original from Prof. Marshall Hampton also does not mention that \(S = \text{diag}_{n\times p} (\sigma) \), meaning the singular value matrix is just a diagonal matrix whose shape \(n\times p\) matches the row and column counts of the original matrix.

Thus, \( S = \begin{bmatrix}\sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0\end{bmatrix} = \begin{bmatrix}5 & 0 & 0 \\ 0 & 3 & 0\end{bmatrix} \)
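The same eigenvalue-to-singular-value step in NumPy (my addition, not in Prof. Hampton's notes): \(AA^T\) is symmetric, so NumPy's symmetric eigenvalue routine applies, and the square roots of its eigenvalues give \(\sigma_1 = 5\) and \(\sigma_2 = 3\).

```python
import numpy as np

A = np.array([[3, 2, 2],
              [2, 3, -2]], dtype=float)

AAt = A @ A.T                            # [[17, 8], [8, 17]]
eigenvalues = np.linalg.eigvalsh(AAt)    # ascending order: [ 9. 25.]
print(np.sqrt(eigenvalues)[::-1])        # singular values, descending: [5. 3.]
```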

Now we find the right singular vectors (the columns of \(V\)) by finding an orthonormal set of eigenvectors of \(A^{T} A\). It is also possible to proceed by finding the left singular vectors (columns of \(U\)) instead. The eigenvalues of \(A^{T} A\) are 25, 9, and 0, and since \(A^{T} A\) is symmetric, we know that the eigenvectors will be orthogonal.

Multiply the following matrices: \[ \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} \]

Determine the dimension of the product. The dimensions of the first matrix are \(3 \times 2\) and the dimensions of the second matrix are \(2 \times 3\). This means the dimensions of the product are \(3 \times 3\): \[ \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} = \begin{bmatrix} \_ & \_ & \_ \\ \_ & \_ & \_ \\ \_ & \_ & \_ \end{bmatrix} \]

Find the entry in the \(1^\text{st}\) row and \(1^\text{st}\) column of the product matrix. First look at the \(1^\text{st}\) row of the first matrix and the \(1^\text{st}\) column of the second matrix.

Highlight the \(1^\text{st}\) row and the \(1^\text{st}\) column:

\[ \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} = \begin{bmatrix} \_ & \_ & \_ \\ \_ & \_ & \_ \\ \_ & \_ & \_ \end{bmatrix} \]

Multiply corresponding components of the highlighted row and highlighted column, then add.

Multiply corresponding components and add: \(3 \cdot 3 + 2 \cdot 2 = 13\).

Place this number into the \(1^\text{st}\) row and \(1^\text{st}\) column of the product:

\[ \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} = \begin{bmatrix} 13 & \_ & \_ \\ \_ & \_ & \_ \\ \_ & \_ & \_ \end{bmatrix} \]

Repeat the same process to find the remaining entries of the product matrix:

\[ \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} = \begin{bmatrix} 13 & 12 & 2 \\ 12 & 13 & -2 \\ 2 & -2 & 8 \end{bmatrix} \]

Therefore, the product of the matrices is:

\[A^{T} A = \begin{bmatrix} 3 & 2 \\ 2 & 3 \\ 2 & -2 \end{bmatrix} \cdot \begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & -2 \end{bmatrix} = \begin{bmatrix} 13 & 12 & 2 \\ 12 & 13 & -2 \\ 2 & -2 & 8 \end{bmatrix} \]

Yes, Wolfram :)

For \(\lambda=25\), we have

\[ A^{T} A-25 I\] \[=\begin{bmatrix}-12 & 12 & 2 \\ 12 & -12 & -2 \\ 2 & -2 & -17\end{bmatrix} \]

This row-reduces to the matrix \(\begin{bmatrix}1 & -1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0\end{bmatrix}\), giving the unit-length vector in the kernel of that matrix, \(v_{1}=\begin{bmatrix}\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0\end{bmatrix}\)

Convert \(A^{T} A-25 I\) to reduced row echelon form:

\[ \begin{bmatrix} -12 & 12 & 2 \\ 12 & -12 & -2 \\ 2 & -2 & -17 \end{bmatrix} \]

Add one row to another:

\[ \begin{bmatrix} -12 & 12 & 2 \\ 0 & 0 & 0 \\ 2 & -2 & -17 \end{bmatrix} \]

Add a multiple of one row to another:

\[ \begin{bmatrix} -12 & 12 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & -\frac{50}{3} \end{bmatrix} \]

Swap two rows:

\[ \begin{bmatrix} -12 & 12 & 2 \\ 0 & 0 & -\frac{50}{3} \\ 0 & 0 & 0 \end{bmatrix} \]

Multiply row 2 by a scalar:

\[ \begin{bmatrix} -12 & 12 & 2 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \]

Subtract a multiple of one row from another:

\[ \begin{bmatrix} -12 & 12 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \]

Divide row 1 by a scalar:

\[ \begin{bmatrix} 1 & -1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \]

Verify matrix is reduced:

This matrix is now in reduced row echelon form. All nonzero rows are above rows of all zeros:

\[ \begin{bmatrix} 1 & -1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \]

Verify pivots and their positions:

Each pivot is 1 and is strictly to the right of every pivot above it:

\[ \begin{bmatrix} 1 & -1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \]

Verify all non-pivot elements in pivot columns are zeros:

Each pivot is the only nonzero entry in its column:

Answer:

\[ \begin{bmatrix} 1 & -1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \]

Yes, I transcribed from Wolfram ♡ ♡

For \(\lambda=9\), we have \(A^{T} A-9 I=\begin{bmatrix}4 & 12 & 2 \\ 12 & 4 & -2 \\ 2 & -2 & -1\end{bmatrix}\) which row-reduces to \(\begin{bmatrix}1 & 0 & -\frac{1}{4} \\ 0 & 1 & \frac{1}{4} \\ 0 & 0 & 0\end{bmatrix}\).

A unit-length vector in the kernel is \(v_{2}=\begin{bmatrix}\frac{1}{\sqrt{18}} \\ -\frac{1}{\sqrt{18}} \\ \frac{4}{\sqrt{18}}\end{bmatrix}\).

Find the kernel of the matrix M:

\( M = \begin{bmatrix} 1 & 0 & -\frac{1}{4} \\ 0 & 1 & \frac{1}{4} \\ 0 & 0 & 0 \end{bmatrix} \)

The kernel of a matrix \( M \) is the set of solutions \( v \) to the homogeneous equation \( M \cdot v = 0 \).

The kernel of matrix \( M = \begin{bmatrix} 1 & 0 & -\frac{1}{4} \\ 0 & 1 & \frac{1}{4} \\ 0 & 0 & 0 \end{bmatrix} \) is the set of all vectors \( v = (x_1, x_2, x_3) \) such that \( M \cdot v = 0 \):

\( \begin{bmatrix} 1 & 0 & -\frac{1}{4} \\ 0 & 1 & \frac{1}{4} \\ 0 & 0 & 0 \end{bmatrix} \cdot (x_1, x_2, x_3) = (0, 0, 0) \)

Identify free variables. Free variables in the kernel \((x_1, x_2, x_3)\) correspond to the columns in \( \begin{bmatrix} 1 & 0 & -\frac{1}{4} \\ 0 & 1 & \frac{1}{4} \\ 0 & 0 & 0 \end{bmatrix} \) which have no pivot.

Column 3 is the only column with no pivot, so we may take \( x_3 \) to be the only free variable.

Perform matrix multiplication. Multiply out the reduced matrix \( \begin{bmatrix} 1 & 0 & -\frac{1}{4} \\ 0 & 1 & \frac{1}{4} \\ 0 & 0 & 0 \end{bmatrix} \) with the proposed solution vector \( (x_1, x_2, x_3) \):

\( \begin{bmatrix} 1 & 0 & -\frac{1}{4} \\ 0 & 1 & \frac{1}{4} \\ 0 & 0 & 0 \end{bmatrix} \cdot (x_1, x_2, x_3) = (x_1 - \frac{x_3}{4}, x_2 + \frac{x_3}{4}, 0) = (0, 0, 0) \)

Convert to a system and solve in terms of the free variables. Solve the equations \( \begin{cases} x_1 - \frac{x_3}{4} = 0 \\ x_2 + \frac{x_3}{4} = 0 \\ 0 = 0 \end{cases} \) for \( x_1 \) and \( x_2 \):

\( \begin{cases} x_1 = \frac{x_3}{4} \\ x_2 = -\frac{x_3}{4} \end{cases} \)

Replace the pivot variables with free variable expressions. Rewrite \( v \) in terms of the free variable \( x_3 \), and assign it an arbitrary real value of \( x \):

\( v = \left(\frac{x_3}{4}, -\frac{x_3}{4}, x_3\right) = \left(\frac{x}{4}, -\frac{x}{4}, x\right) \) for \( x \in \mathbb{R} \)

Rewrite the solution vector without using fractions. Since \( x \) is taken from \( \mathbb{R} \), we can replace it with \( 4x \):

\( \left(\frac{x}{4}, -\frac{x}{4}, x\right) \rightarrow \left(\frac{4x}{4}, -\frac{1}{4}(4x), 4x\right) = (x, -x, 4x) \) for \( x \in \mathbb{R} \)

Convert to set-builder notation. Rewrite the solution vector \( v = (x, -x, 4x) \) in set notation:

Answer: \( \left\{ (x, -x, 4x) : x \in \mathbb{R} \right\} \)

Yes, yes... ♡ ♡ Wolfram ♡

For the last eigenvector, we could (option 1) compute the kernel of \(A^{T} A\) or (option 2) find a unit vector perpendicular to \(v_{1}\) and \(v_{2}\).

Prof. Marshall Hampton only uses option 2, which requires more reasoning and abstraction but is far less computational. Still, I like option 1 too.

Option 1: \( \quad \text{ker}(A^{T}A)\)

Let \(M\) denote \(A^{T}A = \begin{bmatrix} 13 & 12 & 2 \\ 12 & 13 & -2 \\ 2 & -2 & 8 \end{bmatrix}\)

\( M = \begin{bmatrix} 13 & 12 & 2 \\ 12 & 13 & -2 \\ 2 & -2 & 8 \end{bmatrix} \)

The kernel of a matrix \( M \) is the set of solutions \( v \) to the homogeneous equation \( M \cdot v = 0 \).

The kernel of matrix \( M = \begin{bmatrix} 13 & 12 & 2 \\ 12 & 13 & -2 \\ 2 & -2 & 8 \end{bmatrix} \) is the set of all vectors \( v = (x_1, x_2, x_3) \) such that \( M \cdot v = 0 \):

\( \begin{bmatrix} 13 & 12 & 2 \\ 12 & 13 & -2 \\ 2 & -2 & 8 \end{bmatrix} \cdot (x_1, x_2, x_3) = (0, 0, 0) \)

The kernel of a matrix is equal to the kernel of the row echelon form of the matrix.

Reduce the matrix \( \begin{bmatrix} 13 & 12 & 2 \\ 12 & 13 & -2 \\ 2 & -2 & 8 \end{bmatrix} \) to row echelon form:

\( \begin{bmatrix} 13 & 12 & 2 \\ 12 & 13 & -2 \\ 2 & -2 & 8 \end{bmatrix} \)

Subtract a multiple of one row from another. Subtract \( \frac{12}{13} \times \) (row 1) from row 2:

\( \begin{bmatrix} 13 & 12 & 2 \\ 0 & \frac{25}{13} & -\frac{50}{13} \\ 2 & -2 & 8 \end{bmatrix} \)

Subtract a multiple of one row from another. Subtract \( \frac{2}{13} \times \) (row 1) from row 3:

\( \begin{bmatrix} 13 & 12 & 2 \\ 0 & \frac{25}{13} & -\frac{50}{13} \\ 0 & -\frac{50}{13} & \frac{100}{13} \end{bmatrix} \)

Swap two rows. Swap row 2 with row 3:

\( \begin{bmatrix} 13 & 12 & 2 \\ 0 & -\frac{50}{13} & \frac{100}{13} \\ 0 & \frac{25}{13} & -\frac{50}{13} \end{bmatrix} \)

Add a multiple of one row to another. Add \( \frac{1}{2} \times \) (row 2) to row 3:

\( \begin{bmatrix} 13 & 12 & 2 \\ 0 & -\frac{50}{13} & \frac{100}{13} \\ 0 & 0 & 0 \end{bmatrix} \)

Multiply row 2 by a scalar. Multiply row 2 by \( -\frac{13}{50} \):

\( \begin{bmatrix} 13 & 12 & 2 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{bmatrix} \)

Subtract a multiple of one row from another. Subtract \( 12 \times \) (row 2) from row 1:

\( \begin{bmatrix} 13 & 0 & 26 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{bmatrix} \)

Divide row 1 by a scalar. Divide row 1 by 13:

\( \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{bmatrix} \)

Identify free variables. Free variables in the kernel \( (x_1, x_2, x_3) \) correspond to the columns in \( \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{bmatrix} \) which have no pivot. Column 3 is the only column with no pivot, so we may take \( x_3 \) to be the only free variable.

Perform matrix multiplication. Multiply out the reduced matrix \( \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{bmatrix} \) with the proposed solution vector \( (x_1, x_2, x_3) \):

\( \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & -2 \\ 0 & 0 & 0 \end{bmatrix} \cdot (x_1, x_2, x_3) = (x_1 + 2x_3, x_2 - 2x_3, 0) = (0, 0, 0) \)

Convert to a system and solve in terms of the free variables. Solve the equations \( \{ x_1 + 2x_3 = 0, x_2 - 2x_3 = 0, 0 = 0 \} \) for \( x_1 \) and \( x_2 \):

\( \{ x_1 = -2x_3, x_2 = 2x_3 \} \)

Replace the pivot variables with free variable expressions. Rewrite \( v \) in terms of the free variable \( x_3 \), and assign it an arbitrary real value of \( x \):

\( v = (x_1, x_2, x_3) = (-2x_3, 2x_3, x_3) = (-2x, 2x, x) \) for \( x \in \mathbb{R} \)

Convert to set builder notation. Rewrite the solution vector \( v = (-2x, 2x, x) \) in set notation:

Answer: \( \{ (-2x, 2x, x) : x \in \mathbb{R} \} \)

Wolfram, Wolfram, etc ♡

Option 2: For \(v_{3}=\begin{bmatrix}a & b & c\end{bmatrix}\) to be perpendicular to \(v_{1}\), we need \(-a=b\). Then the condition that \(v_{2}^{T} v_{3}=0\) becomes \(\frac{2a}{\sqrt{18}}+\frac{4c}{\sqrt{18}}=0\) or \(-a=2c\).

Thus, \(v_{3}=\begin{bmatrix}a & -a & -\frac{a}{2}\end{bmatrix}\) — which satisfies the option 1 result, \( \ker(A^{T}A)= \{ (-2x, 2x, x) : x \in \mathbb{R} \} \), with \(x = -\frac{a}{2}\) — and for it to be unit-length, we need \(a=\frac{2}{3}\), which gives \(v_{3}=\begin{bmatrix}\frac{2}{3} & -\frac{2}{3} & -\frac{1}{3}\end{bmatrix}\).

At this point, we know that

\[ A=U S V^{T}=U\begin{bmatrix}5 & 0 & 0 \\ 0 & 3 & 0\end{bmatrix}\begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{18}} & -\frac{1}{\sqrt{18}} & \frac{4}{\sqrt{18}} \\ \frac{2}{3} & -\frac{2}{3} & -\frac{1}{3}\end{bmatrix} \]

Finally, we can compute \(U\) from the right singular vectors \(v_i\), the original matrix \(A\), and the reciprocals of the singular values \(\frac{1}{\sigma_i}\), all via the formula \(\sigma_i u_{i}=A v_{i}\), or \(u_{i}=\frac{1}{\sigma_i} A v_{i}\). This gives \(U=\begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}}\end{bmatrix}\).

As Prof. Marshall Hampton puts it: "So, in its full glory, the SVD is:"

\[ A=U S V^{T}=\begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}}\end{bmatrix}\begin{bmatrix}5 & 0 & 0 \\ 0 & 3 & 0\end{bmatrix}\begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{18}} & -\frac{1}{\sqrt{18}} & \frac{4}{\sqrt{18}} \\ \frac{2}{3} & -\frac{2}{3} & -\frac{1}{3}\end{bmatrix} \]
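And a short NumPy sketch (my addition) that rebuilds \(A\) from those hand-computed factors, just to confirm nothing was lost along the way:

```python
import numpy as np

U = np.array([[1/np.sqrt(2),  1/np.sqrt(2)],
              [1/np.sqrt(2), -1/np.sqrt(2)]])
S = np.array([[5, 0, 0],
              [0, 3, 0]], dtype=float)
Vt = np.array([[1/np.sqrt(2),    1/np.sqrt(2),   0],
               [1/np.sqrt(18),  -1/np.sqrt(18),  4/np.sqrt(18)],
               [2/3,            -2/3,           -1/3]])

A = np.array([[3, 2, 2],
              [2, 3, -2]], dtype=float)
print(np.allclose(U @ S @ Vt, A))   # True
```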


Spectral Theorem

If \( A^T = A \), there are orthonormal \( q \)'s so that \( Aq_i = \lambda_iq_i \) and \( A = Q\Lambda Q^T \).

Mathematically: If \( A \) is symmetric, there exist orthonormal eigenvectors (\( q \)'s) and eigenvalues (\( \lambda \)'s) such that \( Aq_i = \lambda_iq_i \).

Now we're really talking.

Here is another walkthrough, this time of a lovely example from Prof. Bruce Ikenaga, transcribed from here.

Also, here is a shorter proof from Prof. Brad Rodgers at an undergraduate level, and here is a longer proof from Prof. Dana Williams at a graduate level applying this theorem to bounded and unbounded operators on an infinite dimensional complex Hilbert space, all culminating in Stone's Theorem.

Finally, Prof. Ikenaga's example requires little explanation or calculation, but there are two points to touch on first.

First, each column has unit length and is perpendicular to every other column, so the inverse is the transpose. "If \(A^{-1}=A^T\), then \(A^TA=I\). This means that each column has unit length and is perpendicular to every other column. That means it is an orthonormal matrix. [...] Think of \(A\) as an arrangement of \(n\) columns (each \(n\) elements tall). Then the \((i, j)\) element of \(A^TA\) is the dot product of the \(i\)th and \(j\)th columns of \(A\) since the \(i\)th row of \(A^T\) is the \(i\)th column of \(A\)." (Math Stack Exchange)

Second, eigenvectors from different eigenvalues are linearly independent.

The best explanation I could find of this is from Math Stack Exchange:

Suppose \(\mathbf{v}_1\) and \(\mathbf{v}_2\) correspond to distinct eigenvalues \(\lambda_1\) and \(\lambda_2\), respectively. Take a linear combination that is equal to \(0\), \(\alpha_1\mathbf{v}_1+\alpha_2\mathbf{v}_2 = \mathbf{0}\). We need to show that \(\alpha_1=\alpha_2=0\). Applying \(T\) to both sides, we get \[\mathbf{0} = T(\mathbf{0}) = T(\alpha_1\mathbf{v}_1+\alpha_2\mathbf{v}_2) = \alpha_1\lambda_1\mathbf{v}_1 + \alpha_2\lambda_2\mathbf{v}_2.\] Now, instead, multiply the original equation by \(\lambda_1\): \[\mathbf{0} = \lambda_1\alpha_1\mathbf{v}_1 + \lambda_1\alpha_2\mathbf{v}_2.\] Now take the two equations, \[\begin{align*} \mathbf{0} &= \alpha_1\lambda_1\mathbf{v}_1 + \alpha_2\lambda_2\mathbf{v}_2\\ \mathbf{0} &= \alpha_1\lambda_1\mathbf{v}_1 + \alpha_2\lambda_1\mathbf{v}_2 \end{align*}\] and taking the difference, we get: \[\mathbf{0} = 0\mathbf{v}_1 + \alpha_2(\lambda_2-\lambda_1)\mathbf{v}_2 = \alpha_2(\lambda_2-\lambda_1)\mathbf{v}_2.\] Since \(\lambda_2-\lambda_1\neq 0\), and since \(\mathbf{v}_2\neq\mathbf{0}\) (because \(\mathbf{v}_2\) is an eigenvector), then \(\alpha_2=0\). Using this on the original linear combination \(\mathbf{0} = \alpha_1\mathbf{v}_1 + \alpha_2\mathbf{v}_2\), we conclude that \(\alpha_1=0\) as well (since \(\mathbf{v}_1\neq\mathbf{0}\)). So \(\mathbf{v}_1\) and \(\mathbf{v}_2\) are linearly independent. Now try using induction on \(n\) for the general case.

Now, take a symmetric matrix \( A=\left[\begin{array}{ccc} -1 & 2 & 0 \\ 2 & 2 & 0 \\ 0 & 0 & 3 \end{array}\right] \)

Find an orthogonal matrix \(\mathrm{O}\) which diagonalizes \(\mathrm{A}\). Find \(\mathrm{O}^{-1}\) and the corresponding diagonal matrix.

The characteristic polynomial is

\[ \det{(A - xI)} \] \[ = (3-x)[(x-2)(x+1)-(2)(2)]\] \[=-(x-3)^2(x+2)\]

The eigenvalues are \(x=3\) and \(x=-2\).

For \(x=3\), the eigenvector matrix is

\[ A-3 I=\left[\begin{array}{ccc} -4 & 2 & 0 \\ 2 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right] \rightarrow\left[\begin{array}{ccc} 2 & -1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{array}\right] \]

This gives the independent eigenvectors \((1,2,0)\) and \((0,0,1)\). Dividing them by their lengths, I get \(\frac{1}{\sqrt{5}}(1,2,0)\) and \((0,0,1)\).

For \(x=-2\), the eigenvector matrix is

\[ A+2 I=\left[\begin{array}{lll} 1 & 2 & 0 \\ 2 & 4 & 0 \\ 0 & 0 & 5 \end{array}\right] \rightarrow\left[\begin{array}{lll} 1 & 2 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{array}\right] \]

This gives the independent eigenvector \((-2,1,0)\). Dividing it by its length, I get \(\frac{1}{\sqrt{5}}(-2,1,0)\).

Thus, the orthogonal diagonalizing matrix is

\[ O=\left[\begin{array}{ccc} \frac{1}{\sqrt{5}} & 0 & -\frac{2}{\sqrt{5}} \\ \frac{2}{\sqrt{5}} & 0 & \frac{1}{\sqrt{5}} \\ 0 & 1 & 0 \end{array}\right] \]

Then (note again: each column has unit length and is perpendicular to every other column, so the inverse is the transpose)

\[ O^{-1}=O^T=\left[\begin{array}{ccc} \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} & 0 \\ 0 & 0 & 1 \\ -\frac{2}{\sqrt{5}} & \frac{1}{\sqrt{5}} & 0 \end{array}\right] \]

The diagonal matrix is

\[ O^T A O=\left[\begin{array}{ccc} \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} & 0 \\ 0 & 0 & 1 \\ -\frac{2}{\sqrt{5}} & \frac{1}{\sqrt{5}} & 0 \end{array}\right] \left[\begin{array}{ccc} -1 & 2 & 0 \\ 2 & 2 & 0 \\ 0 & 0 & 3 \end{array}\right] \left[\begin{array}{ccc} \frac{1}{\sqrt{5}} & 0 & -\frac{2}{\sqrt{5}} \\ \frac{2}{\sqrt{5}} & 0 & \frac{1}{\sqrt{5}} \\ 0 & 1 & 0 \end{array}\right] \] \[ =\left[\begin{array}{ccc} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -2 \end{array}\right] \]

We all ♡ Wolfram
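Here is a short NumPy sketch (mine, not Prof. Ikenaga's) confirming both facts at once: \(O\) is orthogonal, so \(O^T O = I\), and \(O^T A O\) is the diagonal matrix of eigenvalues.

```python
import numpy as np

A = np.array([[-1, 2, 0],
              [ 2, 2, 0],
              [ 0, 0, 3]], dtype=float)
s5 = np.sqrt(5)
O = np.array([[1/s5, 0, -2/s5],
              [2/s5, 0,  1/s5],
              [0,    1,  0   ]])

print(np.allclose(O.T @ O, np.eye(3)))                # True: O is orthogonal
print(np.allclose(O.T @ A @ O, np.diag([3, 3, -2])))  # True: the diagonal matrix
```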


Linear Algebra in a Nutshell

As a bonus, here is the other part of the sections from Gilbert Strang's Introduction to Linear Algebra, 5th Ed. I was tempted not to include this because I cannot think of a sufficient example. But I was struck by the similarity with "The Key Theorem of Linear Algebra" from Prof. Thomas Garrity's All the Math You Missed, 2nd Ed. Prof. Garrity lays out his version after giving three definitions for the determinant (basically from induction, from linear rules, & from signed volume), but Prof. Garrity does so before defining eigenvectors, as his intro to matrices as linear transformations.

Let \(A\) be an \(n\times n\) matrix...

Singular | Nonsingular
\(A\) is not invertible | \(A\) is invertible
The columns are dependent | The columns are independent
The rows are dependent | The rows are independent
The determinant is zero | The determinant is not zero
\(Ax=0\) has infinitely many solutions | \(Ax=0\) has one solution \(x=0\)
\(Ax=b\) has no solution or \(\infty\) many | \(Ax=b\) has one solution \(x=A^{-1}b\)
\(A\) has \(r < n\) pivots | \(A\) has \(n\) (nonzero) pivots
\(A\) has rank \(r < n\) | \(A\) has full rank \(r=n\)
Reduced row echelon form isn't \(R=I\) | Reduced row echelon form is \(R=I\)
The column space has dimension \(r < n\) | The column space is all of \(\mathbb{R}^n\)
The row space has dimension \(r < n\) | The row space is all of \(\mathbb{R}^n\)
Zero is an eigenvalue of \(A\) | All eigenvalues are nonzero
\(A^TA\) is only semidefinite | \(A^TA\) is symmetric positive definite
\(A\) has \(r < n\) singular values | \(A\) has \(n\) (positive) singular values
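To see a few rows of this dictionary in code, here is a small NumPy sketch (my addition) comparing the singular \(3\times3\) matrix used throughout this page with the nonsingular matrix from the Spectral Theorem example: the determinant, the rank, and the presence of a zero eigenvalue all line up with the table.

```python
import numpy as np

singular    = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)
nonsingular = np.array([[-1, 2, 0], [2, 2, 0], [0, 0, 3]], dtype=float)

for name, M in [("singular", singular), ("nonsingular", nonsingular)]:
    det = np.linalg.det(M)
    rank = np.linalg.matrix_rank(M)
    has_zero_eig = np.any(np.isclose(np.linalg.eigvals(M), 0))
    print(name, "| det ~", round(det, 6), "| rank", rank,
          "| zero eigenvalue:", has_zero_eig)
```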

Now ♡ transcribed ♡ from All the Math you Missed by Prof. Garrity ♡ ♡

Theorem 1.6.1 (Key Theorem)

Let \(A\) be an \(n \times n\) matrix. Then the following are equivalent:

  1. \(A\) is invertible.
  2. \(\det(A) \neq 0\).
  3. \(\text{ker}(A) = \{0\}\).
  4. If \(b\) is a column vector in \(\mathbb{R}^n\), there is a unique column vector \(x\) in \(\mathbb{R}^n\) satisfying \(Ax = b\).
  5. The columns of \(A\) are linearly independent \(n \times 1\) column vectors.
  6. The rows of \(A\) are linearly independent \(1 \times n\) row vectors.
  7. The transpose \(A^T\) of \(A\) is invertible. (Here, if \(A = (a_{ij})\), then \(A^T = (a_{ji})\)).
  8. All of the eigenvalues of \(A\) are non-zero.

Theorem 1.6.2 (Key Theorem)

Let \(T : V \rightarrow V\) be a linear transformation. Then the following are equivalent:

  1. \(T\) is invertible.
  2. \(\det(T) \neq 0\), where the determinant is defined by a choice of basis on \(V\).
  3. \(\text{ker}(T) = \{0\}\).
  4. If \(b\) is a vector in \(V\), there is a unique vector \(v\) in \(V\) satisfying \(T(v) = b\).
  5. For any basis \(v_1, \ldots, v_n\) of \(V\), the image vectors \(T(v_1), \ldots, T(v_n)\) are linearly independent.
  6. For any basis \(v_1, \ldots, v_n\) of \(V\), if \(S\) denotes the transpose linear transformation of \(T\), then the image vectors \(S(v_1), \ldots, S(v_n)\) are linearly independent.
  7. The transpose of \(T\) is invertible. (Here the transpose is defined by a choice of basis on \(V\).)
  8. All of the eigenvalues of \(T\) are non-zero.


I hope you enjoyed it, because I sure did.