

Singular Value Decomposition of a Matrix

Overview

While it would be great if every matrix could be decomposed through eigenvalue diagonalization, unfortunately, this method is limited by the requirement that the given matrix must be a square matrix. We aim to extend decomposability to matrices that are not square.

Buildup

Let us consider two natural numbers $m > n$ and a matrix $A \in \mathbb{C}^{m \times n}$ with $\text{rank} A = n$. Since $\dim C(A) = \dim C(A^{\ast}) = n$, we can consider orthonormal vectors $\mathbf{v}_{1} , \cdots , \mathbf{v}_{n} \in C(A^{\ast})$ and $\mathbf{u}_{1} , \cdots , \mathbf{u}_{n} \in C(A)$.

Similar to when contemplating the geometric meaning of eigenvalues, let us assume that there exists a $\sigma_{i} > 0$ satisfying
$$ A \mathbf{v}_{i} = \sigma_{i} \mathbf{u}_{i} $$
Then $A$ is a linear transformation that sends the direction of $\mathbf{v}_{i}$ to that of $\mathbf{u}_{i}$, and $\sigma_{i} > 0$ adjusts its magnitude. This $\sigma_{i}$ is defined not as an eigenvalue but as a singular value of $A$; regrettably, the name has nothing to do with singular matrices. The difference from the discussion on eigenvalues is that $A \in \mathbb{C}^{m \times n}$, that is, $\mathbf{v}_{i} \in \mathbb{C}^{n}$ and $\mathbf{u}_{i} \in \mathbb{C}^{m}$, so $A$ also changes the dimension of vectors. In matrix form, each term $\sigma_{i} \mathbf{u}_{i} \mathbf{v}_{i}^{\ast}$ has the same dimensions as $A$, namely $m \times n = (m \times 1) \times (1 \times n)$. Note that, unlike eigenvectors, there is a distinction between left and right: $\mathbf{u}_{i}$ is called the left singular vector and $\mathbf{v}_{i}$ the right singular vector.

If we now rewrite $A \mathbf{v}_{i} = \sigma_{i} \mathbf{u}_{i}$ for $1 \le i \le n$, it becomes
$$ A \begin{bmatrix} \mathbf{v}_{1} & \mathbf{v}_{2} & \cdots & \mathbf{v}_{n} \end{bmatrix} = \begin{bmatrix} \mathbf{u}_{1} & \mathbf{u}_{2} & \cdots & \mathbf{u}_{n} \end{bmatrix} \begin{bmatrix} \sigma_{1} & 0 & \cdots & 0 \\ 0 & \sigma_{2} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_{n} \end{bmatrix} $$
For simplicity, let us write
$$ V := \begin{bmatrix} \mathbf{v}_{1} & \mathbf{v}_{2} & \cdots & \mathbf{v}_{n} \end{bmatrix} \in \mathbb{C}^{n \times n} $$
$$ \widehat{U} := \begin{bmatrix} \mathbf{u}_{1} & \mathbf{u}_{2} & \cdots & \mathbf{u}_{n} \end{bmatrix} \in \mathbb{C}^{m \times n} $$
$$ \widehat{\Sigma} := \text{diag} ( \sigma_{1} , \sigma_{2} , \cdots , \sigma_{n} ) $$
and, for convenience, assume $\sigma_{1} \ge \sigma_{2} \ge \cdots \ge \sigma_{n} > 0$. Summarizing,
$$ AV = \widehat{U} \widehat{\Sigma} $$
and since $V V^{\ast} = V^{\ast} V = I$, we have
$$ A = \widehat{U} \widehat{\Sigma} V^{\ast} $$
Of course, since $\widehat{U} \in \mathbb{C}^{m \times n}$ is not square, $\widehat{U}$ is not a unitary matrix; nevertheless its columns are orthonormal, so $\widehat{U}^{\ast} \widehat{U} = I_{n}$ and
$$ \begin{align*} A^{\ast}A =& (\widehat{U} \widehat{\Sigma} V^{\ast})^{\ast} (\widehat{U} \widehat{\Sigma} V^{\ast}) \\ =& V \widehat{\Sigma} \widehat{U}^{\ast} \widehat{U} \widehat{\Sigma} V^{\ast} \\ =& V \widehat{\Sigma}^{2} V^{\ast} \\ =& V \widehat{\Sigma}^{2} V^{-1} \end{align*} $$

A=S[λ1000λ2000λm]S1 A = S \begin{bmatrix} \lambda_{1} & 0 & \cdots & 0 \\ 0 & \lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_{m} \end{bmatrix} S^{-1}

This can be seen as the result of carrying out eigenvalue diagonalization for the square matrix $A^{\ast} A \in \mathbb{C}^{n \times n}$: $\sigma^{2}_{i}$ is an eigenvalue of $A^{\ast}A$, and $\mathbf{v}_{i}$ is the eigenvector corresponding to $\sigma^{2}_{i}$. Now, for numerical calculations, let us trace back through the process in reverse.
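As a minimal numerical check of this connection, the sketch below uses NumPy on a randomly generated full-column-rank matrix (the sizes $m = 5$, $n = 3$ and the seed are arbitrary choices for illustration): the eigenvalues of $A^{\ast}A$ coincide with the squared singular values of $A$, and $A^{\ast}A = V \widehat{\Sigma}^{2} V^{-1}$ holds for the eigenvector matrix $V$.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3

# A random tall complex matrix; it has full column rank n almost surely.
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# Eigendecomposition of the Hermitian n x n matrix A*A (ascending order), then reorder descending.
eigvals, V = np.linalg.eigh(A.conj().T @ A)
eigvals, V = eigvals[::-1], V[:, ::-1]

# Singular values of A computed directly (descending order).
sigma = np.linalg.svd(A, compute_uv=False)

# The eigenvalues of A*A are exactly the squared singular values.
print(np.allclose(eigvals, sigma**2))                                         # True

# And A*A = V diag(sigma^2) V^{-1}, as in the diagonalization above.
print(np.allclose(A.conj().T @ A, V @ np.diag(eigvals) @ np.linalg.inv(V)))   # True
```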

Algorithm

For a matrix $A \in \mathbb{C}^{m \times n}$ $(m \ge n)$, it is given that $\text{rank} A = n$.

Step 1.

Find the eigenvalues $\sigma_{1}^{2} \ge \sigma_{2}^{2} \ge \cdots \ge \sigma_{n}^{2} > 0$ of $A^{\ast} A$; the singular values $\sigma_{i}$ are their square roots. For any $\mathbf{x} \ne \mathbf{0}$ we have $A \mathbf{x} \ne \mathbf{0}$ because $\text{rank} A = n$, so
$$ \mathbf{x}^{\ast} A^{\ast} A \mathbf{x} = \| A \mathbf{x} \|^{2} > 0 $$
meaning $A^{\ast} A$ is positive definite, and therefore $\sigma_{i} > 0$ is guaranteed.
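A quick sanity check of this positive-definiteness argument, again with a randomly generated full-column-rank matrix (sizes and seed are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 4

# A random m x n matrix with full column rank (almost surely).
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# x* A* A x = ||Ax||^2 > 0 for any nonzero x, so A*A is positive definite.
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
quad = (x.conj() @ A.conj().T @ A @ x).real
print(np.isclose(quad, np.linalg.norm(A @ x) ** 2), quad > 0)   # True True

# Consequently every eigenvalue of A*A is strictly positive.
print(np.linalg.eigvalsh(A.conj().T @ A).min() > 0)             # True
```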


Step 2.

Find the orthonormal eigenvectors $V = \begin{bmatrix} \mathbf{v}_{1} & \mathbf{v}_{2} & \cdots & \mathbf{v}_{n} \end{bmatrix}$ corresponding to the eigenvalues found.


Step 3.

Through $\displaystyle \mathbf{u}_{i} = {{1} \over {\sigma_{i}}} A \mathbf{v}_{i}$, find $\widehat{U} = \begin{bmatrix} \mathbf{u}_{1} & \mathbf{u}_{2} & \cdots & \mathbf{u}_{n} \end{bmatrix}$.
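Putting the three steps together, here is a minimal NumPy sketch that builds the reduced SVD by hand and verifies it, assuming a randomly generated full-column-rank input (the function name `reduced_svd`, the sizes, and the seed are illustrative choices, not part of the original text):

```python
import numpy as np

def reduced_svd(A):
    """Reduced SVD of a full-column-rank A (m x n, m >= n), computed via A*A."""
    # Step 1: eigenvalues of A*A in descending order; sigma_i is the square root.
    eigvals, V = np.linalg.eigh(A.conj().T @ A)
    order = np.argsort(eigvals)[::-1]
    eigvals, V = eigvals[order], V[:, order]
    sigma = np.sqrt(eigvals)

    # Step 2: the columns of V are the corresponding orthonormal eigenvectors.

    # Step 3: u_i = (1 / sigma_i) A v_i gives the columns of U_hat.
    U_hat = (A @ V) / sigma          # divides each column by its sigma_i

    return U_hat, sigma, V

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))

U_hat, sigma, V = reduced_svd(A)
print(np.allclose(A, U_hat @ np.diag(sigma) @ V.conj().T))   # A = U_hat Sigma_hat V*
print(np.allclose(U_hat.conj().T @ U_hat, np.eye(3)))        # U_hat* U_hat = I_n
```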

Reduced Singular Value Decomposition and Full Singular Value Decomposition

Decomposing $A$ into three matrices satisfying $A = \widehat{U} \widehat{\Sigma} V^{\ast}$ is known as the reduced singular value decomposition (rSVD).

In order to distinguish this from the full singular value decomposition (fSVD), the notations $\widehat{U}$ and $\widehat{\Sigma}$ were used. Full SVD is merely an expanded version of the reduced SVD: find $\mathbf{u}_{n+1} , \cdots , \mathbf{u}_{m}$ that turn
$$ U := \begin{bmatrix} \widehat{U} & \mathbf{u}_{n+1} & \cdots & \mathbf{u}_{m} \end{bmatrix} \in \mathbb{C}^{m \times m} $$
into a unitary matrix, then append zero rows to the bottom of $\widehat{\Sigma}$ to construct
$$ \Sigma := \begin{bmatrix} \widehat{\Sigma} \\ O \end{bmatrix} \in \mathbb{C}^{m \times n} $$
so that $A$ is decomposed in the form $A = U \Sigma V^{\ast}$.

fSVD is theoretically more robust, while rSVD is more advantageous for actual computations. Whether it is fSVD or rSVD, being asked to perform SVD essentially means finding $U$ or $\widehat{U}$, $\Sigma$ or $\widehat{\Sigma}$, and $V$.
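For reference, NumPy's `np.linalg.svd` exposes exactly this distinction through its `full_matrices` flag. The sketch below contrasts the shapes of the two decompositions on an arbitrary $5 \times 3$ example; the specific sizes and seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))

# Full SVD: U is m x m unitary, and Sigma is m x n (zero rows appended below Sigma_hat).
U, s, Vh = np.linalg.svd(A, full_matrices=True)
print(U.shape, s.shape, Vh.shape)        # (5, 5) (3,) (3, 3)

# Reduced SVD: U_hat is m x n with orthonormal columns, Sigma_hat is n x n.
U_hat, s, Vh = np.linalg.svd(A, full_matrices=False)
print(U_hat.shape, s.shape, Vh.shape)    # (5, 3) (3,) (3, 3)

# Both reconstruct A.
Sigma = np.vstack([np.diag(s), np.zeros((5 - 3, 3))])   # m x n, with m - n zero rows appended
print(np.allclose(A, U @ Sigma @ Vh))                    # True
print(np.allclose(A, U_hat @ np.diag(s) @ Vh))           # True
```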

See also