
Singular Value Decomposition of a Matrix

Overview

While it would be great if every matrix could be decomposed through eigenvalue diagonalization, unfortunately, this method is limited by the requirement that the given matrix must be a square matrix. We aim to extend decomposability to matrices that are not square.

Buildup

For natural numbers $m > n$, consider a matrix $A \in \mathbb{C}^{ m \times n}$ with $\text{rank} A = n$. Then, since $\dim C(A) = \dim C(A^{\ast}) = n$, we can consider orthonormal vectors $\mathbf{u}_{1} , \cdots , \mathbf{u}_{n} \in C(A)$ and $\mathbf{v}_{1} , \cdots , \mathbf{v}_{n} \in C(A^{\ast})$.

Similar to when contemplating the geometric meaning of eigenvalues, let us assume that for each $i$ there exists a $\sigma_{i}>0$ satisfying
$$ A \mathbf{v}_{i} = \sigma_{i} \mathbf{u}_{i} $$
Then $A$ is a linear transformation that sends the direction of $\mathbf{v}_{i}$ to that of $\mathbf{u}_{i}$, and $\sigma_{i}>0$ adjusts the magnitude. This $\sigma_{i}$ is defined not as an eigenvalue but as a singular value of $A$; regrettably, despite the name, it has nothing to do with singular matrices. The difference from the discussion on eigenvalues is that $A \in \mathbb{C}^{ m \times n}$, that is, $\mathbf{v}_{i} \in \mathbb{C}^{n}$ and $\mathbf{u}_{i} \in \mathbb{C}^{m}$, so $A$ also changes the dimension of vectors. In matrix terms, $\sigma_{i} \mathbf{u}_{i} \mathbf{v}_{i}^{\ast}$ has the same dimensions as $A$, since $( m \times 1 ) (1 \times n ) = m \times n$. Note that, unlike eigenvectors, there is a distinction between left and right: $\mathbf{u}_{i}$ is called a left singular vector and $\mathbf{v}_{i}$ a right singular vector.

If we now collect $A \mathbf{v}_{i} = \sigma_{i} \mathbf{u}_{i}$ over $1 \le i \le n$, we obtain
$$ A \begin{bmatrix} \mathbf{v}_{1} & \mathbf{v}_{2} & \cdots & \mathbf{v}_{n} \end{bmatrix} = \begin{bmatrix} \mathbf{u}_{1} & \mathbf{u}_{2} & \cdots & \mathbf{u}_{n} \end{bmatrix} \begin{bmatrix} \sigma_{1} & 0 & \cdots & 0 \\ 0 & \sigma_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_{n} \end{bmatrix} $$
For simplicity, let’s write
$$ V := \begin{bmatrix} \mathbf{v}_{1} & \mathbf{v}_{2} & \cdots & \mathbf{v}_{n} \end{bmatrix} \in \mathbb{C}^{n \times n} \\ \widehat{U} := \begin{bmatrix} \mathbf{u}_{1} & \mathbf{u}_{2} & \cdots & \mathbf{u}_{n} \end{bmatrix} \in \mathbb{C}^{m \times n} \\ \widehat{\Sigma} := \text{diag} ( \sigma_1 , \sigma_2 , \cdots , \sigma_n) $$
and, for convenience, order the singular values so that $\sigma_{1} \ge \sigma_{2} \ge \cdots \ge \sigma_{n} > 0$. Summarizing,
$$ AV = \widehat{U} \widehat{\Sigma} $$
and since $V V^{\ast} = V^{\ast} V = I$, we have
$$ A = \widehat{U} \widehat{\Sigma} V^{\ast} $$
Of course, since $\widehat{U} \in \mathbb{C}^{m \times n}$ is not square, it is not a unitary matrix, but its columns are orthonormal, so $\widehat{U}^{\ast} \widehat{U} = I_{n}$. Therefore, noting that $\widehat{\Sigma}^{\ast} = \widehat{\Sigma}$ since $\widehat{\Sigma}$ is a real diagonal matrix,
$$ \begin{align*} A^{\ast}A =& (\widehat{U} \widehat{\Sigma} V^{\ast})^{\ast} (\widehat{U} \widehat{\Sigma} V^{\ast}) \\ =& V \widehat{\Sigma} \widehat{U}^{\ast} \widehat{U} \widehat{\Sigma} V^{\ast} \\ =& V \widehat{\Sigma}^{2} V^{\ast} \\ =& V \widehat{\Sigma}^{2} V^{-1} \end{align*} $$

Compare this with the general form of eigenvalue diagonalization for a diagonalizable square matrix:
$$ A = S \begin{bmatrix} \lambda_{1} & 0 & \cdots & 0 \\ 0 & \lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_{m} \end{bmatrix} S^{-1} $$

The last line can be seen as the result of completing eigenvalue diagonalization for the square matrix $A^{\ast} A \in \mathbb{C}^{n \times n}$: $\sigma^{2}_{i}$ is an eigenvalue of $A^{\ast}A$, and $\mathbf{v}_{i}$ is the eigenvector corresponding to $\sigma^{2}_{i}$. Now, for numerical calculations, let’s trace back through this process in reverse.
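Before doing so, the relations derived above can be checked numerically. The following is a minimal sketch assuming NumPy; the variable names `U_hat`, `Sigma_hat`, and `Vh` are just stand-ins for $\widehat{U}$, $\widehat{\Sigma}$, and $V^{\ast}$.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))  # rank n almost surely

# Reduced SVD: U_hat is m x n, s holds the diagonal of Sigma-hat, Vh is V*.
U_hat, s, Vh = np.linalg.svd(A, full_matrices=False)
Sigma_hat = np.diag(s)

print(np.allclose(A, U_hat @ Sigma_hat @ Vh))                        # A = U-hat Sigma-hat V*
print(np.allclose(U_hat.conj().T @ U_hat, np.eye(n)))                # U-hat* U-hat = I_n (orthonormal columns)
print(np.allclose(A.conj().T @ A, Vh.conj().T @ Sigma_hat**2 @ Vh))  # A*A = V Sigma-hat^2 V*
```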

Algorithm

For the matrix $A \in \mathbb{C}^{m \times n} (m \ge n)$, it is given that $\text{rank} A = n$.

Step 1.

Find the eigenvalues $\sigma_{1}^{2} \ge \sigma_{2}^{2} \ge \cdots \ge \sigma_{n}^{2}>0$ of $A^{\ast} A$; the singular values $\sigma_{i}$ are their positive square roots. Since $\text{rank} A = n$, we have $A \mathbf{x} \ne \mathbf{0}$ for any $\mathbf{x} \ne \mathbf{0}$, so $$ \mathbf{x}^{\ast} A^{\ast} A \mathbf{x} = || A \mathbf{x} ||^2 > 0 $$ that is, $A^{\ast} A$ is positive definite and $\sigma_{i} >0$ is guaranteed.


Step 2.

Find orthonormal eigenvectors $\mathbf{v}_{1} , \mathbf{v}_{2} , \cdots , \mathbf{v}_{n}$ of $A^{\ast} A$ corresponding to the eigenvalues found above, and set $V = \begin{bmatrix} \mathbf{v}_{1} & \mathbf{v}_{2} & \cdots & \mathbf{v}_{n} \end{bmatrix}$.


Step 3.

Through $\displaystyle \mathbf{u}_{i} = {{1} \over { \sigma_{i} }} A \mathbf{v}_{i}$, find $\widehat{U} = \begin{bmatrix} \mathbf{u}_{1} & \mathbf{u}_{2} & \cdots & \mathbf{u}_{n} \end{bmatrix}$.
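Putting Steps 1–3 together, a minimal sketch in NumPy might look as follows; `reduced_svd` is only an illustrative name, and the eigendecomposition uses `numpy.linalg.eigh` since $A^{\ast}A$ is Hermitian.

```python
import numpy as np

def reduced_svd(A):
    """Reduced SVD of A (m x n, m >= n, rank n) via the eigendecomposition of A*A."""
    # Step 1: the eigenvalues of A*A are sigma_i^2 > 0; eigh returns them in ascending order.
    evals, evecs = np.linalg.eigh(A.conj().T @ A)
    order = np.argsort(evals)[::-1]          # reorder so that sigma_1 >= ... >= sigma_n
    sigma = np.sqrt(evals[order])
    # Step 2: the corresponding orthonormal eigenvectors form V.
    V = evecs[:, order]
    # Step 3: u_i = (1 / sigma_i) A v_i gives U-hat.
    U_hat = (A @ V) / sigma                  # broadcasting divides column i by sigma_i
    return U_hat, sigma, V

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))
U_hat, sigma, V = reduced_svd(A)
print(np.allclose(A, U_hat @ np.diag(sigma) @ V.conj().T))   # A = U-hat Sigma-hat V*
```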

Reduced Singular Value Decomposition and Full Singular Value Decomposition

Decomposing $A$ into three matrices satisfying $A = \widehat{U} \widehat{\Sigma} V^{\ast}$ is known as reduced singular value decomposition (rSVD).

In order to distinguish it from full singular value decomposition (fSVD), the notations $\widehat{U}$ and $\widehat{\Sigma}$ were used. Full SVD is merely an expanded version of reduced SVD: find vectors $\mathbf{u}_{n+1} , \cdots , \mathbf{u}_{m}$ that turn the matrix $$ U := \begin{bmatrix} \widehat{U} & \mathbf{u}_{n+1} & \cdots & \mathbf{u}_{m} \end{bmatrix} \in \mathbb{C}^{m \times m} $$ into a unitary matrix, and append a zero block below $\widehat{\Sigma}$ to construct the matrix $\Sigma := \begin{bmatrix} \widehat{\Sigma} \\ O \end{bmatrix}$, so that $A$ is decomposed in the form $A = U \Sigma V^{\ast}$.
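As a rough sketch of this construction (assuming a real matrix and NumPy), one way to obtain the extra columns $\mathbf{u}_{n+1} , \cdots , \mathbf{u}_{m}$ is to complete $\widehat{U}$ to an orthonormal basis of $\mathbb{R}^{m}$ via a full QR factorization and then pad $\widehat{\Sigma}$ with a zero block:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 6, 4
A = rng.standard_normal((m, n))
U_hat, s, Vh = np.linalg.svd(A, full_matrices=False)

# Complete U-hat to an m x m orthogonal matrix: a full QR keeps span(U_hat) in the
# first n columns (up to sign); the remaining m - n columns are u_{n+1}, ..., u_m.
Q, R = np.linalg.qr(U_hat, mode="complete")
Q[:, :n] *= np.sign(np.diag(R))           # fix signs so the first n columns are exactly U-hat
U = Q

Sigma = np.vstack([np.diag(s), np.zeros((m - n, n))])   # append the zero block below Sigma-hat
print(np.allclose(U[:, :n], U_hat))       # True: U extends U-hat
print(np.allclose(A, U @ Sigma @ Vh))     # True: A = U Sigma V*
```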

fSVD is theoretically more robust, and rSVD is more advantageous for actual computations. Whether it is fSVD or rSVD, being told to perform SVD essentially means to find $U$ or $\widehat{U}$, $\Sigma$ or $\widehat{\Sigma}$, $V$.
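For reference, NumPy's `numpy.linalg.svd` exposes this same distinction through its `full_matrices` flag; the sketch below only compares the shapes of the two variants.

```python
import numpy as np

A = np.random.default_rng(4).standard_normal((6, 4))            # m = 6, n = 4

U, s, Vh = np.linalg.svd(A, full_matrices=True)                  # fSVD: U is 6 x 6
U_hat, s_hat, Vh_hat = np.linalg.svd(A, full_matrices=False)     # rSVD: U-hat is 6 x 4, cheaper to store

print(U.shape, U_hat.shape)    # (6, 6) (6, 4)
print(np.allclose(s, s_hat))   # True: the singular values themselves are identical
```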

See also