

Singular Value Decomposition of a Matrix

Overview

While it would be great if every matrix could be decomposed through eigenvalue diagonalization, unfortunately, this method is limited by the requirement that the given matrix must be a square matrix. We aim to extend decomposability to matrices that are not square.

Buildup

Let us consider two natural numbers $m > n$ and a matrix $A \in \mathbb{C}^{m \times n}$ with $\text{rank} A = n$. Since $\dim C(A) = \dim C(A^{\ast}) = n$, we can consider orthonormal vectors $\mathbf{v}_{1} , \cdots , \mathbf{v}_{n} \in C(A^{\ast})$ and $\mathbf{u}_{1} , \cdots , \mathbf{u}_{n} \in C(A)$.

Similar to when contemplating the geometric meaning of eigenvalues, let us assume that there exists a $\sigma_{i} > 0$ satisfying
$$ A \mathbf{v}_{i} = \sigma_{i} \mathbf{u}_{i} $$
Then $A$ is a linear transformation that sends the direction of $\mathbf{v}_{i}$ to that of $\mathbf{u}_{i}$, and $\sigma_{i} > 0$ adjusts its magnitude. This $\sigma_{i}$ is defined not as an eigenvalue but as a singular value of $A$; regrettably, the name has nothing to do with singular matrices. The difference from the discussion on eigenvalues is that $A \in \mathbb{C}^{m \times n}$, that is, $\mathbf{v}_{i} \in \mathbb{C}^{n}$ and $\mathbf{u}_{i} \in \mathbb{C}^{m}$, so $A$ also changes the dimension of vectors. In matrix form, each term $\sigma_{i} \mathbf{u}_{i} \mathbf{v}_{i}^{\ast}$ has the same dimensions as $A$, namely $m \times n = (m \times 1) \times (1 \times n)$. Note that, unlike eigenvectors, there is a distinction between left and right: $\mathbf{u}_{i}$ is called the left singular vector and $\mathbf{v}_{i}$ the right singular vector.

If we now rewrite $A \mathbf{v}_{i} = \sigma_{i} \mathbf{u}_{i}$ for $1 \le i \le n$, it becomes
$$ A \begin{bmatrix} \mathbf{v}_{1} & \mathbf{v}_{2} & \cdots & \mathbf{v}_{n} \end{bmatrix} = \begin{bmatrix} \mathbf{u}_{1} & \mathbf{u}_{2} & \cdots & \mathbf{u}_{n} \end{bmatrix} \begin{bmatrix} \sigma_{1} & 0 & \cdots & 0 \\ 0 & \sigma_{2} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_{n} \end{bmatrix} $$
For simplicity, let us write
$$ V := \begin{bmatrix} \mathbf{v}_{1} & \mathbf{v}_{2} & \cdots & \mathbf{v}_{n} \end{bmatrix} \in \mathbb{C}^{n \times n} $$
$$ \widehat{U} := \begin{bmatrix} \mathbf{u}_{1} & \mathbf{u}_{2} & \cdots & \mathbf{u}_{n} \end{bmatrix} \in \mathbb{C}^{m \times n} $$
$$ \widehat{\Sigma} := \text{diag} ( \sigma_{1} , \sigma_{2} , \cdots , \sigma_{n} ) $$
and, for convenience, assume $\sigma_{1} \ge \sigma_{2} \ge \cdots \ge \sigma_{n} > 0$. Summarizing,
$$ AV = \widehat{U} \widehat{\Sigma} $$
and since $V V^{\ast} = V^{\ast} V = I$, we have
$$ A = \widehat{U} \widehat{\Sigma} V^{\ast} $$
Of course, since $\widehat{U} \in \mathbb{C}^{m \times n}$ is not square, $\widehat{U}$ is not a unitary matrix; nevertheless its columns are orthonormal, so $\widehat{U}^{\ast} \widehat{U} = I_{n}$ and
$$ \begin{align*} A^{\ast}A =& (\widehat{U} \widehat{\Sigma} V^{\ast})^{\ast} (\widehat{U} \widehat{\Sigma} V^{\ast}) \\ =& V \widehat{\Sigma} \widehat{U}^{\ast} \widehat{U} \widehat{\Sigma} V^{\ast} \\ =& V \widehat{\Sigma}^{2} V^{\ast} \\ =& V \widehat{\Sigma}^{2} V^{-1} \end{align*} $$

A=S[λ1000λ2000λm]S1 A = S \begin{bmatrix} \lambda_{1} & 0 & \cdots & 0 \\ 0 & \lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_{m} \end{bmatrix} S^{-1}

This can be seen as the result of carrying out eigenvalue diagonalization for the square matrix $A^{\ast} A \in \mathbb{C}^{n \times n}$: $\sigma^{2}_{i}$ is an eigenvalue of $A^{\ast}A$, and $\mathbf{v}_{i}$ is the eigenvector corresponding to $\sigma^{2}_{i}$. Now, for numerical calculations, let us trace back through the process in reverse.
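As a minimal numerical check of this connection, the sketch below uses NumPy on a randomly generated full-column-rank matrix (the sizes $m = 5$, $n = 3$ and the seed are arbitrary choices for illustration): the eigenvalues of $A^{\ast}A$ coincide with the squared singular values of $A$, and $A^{\ast}A = V \widehat{\Sigma}^{2} V^{-1}$ holds for the eigenvector matrix $V$.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3

# A random tall complex matrix; it has full column rank n almost surely.
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# Eigendecomposition of the Hermitian n x n matrix A*A (ascending order), then reorder descending.
eigvals, V = np.linalg.eigh(A.conj().T @ A)
eigvals, V = eigvals[::-1], V[:, ::-1]

# Singular values of A computed directly (descending order).
sigma = np.linalg.svd(A, compute_uv=False)

# The eigenvalues of A*A are exactly the squared singular values.
print(np.allclose(eigvals, sigma**2))                                         # True

# And A*A = V diag(sigma^2) V^{-1}, as in the diagonalization above.
print(np.allclose(A.conj().T @ A, V @ np.diag(eigvals) @ np.linalg.inv(V)))   # True
```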

Algorithm

For a matrix $A \in \mathbb{C}^{m \times n}$ $(m \ge n)$, it is given that $\text{rank} A = n$.

Step 1.

Find the eigenvalues $\sigma_{1}^{2} \ge \sigma_{2}^{2} \ge \cdots \ge \sigma_{n}^{2} > 0$ of $A^{\ast} A$; the singular values $\sigma_{i}$ are their square roots. For any $\mathbf{x} \ne \mathbf{0}$ we have $A \mathbf{x} \ne \mathbf{0}$ because $\text{rank} A = n$, so
$$ \mathbf{x}^{\ast} A^{\ast} A \mathbf{x} = \| A \mathbf{x} \|^{2} > 0 $$
meaning $A^{\ast} A$ is positive definite, and therefore $\sigma_{i} > 0$ is guaranteed.
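A quick sanity check of this positive-definiteness argument, again with a randomly generated full-column-rank matrix (sizes and seed are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 4

# A random m x n matrix with full column rank (almost surely).
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# x* A* A x = ||Ax||^2 > 0 for any nonzero x, so A*A is positive definite.
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
quad = (x.conj() @ A.conj().T @ A @ x).real
print(np.isclose(quad, np.linalg.norm(A @ x) ** 2), quad > 0)   # True True

# Consequently every eigenvalue of A*A is strictly positive.
print(np.linalg.eigvalsh(A.conj().T @ A).min() > 0)             # True
```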


Step 2.

Find the orthonormal eigenvectors $V = \begin{bmatrix} \mathbf{v}_{1} & \mathbf{v}_{2} & \cdots & \mathbf{v}_{n} \end{bmatrix}$ corresponding to the eigenvalues found.


Step 3.

Through $\displaystyle \mathbf{u}_{i} = {{1} \over {\sigma_{i}}} A \mathbf{v}_{i}$, find $\widehat{U} = \begin{bmatrix} \mathbf{u}_{1} & \mathbf{u}_{2} & \cdots & \mathbf{u}_{n} \end{bmatrix}$.
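Putting the three steps together, here is a minimal NumPy sketch that builds the reduced SVD by hand and verifies it, assuming a randomly generated full-column-rank input (the function name `reduced_svd`, the sizes, and the seed are illustrative choices, not part of the original text):

```python
import numpy as np

def reduced_svd(A):
    """Reduced SVD of a full-column-rank A (m x n, m >= n), computed via A*A."""
    # Step 1: eigenvalues of A*A in descending order; sigma_i is the square root.
    eigvals, V = np.linalg.eigh(A.conj().T @ A)
    order = np.argsort(eigvals)[::-1]
    eigvals, V = eigvals[order], V[:, order]
    sigma = np.sqrt(eigvals)

    # Step 2: the columns of V are the corresponding orthonormal eigenvectors.

    # Step 3: u_i = (1 / sigma_i) A v_i gives the columns of U_hat.
    U_hat = (A @ V) / sigma          # divides each column by its sigma_i

    return U_hat, sigma, V

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))

U_hat, sigma, V = reduced_svd(A)
print(np.allclose(A, U_hat @ np.diag(sigma) @ V.conj().T))   # A = U_hat Sigma_hat V*
print(np.allclose(U_hat.conj().T @ U_hat, np.eye(3)))        # U_hat* U_hat = I_n
```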

Reduced Singular Value Decomposition and Full Singular Value Decomposition

Decomposing $A$ into three matrices satisfying $A = \widehat{U} \widehat{\Sigma} V^{\ast}$ is known as the reduced singular value decomposition (rSVD).

In order to distinguish this from the full singular value decomposition (fSVD), the notations $\widehat{U}$ and $\widehat{\Sigma}$ were used. Full SVD is merely an expanded version of the reduced SVD: find $\mathbf{u}_{n+1} , \cdots , \mathbf{u}_{m}$ that turn
$$ U := \begin{bmatrix} \widehat{U} & \mathbf{u}_{n+1} & \cdots & \mathbf{u}_{m} \end{bmatrix} \in \mathbb{C}^{m \times m} $$
into a unitary matrix, then append zero rows to the bottom of $\widehat{\Sigma}$ to construct
$$ \Sigma := \begin{bmatrix} \widehat{\Sigma} \\ O \end{bmatrix} \in \mathbb{C}^{m \times n} $$
so that $A$ is decomposed in the form $A = U \Sigma V^{\ast}$.

fSVD is theoretically more robust, while rSVD is more advantageous for actual computations. Whether it is fSVD or rSVD, being asked to perform SVD essentially means finding $U$ or $\widehat{U}$, $\Sigma$ or $\widehat{\Sigma}$, and $V$.
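For reference, NumPy's `np.linalg.svd` exposes exactly this distinction through its `full_matrices` flag. The sketch below contrasts the shapes of the two decompositions on an arbitrary $5 \times 3$ example; the specific sizes and seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))

# Full SVD: U is m x m unitary, and Sigma is m x n (zero rows appended below Sigma_hat).
U, s, Vh = np.linalg.svd(A, full_matrices=True)
print(U.shape, s.shape, Vh.shape)        # (5, 5) (3,) (3, 3)

# Reduced SVD: U_hat is m x n with orthonormal columns, Sigma_hat is n x n.
U_hat, s, Vh = np.linalg.svd(A, full_matrices=False)
print(U_hat.shape, s.shape, Vh.shape)    # (5, 3) (3,) (3, 3)

# Both reconstruct A.
Sigma = np.vstack([np.diag(s), np.zeros((5 - 3, 3))])   # m x n, with m - n zero rows appended
print(np.allclose(A, U @ Sigma @ Vh))                    # True
print(np.allclose(A, U_hat @ np.diag(s) @ Vh))           # True
```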

See also