
Proof of the Existence of a Complete Singular Value Decomposition

Overview

Eigenvalue diagonalization applies only to square matrices, but singular value decomposition (SVD) has no such restriction.

Whether such a useful decomposition applies to every matrix, that is, whether the decomposition always exists, is therefore an important question.

Theorem

For natural numbers $m \ge n \ge r = \text{rank} A$, every matrix $A \in \mathbb{R}^{m \times n}$ has an SVD.

Proof

For any vector $\mathbb{x}$, $\mathbb{x}^{T} A^{T} A \mathbb{x} = \| A \mathbb{x} \| ^2 \ge 0$, so $A^{T} A$ is positive semidefinite and its eigenvalues $\sigma_{i}^{2}$ are nonnegative. Since $\text{rank} A^{T} A = \text{rank} A = r$, they are positive for $i \le r$ and $0$ for $i > r$. For convenience, order them as $$ \sigma_{1} \ge \sigma_{2} \ge \cdots \ge \sigma_{r}>0 = \sigma_{r+1} = \cdots = \sigma_{n} $$ and define the diagonal matrix $S := \text{diag} ( \sigma_{1} , \sigma_{2} , \cdots , \sigma_{r})$. If $\mathbb{v}_{1} , \mathbb{v}_{2} , \cdots , \mathbb{v}_{n}$ are mutually orthonormal eigenvectors corresponding to $\sigma_{1}^{2} , \sigma_{2}^{2} , \cdots , \sigma_{n}^{2}$, then $$ A^{T} A \mathbb{v}_{i} = \sigma_{i}^2 \mathbb{v}_{i} $$ Now, setting $$ V_1 := \begin{bmatrix} \mathbb{v}_{1} & \mathbb{v}_{2} & \cdots & \mathbb{v}_{r} \end{bmatrix} \\ V_2 := \begin{bmatrix} \mathbb{v}_{r+1} & \mathbb{v}_{r+2} & \cdots & \mathbb{v}_{n} \end{bmatrix} $$ and constructing $$ V := \begin{bmatrix} V_{1} & V_{2} \end{bmatrix} $$ gives an orthogonal matrix. For $V_{1}$, since $A^{T} A V_{1} = V_{1} S^2$, multiplying both sides by $V_{1}^{T}$ yields $$ V_{1}^{T} A^{T} A V_{1} = V_{1}^{T} V_{1} S^2 = S^2 $$ Here, defining $U_{1} : = A V_{1} S^{-1}$ gives $$ U_{1}^{T} A V_{1} = ( A V_{1} S^{-1} )^{T} A V_{1} = S^{-1} (A V_{1})^{T} A V_{1} = S^{-1} \cdot S^2 = S $$ For $V_{2}$, since $\sigma_{r+1} = \cdots = \sigma_{n} = 0$, $$ A^{T} A V_{2} = V_{2} O = O $$ and multiplying both sides by $V_{2}^{T}$ yields $$ V_{2}^{T} A^{T} A V_{2} = ( A V_{2} )^{T} ( A V_{2} ) = O $$ so every column of $A V_{2}$ has zero norm and therefore $A V_{2} = O$.
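
The construction up to this point can be checked numerically. Below is a minimal NumPy sketch (not part of the original proof) using an assumed rank-deficient example matrix: it computes the eigendecomposition of $A^{T} A$, splits the eigenvectors into $V_{1}$ and $V_{2}$, forms $U_{1} = A V_{1} S^{-1}$, and verifies $U_{1}^{T} A V_{1} = S$ and $A V_{2} = O$.

```python
import numpy as np

# Assumed example matrix: m = 3, n = 2, rank r = 1.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
m, n = A.shape
r = np.linalg.matrix_rank(A)

# Eigendecomposition of A^T A: eigenvalues sigma_i^2 >= 0 with
# orthonormal eigenvectors v_1, ..., v_n.
eigvals, V = np.linalg.eigh(A.T @ A)
order = np.argsort(eigvals)[::-1]            # sort sigma_1^2 >= ... >= sigma_n^2
eigvals, V = eigvals[order], V[:, order]

sigma = np.sqrt(np.clip(eigvals[:r], 0, None))
S = np.diag(sigma)                           # S = diag(sigma_1, ..., sigma_r)
V1, V2 = V[:, :r], V[:, r:]                  # split the eigenvectors at rank r

U1 = A @ V1 @ np.linalg.inv(S)               # U_1 := A V_1 S^{-1}

print(np.allclose(U1.T @ A @ V1, S))         # U_1^T A V_1 = S
print(np.allclose(A @ V2, 0))                # A V_2 = O
```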

Now, defining $U_{2}$ so that $$ U : = \begin{bmatrix} U_{1} & U_{2} \end{bmatrix} $$ becomes an orthogonal matrix gives $U_{2}^{T} U_{1} = O$, and multiplying both sides of $AV_{1} = U_{1} S$ (which holds by the definition $U_{1} = A V_{1} S^{-1}$) by $U_{2}^{T}$ yields $$ U_{2}^{T} AV_{1} = U_{2}^{T} U_{1} S = O $$ Meanwhile, $$ U_{1}^{T} U_{1} = ( A V_{1} S^{-1} ) ^{T} ( A V_{1} S^{-1} ) = S^{-1} V_{1}^{T} A^{T} A V_{1} S^{-1} = S^{-1} S^{2} S^{-1} = I $$ shows that $U_{1} \in \mathbb{R}^{m \times r}$ has orthonormal columns. However, $U_{1}$ is not square, so it is not an orthogonal matrix and $U_{1} U_{1}^{T} = I$ is not available; hence $A V_{1} = U_{1} S$ cannot be recovered from $U_{1}^{T} A V_{1} = S$ alone, and it is confirmed through the orthogonal matrix $U$ as follows. $$ \begin{align*} & U^{T} A V_{1} = \begin{bmatrix} U_{1}^{T} \\ U_{2}^{T} \end{bmatrix} A V_{1} = \begin{bmatrix} U_{1}^{T} A V_{1} \\ U_{2}^{T} A V_{1} \end{bmatrix} = \begin{bmatrix} S \\ O \end{bmatrix} \\ \iff & AV_{1} = U \begin{bmatrix} S \\ O \end{bmatrix} = \begin{bmatrix} U_{1} & U_{2} \end{bmatrix} \begin{bmatrix} S \\ O \end{bmatrix} = U_{1} S \end{align*} $$
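
The extension of $U_{1}$ to a full orthogonal $U$ can also be sketched numerically. One convenient (assumed) choice of $U_{2}$ is the trailing columns of a full QR factorization of $U_{1}$, which form an orthonormal basis of the orthogonal complement of its column space; the example matrix is the same assumed one as in the previous sketch.

```python
import numpy as np

# Same assumed example matrix as above.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
m, n = A.shape
r = np.linalg.matrix_rank(A)

eigvals, V = np.linalg.eigh(A.T @ A)
order = np.argsort(eigvals)[::-1]
sigma = np.sqrt(np.clip(eigvals[order][:r], 0, None))
S, V1 = np.diag(sigma), V[:, order][:, :r]
U1 = A @ V1 @ np.linalg.inv(S)

# Trailing columns of a complete QR of U_1 span the orthogonal
# complement of col(U_1), so U = [U_1  U_2] is orthogonal.
Q, _ = np.linalg.qr(U1, mode="complete")
U2 = Q[:, r:]
U = np.hstack([U1, U2])

print(np.allclose(U.T @ U, np.eye(m)))       # U is orthogonal
print(np.allclose(U2.T @ A @ V1, 0))         # U_2^T A V_1 = O
print(np.allclose(A @ V1, U1 @ S))           # A V_1 = U_1 S
```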

Finally, defining $\Sigma : = \begin{bmatrix} S & O \\ O & O \end{bmatrix} \in \mathbb{R}^{m \times n}$, $$ \begin{align*} AV =& A \begin{bmatrix} V_{1} & V_{2} \end{bmatrix} \\ =& \begin{bmatrix} A V_{1} & A V_{2} \end{bmatrix} \\ =& \begin{bmatrix} U_{1} S & O \end{bmatrix} \\ =& \begin{bmatrix} U_{1} & U_{2} \end{bmatrix} \begin{bmatrix} S & O \\ O & O \end{bmatrix} \\ =& U \Sigma \end{align*} $$ and since $V$ is orthogonal, this yields $A = U \Sigma V^{T} = U_{1} S V_{1}^{T}$, i.e., an SVD.
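
As a minimal sketch of the final assembly, again with the assumed example matrix from the sketches above, build $\Sigma \in \mathbb{R}^{m \times n}$ by placing $S$ in its upper-left block and confirm both the full decomposition $A = U \Sigma V^{T}$ and the reduced decomposition $A = U_{1} S V_{1}^{T}$.

```python
import numpy as np

# Same assumed example matrix as above.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
m, n = A.shape
r = np.linalg.matrix_rank(A)

eigvals, V = np.linalg.eigh(A.T @ A)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]
sigma = np.sqrt(np.clip(eigvals[:r], 0, None))
S, V1 = np.diag(sigma), V[:, :r]
U1 = A @ V1 @ np.linalg.inv(S)
Q, _ = np.linalg.qr(U1, mode="complete")
U = np.hstack([U1, Q[:, r:]])

Sigma = np.zeros((m, n))                     # Sigma = [[S, O], [O, O]] in R^{m x n}
Sigma[:r, :r] = S

print(np.allclose(A, U @ Sigma @ V.T))       # full SVD:    A = U Sigma V^T
print(np.allclose(A, U1 @ S @ V1.T))         # reduced SVD: A = U_1 S V_1^T
```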

Generalization

Although the theorem is stated for $A \in \mathbb{R}^{m \times n}$, it generalizes to $A \in \mathbb{C}^{m \times n}$: the same argument goes through with transposes replaced by conjugate transposes and orthogonal matrices by unitary matrices.
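
As a sketch of the complex case, the construction is repeated below with every transpose replaced by a conjugate transpose, using an assumed random full-rank complex matrix (so $r = n$ and the $V_{2}$ block is empty).

```python
import numpy as np

# Assumed random complex example, almost surely of full rank r = n = 3.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3)) + 1j * rng.normal(size=(4, 3))

# A^* A is Hermitian with nonnegative eigenvalues sigma_i^2.
eigvals, V = np.linalg.eigh(A.conj().T @ A)
order = np.argsort(eigvals)[::-1]
sigma = np.sqrt(np.clip(eigvals[order], 0, None))
V = V[:, order]
U1 = A @ V @ np.linalg.inv(np.diag(sigma))   # U_1 := A V S^{-1}

print(np.allclose(U1.conj().T @ U1, np.eye(3)))         # orthonormal columns in C^4
print(np.allclose(A, U1 @ np.diag(sigma) @ V.conj().T)) # A = U_1 S V^*
```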