
Cholesky Decomposition for Least Squares Method

Algorithm

Let $A \in \mathbb{C}^{m \times n}$ with $\text{rank} A = n$ and a vector $\mathbf{b} \in \mathbb{C}^{m}$ be given, and denote the least-squares solution of $A \mathbf{x} = \mathbf{b}$ by $\mathbf{x}_{\ast}$.

Step 1.

Multiply both sides of the given equation on the left by $A^{\ast}$ to form the normal equation $A^{\ast} A \mathbf{x} = A^{\ast} \mathbf{b}$.

The solution of the normal equation is the least-squares solution of the original equation, so solving it for $\mathbf{x}$ yields $\mathbf{x}_{\ast}$ (a code sketch covering Steps 1 and 2 follows Step 2).


Step 2.

Calculate $\mathbf{y} := A^{\ast} \mathbf{b}$ to obtain $A^{\ast} A \mathbf{x} = \mathbf{y}$.
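A minimal sketch of Steps 1 and 2, assuming NumPy and hypothetical example data `A` and `b` (any full-column-rank matrix works):

```python
import numpy as np

# Hypothetical data: m = 5 equations, n = 3 unknowns, rank A = n.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
b = rng.standard_normal(5) + 1j * rng.standard_normal(5)

# Step 1: form the normal equation A* A x = A* b.
AhA = A.conj().T @ A   # Gram matrix A* A (Hermitian, positive definite)

# Step 2: compute y = A* b, so the system reads A* A x = y.
y = A.conj().T @ b
```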


Step 3. Cholesky Decomposition

Find a lower triangular matrix $L$ that satisfies $A^{\ast} A = L L^{\ast}$ to derive $L L^{\ast} \mathbf{x} = \mathbf{y}$.
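Continuing the sketch above, the factor $L$ can be computed with NumPy's `numpy.linalg.cholesky`, which returns the lower triangular factor:

```python
# Step 3: Cholesky decomposition A* A = L L*.
L = np.linalg.cholesky(AhA)   # lower triangular; L @ L.conj().T ≈ AhA
```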


Step 4. Forward Substitution

Given $L L^{\ast} \mathbf{x} = \mathbf{y}$, set $\mathbf{w} := L^{\ast} \mathbf{x}$; then $L \mathbf{w} = \mathbf{y}$.

Since $L$ is a lower triangular matrix, this equation can be solved for $\mathbf{w}$ by forward substitution.
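In the sketch, forward substitution solves $L \mathbf{w} = \mathbf{y}$ row by row from the top (in practice one would typically call `scipy.linalg.solve_triangular(L, y, lower=True)` instead of a hand-written loop):

```python
# Step 4: forward substitution, solving L w = y from the first row down.
n = L.shape[0]
w = np.zeros(n, dtype=complex)
for i in range(n):
    w[i] = (y[i] - L[i, :i] @ w[:i]) / L[i, i]
```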


Step 5. Backward Substitution

With $\mathbf{w}$ found and $L^{\ast}$ being an upper triangular matrix, the equation $L^{\ast} \mathbf{x} = \mathbf{w}$ can be solved for $\mathbf{x}$ by backward substitution, which gives the least-squares solution $\mathbf{x}_{\ast}$.
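Backward substitution then runs from the last row up; the result can be checked against `numpy.linalg.lstsq`, still within the same hypothetical example:

```python
# Step 5: backward substitution, solving L* x = w from the last row up.
U = L.conj().T   # upper triangular
x = np.zeros(n, dtype=complex)
for i in reversed(range(n)):
    x[i] = (w[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]

# Sanity check against NumPy's reference least-squares solver.
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
assert np.allclose(x, x_ref)
```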

Explanation

The Cholesky decomposition carries a strong precondition: the matrix to be decomposed must be positive definite. The key to this method is the trick of forming the normal equation, which forces that stringent condition to hold. Of course, the trick requires a guarantee that $A^{\ast} A$ is indeed positive definite, and proving this is not difficult.

Theorem

For the matrix $A \in \mathbb{C}^{m \times n} (m \ge n)$, $A^{\ast} A > 0 \iff \text{rank} A = n$

Proof

For any vector $\mathbf{v} \ne \mathbf{0}$,

$$ \mathbf{v}^{\ast} A^{\ast} A \mathbf{v} = (A \mathbf{v})^{\ast} A \mathbf{v} = \left\| A \mathbf{v} \right\|^{2} \ge 0 $$

thus, $A^{\ast} A \ge 0$. Now assume that the inverse of $A^{\ast} A$ does not exist; then there exists $\mathbf{v} \ne \mathbf{0}$ satisfying $A^{\ast} A \mathbf{v} = \mathbf{0}$. Multiplying both sides of this equation on the left by $\mathbf{v}^{\ast}$ yields $\left\| A \mathbf{v} \right\|^{2} = 0$, and from the definition of the norm, $A \mathbf{v} = \mathbf{0}$. This means the column vectors of $A$ are linearly dependent, so $\text{rank} A < n$.

Conversely, assuming $\text{rank} A < n$ implies the existence of $\mathbf{v} \ne \mathbf{0}$ satisfying $A \mathbf{v} = \mathbf{0}$.

Multiplying both sides of this equation on the left by $A^{\ast}$ yields $A^{\ast} A \mathbf{v} = \mathbf{0}$, so $A^{\ast} A$ does not have an inverse. Combining the two directions,

$$ A^{\ast} A \text{ is invertible} \iff \text{rank} A = n $$

Since $A^{\ast} A \ge 0$, its eigenvalues are nonnegative, so it is positive definite exactly when it is invertible. Therefore,

$$ A^{\ast} A > 0 \iff \text{rank} A = n $$
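As a small numerical illustration of the theorem, assuming NumPy and hypothetical real matrices, the eigenvalues of $A^{\ast} A$ can be compared for a full-rank and a rank-deficient $A$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Full column rank: rank A = n, so A* A should be positive definite.
A_full = rng.standard_normal((5, 3))
print(np.linalg.eigvalsh(A_full.T @ A_full))   # all eigenvalues > 0

# Rank deficient: last column copies the first, so rank A < n.
A_def = A_full.copy()
A_def[:, 2] = A_def[:, 0]
print(np.linalg.eigvalsh(A_def.T @ A_def))     # smallest eigenvalue ≈ 0
```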