

Matrix Exponentiation Method

Definition

Given a square matrix $A$ of size $n \times n$, the method of computing powers of $A$ to obtain its dominant eigenvalue is called the power method (거듭제곱법).

Explanation

To compute the eigenvalues of a given matrix one must solve its characteristic polynomial. However, when $n \ge 5$ there is no general formula for the roots, so computing the roots of the characteristic polynomial algebraically is not straightforward. This is difficult by hand and just as problematic for computers. The power method is a technique that sidesteps this by approximating the dominant eigenvalue via repeated matrix powers.

For the $n \times n$ matrix $A$ and an $n$-dimensional vector $\mathbf{x} \in \mathbb{R}^{n}$, the following sequence is called the power sequence generated by $A$.

$$ \left\{ \mathbf{x},\quad A\mathbf{x},\quad A^{2}\mathbf{x}, \dots \right\} $$

Theorem

Assume the $n \times n$ matrix $A$ has $n$ linearly independent eigenvectors. Let $\lambda_{1}$ be the dominant eigenvalue of $A$ and suppose $\lambda_{1}$ is positive. Then for any unit vector $\mathbf{x}_{0} \in \mathbb{R}^{n}$ that is not orthogonal to the eigenspace corresponding to $\lambda_{1}$, the following normalized power sequence converges to the dominant unit eigenvector.

$$ \mathbf{x}_{0},\quad \mathbf{x}_{1} = \dfrac{ A \mathbf{x}_{0} }{\| A \mathbf{x}_{0} \|},\quad \mathbf{x}_{2} = \dfrac{ A \mathbf{x}_{1} }{\| A \mathbf{x}_{1} \|},\quad \dots,\quad \mathbf{x}_{k} = \dfrac{ A \mathbf{x}_{k-1} }{\| A \mathbf{x}_{k-1} \|},\quad \dots \tag{1} $$

Moreover, the following sequence converges to the dominant eigenvalue.

$$ A \mathbf{x}_{1} \cdot \mathbf{x}_{1}, \quad A \mathbf{x}_{2} \cdot \mathbf{x}_{2}, \quad \dots, \quad A \mathbf{x}_{k} \cdot \mathbf{x}_{k}, \quad \dots \tag{2} $$
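The two sequences in the theorem translate directly into code. Below is a minimal sketch in NumPy; the function name `power_method`, the iteration count, and the test matrix are choices made here for illustration, not part of the theorem.

```python
import numpy as np

def power_method(A, x0, iters=100):
    """Normalized power iteration: returns the Rayleigh-quotient
    estimate of the dominant eigenvalue and the unit eigenvector."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        Ax = A @ x
        x = Ax / np.linalg.norm(Ax)   # sequence (1)
    lam = (A @ x) @ x                 # sequence (2): Rayleigh quotient
    return lam, x

# Symmetric test matrix with eigenvalues 3 and 1.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, v = power_method(A, np.array([1.0, 0.0]))
print(lam)   # ≈ 3.0, the dominant eigenvalue
print(v)     # ≈ [0.7071, 0.7071], the dominant unit eigenvector
```

Since the error shrinks like $(\lambda_{2}/\lambda_{1})^{k} = (1/3)^{k}$ here, far fewer than 100 iterations already give machine precision.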


  • By the theorem, repeatedly multiplying an arbitrary initial vector $\mathbf{x}_{0}$ by the matrix $A$ and normalizing yields the dominant unit eigenvector, and with it the dominant eigenvalue. It is necessary to assume $\mathbf{x}_{0}$ is not orthogonal to the dominant eigenspace, but the probability that a randomly chosen $\mathbf{x}_{0}$ is orthogonal to it is extremely small.

  • The theorem assumes the dominant eigenvalue is positive. If it is negative, the sequence $(1)$ oscillates between unit dominant eigenvectors of opposite direction rather than converging. The sign flips cancel in $A \mathbf{x}_{k} \cdot \mathbf{x}_{k}$, however, so $(2)$ still converges to the (negative) dominant eigenvalue.

  • If $A$ is diagonalizable, then it has $n$ independent eigenvectors, so the hypotheses of the theorem are satisfied.

  • If $A$ is a symmetric matrix, then it has $n$ independent eigenvectors, so the hypotheses of the theorem are satisfied.
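The oscillation described in the second remark can be observed numerically. A small sketch, with a matrix chosen here solely so that its dominant eigenvalue is $-3$ (with eigenvector $[1,1]/\sqrt{2}$, second eigenvalue $1$):

```python
import numpy as np

# Eigenvalues of this matrix are -3 (dominant) and 1.
A = np.array([[-1.0, -2.0],
              [-2.0, -1.0]])

x = np.array([1.0, 0.0])
for k in range(1, 7):
    Ax = A @ x
    x = Ax / np.linalg.norm(Ax)
    print(k, x, (A @ x) @ x)
# x flips between roughly +[0.707, 0.707] and -[0.707, 0.707]
# each step, while the Rayleigh quotient settles near -3.
```

So sequence $(1)$ alternates direction while sequence $(2)$ still approaches the dominant eigenvalue itself.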

Proof

Denote the $n$ linearly independent eigenvectors of $A$ by $\mathbf{v}_{i}$. Since they form a basis of $\mathbb{R}^{n}$, the vector $\mathbf{x}_{0}$ can be written as

$$ \mathbf{x}_{0} = c_{1}\mathbf{v}_{1} + c_{2}\mathbf{v}_{2} + \cdots + c_{n} \mathbf{v}_{n} \quad (c_{1} \ne 0) $$

Because $\mathbf{x}_{0}$ is not orthogonal to the eigenspace of $\lambda_{1}$, $c_{1}$ is not $0$. Multiplying both sides by $A^{k}$ and rearranging yields

$$ \begin{align*} A^{k}\mathbf{x}_{0} &= c_{1}A^{k}\mathbf{v}_{1} + c_{2}A^{k}\mathbf{v}_{2} + \cdots + c_{n}A^{k}\mathbf{v}_{n} \\ &= c_{1}\lambda_{1}^{k}\mathbf{v}_{1} + c_{2}\lambda_{2}^{k}\mathbf{v}_{2} + \cdots + c_{n}\lambda_{n}^{k}\mathbf{v}_{n} \\ &= \lambda_{1}^{k} \left[ c_{1}\mathbf{v}_{1} + c_{2}\left( \dfrac{\lambda_{2}}{\lambda_{1}} \right)^{k}\mathbf{v}_{2} + \cdots + c_{n}\left( \dfrac{\lambda_{n}}{\lambda_{1}} \right)^{k}\mathbf{v}_{n} \right] \end{align*} $$

Since $\lambda_{1}$ is the dominant eigenvalue, $\left| \lambda_{i} / \lambda_{1} \right| < 1$ for $i \ge 2$, so the quantity inside the brackets converges to $c_{1}\mathbf{v}_{1}$ as $k \to \infty$.
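This convergence can be checked numerically by comparing $A^{k}\mathbf{x}_{0} / \lambda_{1}^{k}$ against $c_{1}\mathbf{v}_{1}$. A sketch with a symmetric matrix (so that $c_{1} = \mathbf{x}_{0} \cdot \mathbf{v}_{1}$ can be read off from orthonormal eigenvectors); the matrix and initial vector are chosen here only for illustration:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])               # eigenvalues 3 and 1
lam1 = 3.0
v1 = np.array([1.0, 1.0]) / np.sqrt(2)   # dominant unit eigenvector
x0 = np.array([1.0, 0.0])
c1 = x0 @ v1                             # = 1/sqrt(2), nonzero

for k in [1, 5, 10, 20]:
    scaled = np.linalg.matrix_power(A, k) @ x0 / lam1**k
    print(k, np.linalg.norm(scaled - c1 * v1))  # shrinks like (1/3)^k
```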

Lemma

The vector $\mathbf{x}_{k}$ can be expressed in terms of the initial unit vector $\mathbf{x}_{0}$ as $\mathbf{x}_{k} = A^{k} \mathbf{x}_{0} / \left\| A^{k} \mathbf{x}_{0} \right\|$. In other words, the sequence $(1)$ can also be written as $$ \mathbf{x}_{0},\quad \mathbf{x}_{1} = \dfrac{ A \mathbf{x}_{0} }{\| A \mathbf{x}_{0} \|},\quad \mathbf{x}_{2} = \dfrac{ A^{2} \mathbf{x}_{0} }{\| A^{2} \mathbf{x}_{0} \|},\quad \dots,\quad \mathbf{x}_{k} = \dfrac{ A^{k} \mathbf{x}_{0} }{\| A^{k} \mathbf{x}_{0} \|},\quad \dots $$

By the above lemma and the limit just established, $\mathbf{x}_{k}$ converges to the dominant unit eigenvector $\mathbf{v} = \mathbf{v}_{1} / \| \mathbf{v}_{1} \|$. It follows that $A \mathbf{x}_{k} \cdot \mathbf{x}_{k}$ converges to $A \mathbf{v} \cdot \mathbf{v} = \lambda_{1} \| \mathbf{v} \|^{2} = \lambda_{1}$, the dominant eigenvalue.

Proof of the Lemma

We proceed by mathematical induction. The case $k=1$ holds by the definition of $\mathbf{x}_{1}$. Assume the statement holds for $k=m$; then it also holds for $k=m+1$, as follows.

$$ \mathbf{x}_{m+1} = \dfrac{ A \mathbf{x}_{m} }{ \left\| A \mathbf{x}_{m} \right\| } = \dfrac{1}{\left\| A^{m}\mathbf{x}_{0} \right\|}\dfrac{ A (A^{m}\mathbf{x}_{0}) }{ \left\| \dfrac{A (A^{m}\mathbf{x}_{0})}{ \left\| A^{m}\mathbf{x}_{0} \right\|} \right\| } = \dfrac{A^{m+1} \mathbf{x}_{0}}{\left\| A^{m+1} \mathbf{x}_{0} \right\|} $$
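The lemma's identity is also easy to confirm numerically: iterating the one-step normalization gives the same vector as normalizing $A^{k}\mathbf{x}_{0}$ directly. A sketch with an arbitrarily chosen matrix and $k = 7$:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
x0 = np.array([1.0, 0.0])   # unit initial vector

# Left side: iterate x_k = A x_{k-1} / ||A x_{k-1}|| seven times.
x = x0.copy()
for _ in range(7):
    Ax = A @ x
    x = Ax / np.linalg.norm(Ax)

# Right side: A^7 x_0 / ||A^7 x_0|| computed directly.
y = np.linalg.matrix_power(A, 7) @ x0
y = y / np.linalg.norm(y)

print(np.allclose(x, y))   # True
```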