Power Method
Definition
The method of finding the dominant eigenvalue of an $n \times n$ matrix $A$ by computing powers of $A$ is called the power method.
Description
To compute the eigenvalues of a given matrix one must find the roots of its characteristic polynomial. However, when $n \ge 5$ there is no general formula for the roots, so computing them algebraically is not straightforward; this is difficult by hand and equally problematic for computers. The power method is a technique to address this: by computing powers of the matrix, one obtains an approximation to the dominant eigenvalue.
For an $n \times n$ matrix $A$ and an $n$-dimensional vector $\mathbf{x} \in \mathbb{R}^{n}$, the following sequence is called the power sequence generated by $A$.
$$ \left\{ \mathbf{x},\quad A\mathbf{x},\quad A^{2}\mathbf{x},\quad \dots \right\} $$
If the powers are computed repeatedly, the norm of the iterates can become excessively large or small, making computation numerically difficult. To resolve this, one can either normalize the vector at each step or scale it by its component of largest magnitude, as stated in the theorem below.
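To see the issue concretely, here is a minimal Python sketch of the raw power sequence; the $2 \times 2$ matrix and the starting vector are illustrative assumptions, not part of the text.

```python
import numpy as np

# Raw power sequence {x, Ax, A^2 x, ...} with no rescaling (illustrative matrix).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])    # eigenvalues 3 and 1, dominant eigenvalue 3
x = np.array([1.0, 0.0])

for _ in range(50):
    x = A @ x                 # the norm grows roughly like 3^k

print(np.linalg.norm(x))      # ~5.1e23; larger problems soon overflow
```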
Theorem
The theorem implies that by repeatedly multiplying an arbitrary initial vector $\mathbf{x}_{0}$ by the matrix $A$ one can recover the dominant eigenvalue. It is assumed that $\mathbf{x}_{0}$ is not orthogonal to the dominant eigenspace; however, the probability that a randomly chosen $\mathbf{x}_{0}$ is orthogonal to it is negligible.
The theorem assumes the dominant eigenvalue is positive; if it is negative, the sequence $(1)$ oscillates, alternating in sign as it approaches the unit dominant eigenvectors $\pm \mathbf{v}_{1} / \| \mathbf{v}_{1} \|$. The sequence $(2)$ nevertheless still converges to the dominant eigenvalue itself: for a unit vector $\mathbf{x}_{k} \approx \pm \mathbf{v}_{1} / \| \mathbf{v}_{1} \|$ one has $A \mathbf{x}_{k} \cdot \mathbf{x}_{k} \approx \lambda_{1}$, whatever the sign.
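A quick numerical check of this remark, with an illustrative diagonal matrix whose dominant eigenvalue is $-3$ (a sketch, not part of the theorem):

```python
import numpy as np

# Dominant eigenvalue is -3: the iterates of (1) flip sign each step,
# while the sequence (2) still settles to the (negative) dominant eigenvalue.
A = np.array([[-3.0, 0.0],
              [ 0.0, 1.0]])
x = np.array([1.0, 1.0]) / np.sqrt(2.0)

for k in range(1, 6):
    y = A @ x
    x = y / np.linalg.norm(y)              # sequence (1): direction alternates
    print(k, x, (A @ x) @ x)               # sequence (2): tends to -3
```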
If $A$ is diagonalizable, it has $n$ linearly independent eigenvectors, so the hypotheses of the theorem are satisfied.
In particular, if $A$ is symmetric, it has $n$ linearly independent (indeed orthogonal) eigenvectors, so the hypotheses are likewise satisfied.
Power method using $\ell^{2}$ normalization
Let $A$ be an $n \times n$ matrix with $n$ linearly independent eigenvectors, and suppose the dominant eigenvalue $\lambda_{1}$ of $A$ is positive. Then for any unit vector $\mathbf{x}_{0} \in \mathbb{R}^{n}$ that is not orthogonal to the eigenspace corresponding to $\lambda_{1}$, the following normalized power sequence converges to a dominant unit eigenvector.
$$ \mathbf{x}_{0},\quad \mathbf{x}_{1} = \dfrac{ A \mathbf{x}_{0} }{\| A \mathbf{x}_{0} \|},\quad \mathbf{x}_{2} = \dfrac{ A \mathbf{x}_{1} }{\| A \mathbf{x}_{1} \|},\quad \dots,\quad \mathbf{x}_{k} = \dfrac{ A \mathbf{x}_{k-1} }{\| A \mathbf{x}_{k-1} \|},\quad \dots \tag{1} $$
Moreover, the following sequence converges to the dominant eigenvalue.
$$ A \mathbf{x}_{1} \cdot \mathbf{x}_{1}, \quad A \mathbf{x}_{2} \cdot \mathbf{x}_{2}, \quad \dots, \quad A \mathbf{x}_{k} \cdot \mathbf{x}_{k}, \quad \dots \tag{2} $$
On the other hand, $(1)$ is equal to the following sequence.
$$ \mathbf{x}_{0},\quad \mathbf{x}_{1} = \dfrac{ A \mathbf{x}_{0} }{\| A \mathbf{x}_{0} \|},\quad \mathbf{x}_{2} = \dfrac{ A^{2} \mathbf{x}_{0} }{\| A^{2} \mathbf{x}_{0} \|},\quad \dots,\quad \mathbf{x}_{k} = \dfrac{ A^{k} \mathbf{x}_{0} }{\| A^{k} \mathbf{x}_{0} \|},\quad \dots $$
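The following is a minimal Python sketch of the iteration $(1)$ and the eigenvalue estimate $(2)$; the function name `power_method`, the test matrix, and the iteration count are assumptions made for this example.

```python
import numpy as np

def power_method(A, x0, iters=100):
    """l2-normalized power method: a sketch of sequences (1) and (2)."""
    x = x0 / np.linalg.norm(x0)            # start from a unit vector
    for _ in range(iters):
        y = A @ x
        x = y / np.linalg.norm(y)          # sequence (1)
    return (A @ x) @ x, x                  # sequence (2): eigenvalue estimate

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # dominant eigenvalue 3
lam, v = power_method(A, np.array([1.0, 0.0]))
print(lam)   # ~3.0
print(v)     # ~[0.7071, 0.7071], a dominant unit eigenvector
```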
Power method using maximum-component scaling
Let $A$ be an $n \times n$ matrix with $n$ linearly independent eigenvectors, and suppose the dominant eigenvalue $\lambda_{1}$ of $A$ is positive. Then for any unit vector $\mathbf{x}_{0} \in \mathbb{R}^{n}$ that is not orthogonal to the eigenspace associated with $\lambda_{1}$, the following scaled power sequence converges to a dominant eigenvector whose largest component in absolute value is $1$.
$$ \mathbf{x}_{0},\quad \mathbf{x}_{1} = \dfrac{ A \mathbf{x}_{0} }{\max(A \mathbf{x}_{0})},\quad \mathbf{x}_{2} = \dfrac{ A \mathbf{x}_{1} }{\max(A \mathbf{x}_{1})},\quad \dots,\quad \mathbf{x}_{k} = \dfrac{ A \mathbf{x}_{k-1} }{\max(A \mathbf{x}_{k-1})},\quad \dots \tag{3} $$
Here $\max (\mathbf{x})$ denotes the largest absolute value among the components of the vector $\mathbf{x}$. Also, the following sequence converges to the dominant eigenvalue.
$$ \dfrac{A \mathbf{x}_{1} \cdot \mathbf{x}_{1}}{\mathbf{x}_{1} \cdot \mathbf{x}_{1}}, \quad \dfrac{A \mathbf{x}_{2} \cdot \mathbf{x}_{2}}{\mathbf{x}_{2} \cdot \mathbf{x}_{2}}, \quad \dots, \quad \dfrac{A \mathbf{x}_{k} \cdot \mathbf{x}_{k}}{\mathbf{x}_{k} \cdot \mathbf{x}_{k}}, \quad \dots $$
Meanwhile $(3)$ is equal to the following sequence.
$$ \mathbf{x}_{0},\quad \mathbf{x}_{1} = \dfrac{ A \mathbf{x}_{0} }{\max(A \mathbf{x}_{0})},\quad \mathbf{x}_{2} = \dfrac{ A^{2} \mathbf{x}_{0} }{\max(A^{2} \mathbf{x}_{0})},\quad \dots,\quad \mathbf{x}_{k} = \dfrac{ A^{k} \mathbf{x}_{0} }{\max(A^{k} \mathbf{x}_{0})},\quad \dots $$
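Analogously, here is a minimal Python sketch of the max-scaled iteration $(3)$ together with the Rayleigh-quotient estimate above; the function name and test matrix are again assumptions for illustration.

```python
import numpy as np

def power_method_maxscale(A, x0, iters=100):
    """Power method with maximum-component scaling: a sketch of (3)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        y = A @ x
        x = y / np.abs(y).max()            # sequence (3): max(.) scaling
    return ((A @ x) @ x) / (x @ x), x      # Rayleigh quotient from the text

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, v = power_method_maxscale(A, np.array([1.0, 0.0]))
print(lam)   # ~3.0
print(v)     # ~[1.0, 1.0]: largest component scaled to 1
```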
Proof
We prove the power method using $\ell^{2}$ normalization. Denote the $n$ linearly independent eigenvectors of $A$ by $\mathbf{v}_{i}$, with corresponding eigenvalues $\lambda_{i}$, where $\lambda_{1}$ is dominant. Since these eigenvectors form a basis of $\mathbb{R}^{n}$, we can express $\mathbf{x}_{0}$ as follows.
$$ \mathbf{x}_{0} = c_{1}\mathbf{v}_{1} + c_{2}\mathbf{v}_{2} + \cdots + c_{n} \mathbf{v}_{n} \quad (c_{1} \ne 0) $$
Because $\mathbf{x}_{0}$ is not orthogonal to the eigenspace of $\lambda_{1}$, $c_{1}$ is not $0$. Multiplying both sides by $A^{k}$ and rearranging yields:
$$ \begin{align*} A^{k}\mathbf{x}_{0} &= c_{1}A^{k}\mathbf{v}_{1} + c_{2}A^{k}\mathbf{v}_{2} + \cdots + c_{n}A^{k}\mathbf{v}_{n} \\ &= c_{1}\lambda_{1}^{k}\mathbf{v}_{1} + c_{2}\lambda_{2}^{k}\mathbf{v}_{2} + \cdots + c_{n}\lambda_{n}^{k}\mathbf{v}_{n} \\ &= \lambda_{1}^{k} \left[ c_{1}\mathbf{v}_{1} + c_{2}\left( \dfrac{\lambda_{2}}{\lambda_{1}} \right)^{k}\mathbf{v}_{2} + \cdots + c_{n}\left( \dfrac{\lambda_{n}}{\lambda_{1}} \right)^{k}\mathbf{v}_{n} \right] \end{align*} $$
Since $\lambda_{1}$ is the dominant eigenvalue, $\left| \lambda_{i}/\lambda_{1} \right| < 1$ for $i \ge 2$, so the expression inside the brackets converges to $c_{1}\mathbf{v}_{1}$ as $k \to \infty$.
Lemma
For the initial unit vector $\mathbf{x}_{0}$, the iterate $\mathbf{x}_{k}$ can be written as $\mathbf{x}_{k} = A^{k} \mathbf{x}_{0} / \left\| A^{k} \mathbf{x}_{0} \right\|$. That is, the sequence $(1)$ is also given by: $$ \mathbf{x}_{0},\quad \mathbf{x}_{1} = \dfrac{ A \mathbf{x}_{0} }{\| A \mathbf{x}_{0} \|},\quad \mathbf{x}_{2} = \dfrac{ A^{2} \mathbf{x}_{0} }{\| A^{2} \mathbf{x}_{0} \|},\quad \dots,\quad \mathbf{x}_{k} = \dfrac{ A^{k} \mathbf{x}_{0} }{\| A^{k} \mathbf{x}_{0} \|},\quad \dots $$
By the lemma above, combined with the limit just computed, $\mathbf{x}_{k} = A^{k}\mathbf{x}_{0} / \left\| A^{k}\mathbf{x}_{0} \right\|$ converges to the dominant unit eigenvector $\mathbf{v} = c_{1}\mathbf{v}_{1} / \left\| c_{1}\mathbf{v}_{1} \right\|$ (the factors $\lambda_{1}^{k}$ cancel since $\lambda_{1} > 0$). Consequently $A \mathbf{x}_{k} \cdot \mathbf{x}_{k} \to A\mathbf{v} \cdot \mathbf{v} = \lambda_{1} \mathbf{v} \cdot \mathbf{v} = \lambda_{1}$, so the sequence $(2)$ converges to the dominant eigenvalue.
■
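The bracketed expression in the proof decays at the rate of its slowest term, $\left| \lambda_{2}/\lambda_{1} \right|^{k}$. Here is a small numerical illustration of that rate, with a matrix chosen so the ratio is $1/3$ (a sketch, not from the text).

```python
import numpy as np

# Error of the normalized iterate shrinks roughly like |lambda_2/lambda_1|^k = (1/3)^k.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # eigenvalues 3 and 1
v1 = np.array([1.0, 1.0]) / np.sqrt(2.0)   # dominant unit eigenvector
x = np.array([1.0, 0.0])

for k in range(1, 8):
    y = A @ x
    x = y / np.linalg.norm(y)
    print(k, np.linalg.norm(x - v1))       # error drops by ~1/3 per step
```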
Proof of the lemma
We proceed by mathematical induction. The statement is trivially true for $k=1$. Suppose it holds for $k=m$; then it also holds for $k=m+1$, as the following computation shows.
$$ \mathbf{x}_{m+1} = \dfrac{ A \mathbf{x}_{m} }{ \left\| A \mathbf{x}_{m} \right\| } = \dfrac{1}{\left\| A^{m}\mathbf{x}_{0} \right\|}\dfrac{ A (A^{m}\mathbf{x}_{0}) }{ \left\| \dfrac{A (A^{m}\mathbf{x}_{0})}{ \left\| A^{m}\mathbf{x}_{0} \right\|} \right\| } = \dfrac{A^{m+1} \mathbf{x}_{0}}{\left\| A^{m+1} \mathbf{x}_{0} \right\|} $$
■
