Diagonalization of Matrices
Definition¹
A square matrix $A$ is said to be diagonalizable if it is similar to some diagonal matrix $D$; the invertible matrix $P$ realizing the similarity is said to diagonalize $A$. In other words, $A$ is called a diagonalizable matrix if there exist an invertible matrix $P$ and a diagonal matrix $D$ satisfying the following.
$$ A = PDP^{-1} $$
Explanation
If the matrix $A$ is diagonalizable, computing its powers becomes very easy. Computing $A^{k}$ directly requires $k-1$ matrix multiplications. But since $A = PDP^{-1}$, we have $A^{k} = (PDP^{-1})^{k} = PD^{k}P^{-1}$, because the inner factors $P^{-1}P$ cancel. Since the power of a diagonal matrix is the matrix of the powers of its diagonal entries (see diagonal matrices), $D^{k}$ is cheap to compute, and only the two multiplications in $PD^{k}P^{-1}$ remain, so the $k-1$ matrix multiplications are reduced to $2$. Thus the number of matrix multiplications needed for $A^{k}$ is constant, regardless of $k$.
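As a concrete illustration, here is a minimal numpy sketch of this idea. The matrix $A$ and the exponent $k$ are arbitrary choices for demonstration, and $A$ is assumed diagonalizable so that the eigenvector matrix returned by `np.linalg.eig` is invertible:

```python
import numpy as np

# Hypothetical example matrix and exponent; A is assumed diagonalizable.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
k = 10

# Eigendecomposition: columns of P are eigenvectors, eigvals fills D.
eigvals, P = np.linalg.eig(A)
D_k = np.diag(eigvals ** k)        # D^k: just power the diagonal entries

A_k = P @ D_k @ np.linalg.inv(P)   # A^k = P D^k P^{-1}: two multiplications

# Compare against repeated multiplication.
print(np.allclose(A_k, np.linalg.matrix_power(A, k)))  # True
```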
Special cases of diagonalization, treated separately, include the following:
- Orthogonal diagonalization of symmetric matrices
- Eigenvalue diagonalization of invertible matrices
- Unitary diagonalization of normal matrices (Spectral Theorem)
By the property below, the diagonal entries of the diagonal matrix $D$ of a diagonalizable matrix are precisely its eigenvalues, so this is also called diagonalization by eigenvalues, or the eigenvalue decomposition.
Properties
(a) If a matrix is orthogonally diagonalizable, then it is diagonalizable. The converse does not hold.
For an $n \times n$ matrix $A$, the following two statements are equivalent.
(a) $A$ is diagonalizable.
(b) $A$ has $n$ linearly independent eigenvectors.
Corollary
The matrix $P$ whose columns are $n$ linearly independent eigenvectors of $A$ diagonalizes $A$.
Proof
(a)
This is immediate from the definition of orthogonal diagonalization. That the converse does not hold can be seen from the counterexample below. The eigenvalues of $A$ are $2$ and $3$, and $A$ is diagonalized by the matrix $P$ whose columns are the corresponding eigenvectors $\left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right\}$. However, $A$ is not a symmetric matrix, so it cannot be orthogonally diagonalized (see orthogonal diagonalization of symmetric matrices).
$$ A = \begin{bmatrix} 2 & 1 \\ 0 & 3 \end{bmatrix}, \quad P = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad P^{-1} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}\ne P^{\mathsf{T}} $$
$$ P^{-1}AP = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} $$
■
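As a quick numerical check of this counterexample (a supplementary numpy sketch, not part of the original argument), one can verify that $P$ diagonalizes $A$ while $A$ is not symmetric and $P$ is not orthogonal:

```python
import numpy as np

# The counterexample matrices from the proof above.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])

D = np.linalg.inv(P) @ A @ P
print(np.allclose(D, np.diag([2.0, 3.0])))  # True: P^{-1} A P is diagonal
print(np.allclose(A, A.T))                  # False: A is not symmetric
print(np.allclose(np.linalg.inv(P), P.T))   # False: P^{-1} != P^T, P not orthogonal
```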
(a) $\implies$ (b)
Assume $A$ is diagonalizable. Then by definition there exist an invertible matrix $P$ and a diagonal matrix $D$ such that $A = PDP^{-1}$; multiplying on the right by $P$ gives
$$ AP = PD \tag{1} $$
Denote the column vectors of $A$ and $P$ by $\mathbf{a}_{i}$ and $\mathbf{p}_{i}$, respectively.
$$ A = \begin{bmatrix} \mathbf{a}_{1} & \mathbf{a}_{2} & \cdots & \mathbf{a}_{n} \end{bmatrix} \qquad P = \begin{bmatrix} \mathbf{p}_{1} & \mathbf{p}_{2} & \cdots & \mathbf{p}_{n} \end{bmatrix} $$
Matrix multiplication:
$$ AP = \begin{bmatrix} A\mathbf{p}_{1} & A\mathbf{p}_{2} & \cdots & A\mathbf{p}_{n} \end{bmatrix} $$
Product of diagonal matrices: writing $D = \operatorname{diag}(d_{1}, \dots, d_{n})$,
$$ PD = \begin{bmatrix} d_{1}\mathbf{p}_{1} & d_{2}\mathbf{p}_{2} & \cdots & d_{n}\mathbf{p}_{n} \end{bmatrix} $$
From $(1)$ and the properties above, we obtain
$$ \begin{bmatrix} A\mathbf{p}_{1} & A\mathbf{p}_{2} & \cdots & A\mathbf{p}_{n} \end{bmatrix} = \begin{bmatrix} d_{1}\mathbf{p}_{1} & d_{2}\mathbf{p}_{2} & \cdots & d_{n}\mathbf{p}_{n} \end{bmatrix} $$
$$ \implies A\mathbf{p}_{i} = d_{i}\mathbf{p}_{i} \quad \forall i = 1, \dots, n $$
Therefore each $\mathbf{p}_{i}$ is an eigenvector of $A$ with eigenvalue $d_{i}$ (note that $\mathbf{p}_{i} \neq \mathbf{0}$, since $P$ is invertible).
[Equivalent condition for invertibility](../3004): For an $n \times n$ matrix $P$,
$$ \text{$P$ is invertible $\iff$ the columns of $P$ are linearly independent} $$
Since $P$ is invertible, by this equivalence the columns $\mathbf{p}_{1}, \dots, \mathbf{p}_{n}$ are linearly independent, so $A$ has $n$ linearly independent eigenvectors.
■
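The direction just proved can also be checked numerically: given any diagonalizing pair $(P, D)$, each column of $P$ should be an eigenvector of $A$ for the matching diagonal entry. A small numpy sketch, reusing the counterexample matrices above:

```python
import numpy as np

# Given a diagonalizing pair (P, D) for A, each column p_i of P
# should satisfy A p_i = d_i p_i, as shown in the proof.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])
D = np.linalg.inv(P) @ A @ P

for i in range(P.shape[1]):
    p_i = P[:, i]
    print(np.allclose(A @ p_i, D[i, i] * p_i))  # True for every column
```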
(b) $\implies$ (a)
Suppose $A$ has $n$ linearly independent eigenvectors $\mathbf{p}_{i}$, and let the corresponding eigenvalues be $\lambda_{i}$. Then
$$ A \mathbf{p}_{i} = \lambda_{i} \mathbf{p}_{i} \quad \forall i = 1, \dots, n $$
It follows that
$$ \begin{bmatrix} A\mathbf{p}_{1} & A\mathbf{p}_{2} & \cdots & A\mathbf{p}_{n} \end{bmatrix} = \begin{bmatrix} \lambda_{1}\mathbf{p}_{1} & \lambda_{2}\mathbf{p}_{2} & \cdots & \lambda_{n}\mathbf{p}_{n} \end{bmatrix} $$
If we set $D = \operatorname{diag} (\lambda_{1}, \dots, \lambda_{n})$, then
$$ A\begin{bmatrix} \mathbf{p}_{1} & \mathbf{p}_{2} & \cdots & \mathbf{p}_{n} \end{bmatrix} = \begin{bmatrix} \mathbf{p}_{1} & \mathbf{p}_{2} & \cdots & \mathbf{p}_{n} \end{bmatrix} D $$
$$ \implies AP = P D $$
Since the $\mathbf{p}_{i}$ are linearly independent, $P$ is invertible by the equivalence used above, so $D = P^{-1}AP$ and $A$ is diagonalizable.
■
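The construction in this direction of the proof can be mirrored in code: assemble $P$ column by column from the eigenvectors and $D$ from the matching eigenvalues, then confirm $AP = PD$. A minimal numpy sketch, using the eigenpairs of the earlier example:

```python
import numpy as np

# Build P from n linearly independent eigenvectors and D from the
# matching eigenvalues, then check AP = PD and that P is invertible.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# Eigenpairs of A found above: (2, [1, 0]) and (3, [1, 1]).
p1, p2 = np.array([1.0, 0.0]), np.array([1.0, 1.0])
P = np.column_stack([p1, p2])
D = np.diag([2.0, 3.0])

print(np.allclose(A @ P, P @ D))      # True: AP = PD
print(np.linalg.matrix_rank(P) == 2)  # True: columns independent, P invertible
```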
Howard Anton. Elementary Linear Algebra: Applications Version (12th Edition, 2019), pp. 301-302 ↩︎