Diagonalization of Matrices
Definition¹
A square matrix $A$ is said to be diagonalizable if it is similar to some diagonal matrix $D$; the invertible matrix $P$ realizing the similarity is said to diagonalize $A$. In other words, $A$ is called a diagonalizable matrix if there exist an invertible matrix $P$ and a diagonal matrix $D$ satisfying the following.
$$ A = PDP^{-1} $$
Explanation
If the matrix $A$ is diagonalizable, computing its powers becomes very easy. Computing $A^{k}$ directly requires $k-1$ matrix multiplications. But since $A = PDP^{-1}$, we have $A^{k} = (PDP^{-1})^{k} = PD^{k}P^{-1}$, because the inner factors $P^{-1}P$ cancel. Since the power of a diagonal matrix is the matrix of the powers of its diagonal entries (see diagonal matrices), $D^{k}$ is cheap to compute, and only the two multiplications in $PD^{k}P^{-1}$ remain, so the $k-1$ matrix multiplications are reduced to $2$. Thus the number of matrix multiplications needed for $A^{k}$ is constant, regardless of $k$.
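As a concrete illustration, here is a minimal numpy sketch of this idea. The matrix $A$ and the exponent $k$ are arbitrary choices for demonstration, and $A$ is assumed diagonalizable so that the eigenvector matrix returned by `np.linalg.eig` is invertible:

```python
import numpy as np

# Hypothetical example matrix and exponent; A is assumed diagonalizable.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
k = 10

# Eigendecomposition: columns of P are eigenvectors, eigvals fills D.
eigvals, P = np.linalg.eig(A)
D_k = np.diag(eigvals ** k)        # D^k: just power the diagonal entries

A_k = P @ D_k @ np.linalg.inv(P)   # A^k = P D^k P^{-1}: two multiplications

# Compare against repeated multiplication.
print(np.allclose(A_k, np.linalg.matrix_power(A, k)))  # True
```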
Special cases of diagonalization, treated separately, include the following:
- Orthogonal diagonalization of symmetric matrices
- Eigenvalue diagonalization of invertible matrices
- Unitary diagonalization of normal matrices (Spectral Theorem)
By the property below, the diagonal entries of the diagonal matrix $D$ of a diagonalizable matrix are precisely its eigenvalues, so this is also called diagonalization by eigenvalues, or the eigenvalue decomposition.
Properties
(a) If a matrix is orthogonally diagonalizable, then it is diagonalizable. The converse does not hold.
For an $n \times n$ matrix $A$, the following two statements are equivalent.
(a) $A$ is diagonalizable.
(b) $A$ has $n$ linearly independent eigenvectors.
Corollary
The matrix $P$ whose columns are $n$ linearly independent eigenvectors of $A$ diagonalizes $A$.
Proof
(a)
This is immediate from the definition of orthogonal diagonalization. That the converse does not hold can be seen from the counterexample below. The eigenvalues of $A$ are $2$ and $3$, and $A$ is diagonalized by the matrix $P$ whose columns are the corresponding eigenvectors $\left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right\}$. However, $A$ is not a symmetric matrix, so it cannot be orthogonally diagonalized (see orthogonal diagonalization of symmetric matrices).
$$ A = \begin{bmatrix} 2 & 1 \\ 0 & 3 \end{bmatrix}, \quad P = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad P^{-1} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}\ne P^{\mathsf{T}} $$
$$ P^{-1}AP = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} $$
■
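As a quick numerical check of this counterexample (a supplementary numpy sketch, not part of the original argument), one can verify that $P$ diagonalizes $A$ while $A$ is not symmetric and $P$ is not orthogonal:

```python
import numpy as np

# The counterexample matrices from the proof above.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])

D = np.linalg.inv(P) @ A @ P
print(np.allclose(D, np.diag([2.0, 3.0])))  # True: P^{-1} A P is diagonal
print(np.allclose(A, A.T))                  # False: A is not symmetric
print(np.allclose(np.linalg.inv(P), P.T))   # False: P^{-1} != P^T, P not orthogonal
```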
(a) $\implies$ (b)
Assume $A$ is diagonalizable. Then by definition there exist an invertible matrix $P$ and a diagonal matrix $D$ such that $A = PDP^{-1}$; multiplying on the right by $P$ gives
$$ AP = PD \tag{1} $$
Denote the column vectors of $A$ and $P$ by $\mathbf{a}_{i}$ and $\mathbf{p}_{i}$, respectively.
$$ A = \begin{bmatrix} \mathbf{a}_{1} & \mathbf{a}_{2} & \cdots & \mathbf{a}_{n} \end{bmatrix} \qquad P = \begin{bmatrix} \mathbf{p}_{1} & \mathbf{p}_{2} & \cdots & \mathbf{p}_{n} \end{bmatrix} $$
Matrix multiplication:
$$ AP = \begin{bmatrix} A\mathbf{p}_{1} & A\mathbf{p}_{2} & \cdots & A\mathbf{p}_{n} \end{bmatrix} $$
Product of diagonal matrices: writing $D = \operatorname{diag}(d_{1}, \dots, d_{n})$,
$$ PD = \begin{bmatrix} d_{1}\mathbf{p}_{1} & d_{2}\mathbf{p}_{2} & \cdots & d_{n}\mathbf{p}_{n} \end{bmatrix} $$
From $(1)$ and the properties above, we obtain
$$ \begin{bmatrix} A\mathbf{p}_{1} & A\mathbf{p}_{2} & \cdots & A\mathbf{p}_{n} \end{bmatrix} = \begin{bmatrix} d_{1}\mathbf{p}_{1} & d_{2}\mathbf{p}_{2} & \cdots & d_{n}\mathbf{p}_{n} \end{bmatrix} $$
$$ \implies A\mathbf{p}_{i} = d_{i}\mathbf{p}_{i} \quad \forall i = 1, \dots, n $$
Therefore each $\mathbf{p}_{i}$ is an eigenvector of $A$ with eigenvalue $d_{i}$ (note that $\mathbf{p}_{i} \neq \mathbf{0}$, since $P$ is invertible).
[Equivalent condition for invertibility](../3004): For an $n \times n$ matrix $P$,
$$ \text{$P$ is invertible $\iff$ the columns of $P$ are linearly independent} $$
Since $P$ is invertible, by this equivalence the columns $\mathbf{p}_{1}, \dots, \mathbf{p}_{n}$ are linearly independent, so $A$ has $n$ linearly independent eigenvectors.
■
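The direction just proved can also be checked numerically: given any diagonalizing pair $(P, D)$, each column of $P$ should be an eigenvector of $A$ for the matching diagonal entry. A small numpy sketch, reusing the counterexample matrices above:

```python
import numpy as np

# Given a diagonalizing pair (P, D) for A, each column p_i of P
# should satisfy A p_i = d_i p_i, as shown in the proof.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])
D = np.linalg.inv(P) @ A @ P

for i in range(P.shape[1]):
    p_i = P[:, i]
    print(np.allclose(A @ p_i, D[i, i] * p_i))  # True for every column
```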
(b) $\implies$ (a)
Suppose $A$ has $n$ linearly independent eigenvectors $\mathbf{p}_{i}$, and let the corresponding eigenvalues be $\lambda_{i}$. Then
$$ A \mathbf{p}_{i} = \lambda_{i} \mathbf{p}_{i} \quad \forall i = 1, \dots, n $$
It follows that
$$ \begin{bmatrix} A\mathbf{p}_{1} & A\mathbf{p}_{2} & \cdots & A\mathbf{p}_{n} \end{bmatrix} = \begin{bmatrix} \lambda_{1}\mathbf{p}_{1} & \lambda_{2}\mathbf{p}_{2} & \cdots & \lambda_{n}\mathbf{p}_{n} \end{bmatrix} $$
If we set $D = \operatorname{diag} (\lambda_{1}, \dots, \lambda_{n})$, then
$$ A\begin{bmatrix} \mathbf{p}_{1} & \mathbf{p}_{2} & \cdots & \mathbf{p}_{n} \end{bmatrix} = \begin{bmatrix} \mathbf{p}_{1} & \mathbf{p}_{2} & \cdots & \mathbf{p}_{n} \end{bmatrix} D $$
$$ \implies AP = P D $$
Since the $\mathbf{p}_{i}$ are linearly independent, $P$ is invertible by the equivalence used above, so $D = P^{-1}AP$ and $A$ is diagonalizable.
■
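The construction in this direction of the proof can be mirrored in code: assemble $P$ column by column from the eigenvectors and $D$ from the matching eigenvalues, then confirm $AP = PD$. A minimal numpy sketch, using the eigenpairs of the earlier example:

```python
import numpy as np

# Build P from n linearly independent eigenvectors and D from the
# matching eigenvalues, then check AP = PD and that P is invertible.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# Eigenpairs of A found above: (2, [1, 0]) and (3, [1, 1]).
p1, p2 = np.array([1.0, 0.0]), np.array([1.0, 1.0])
P = np.column_stack([p1, p2])
D = np.diag([2.0, 3.0])

print(np.allclose(A @ P, P @ D))      # True: AP = PD
print(np.linalg.matrix_rank(P) == 2)  # True: columns independent, P invertible
```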
Howard Anton. Elementary Linear Algebra: Applications Version (12th Edition, 2019), pp. 301-302 ↩︎