Cayley-Hamilton Theorem

Theorem [1]

Let $T : V \to V$ be a linear transformation on a finite-dimensional vector space $V$, and let $f(t)$ be the characteristic polynomial of $T$. Then the following holds:

$$ f(T) = T_{0} $$

Here, $T_{0}$ is the zero transformation. In other words, a linear transformation satisfies its own characteristic polynomial. Restated in terms of matrices, the theorem reads as follows.

Corollary

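Every square matrix satisfies its own characteristic equation. That is, if $f(t)$ is the characteristic polynomial of a square matrix $A$ and $O$ is the zero matrix, then

$$ f(A) = O $$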
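The corollary can be checked on a concrete matrix. The following is a minimal sketch in plain Python, not part of the original text: it assumes the convention $f(t) = \det(A - tI)$, builds the coefficients with the Leibniz (permutation) formula, and evaluates $f(A)$ by Horner's rule. The helper names `poly_mul`, `char_poly`, `mat_mul`, `poly_at_matrix` and the sample matrix are hypothetical choices for illustration, and the permutation sum is only practical for small matrices.

```python
from itertools import permutations

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def char_poly(A):
    """Coefficients [c0, ..., cn] of f(t) = det(A - tI), via the Leibniz formula."""
    n = len(A)
    total = [0] * (n + 1)
    for perm in permutations(range(n)):
        # The sign of the permutation is (-1)^(number of inversions).
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = [(-1) ** inversions]
        for i, j in enumerate(perm):
            # Diagonal entries of A - tI are a_ii - t; off-diagonal ones are constants.
            term = poly_mul(term, [A[i][j], -1] if i == j else [A[i][j]])
        for d, c in enumerate(term):
            total[d] += c
    return total

def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def poly_at_matrix(coeffs, A):
    """Evaluate c0*I + c1*A + ... + cn*A^n by Horner's rule."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += c
    return result

# Sample matrix (an arbitrary choice for illustration).
A = [[2, 1, 0],
     [0, 3, 4],
     [5, 0, 1]]

f = char_poly(A)          # f(t) = 26 - 11t + 6t^2 - t^3
f_of_A = poly_at_matrix(f, A)
print(f)       # [26, -11, 6, -1]
print(f_of_A)  # the 3x3 zero matrix
```

Integer arithmetic keeps the check exact, so the result is the zero matrix itself rather than a matrix of small rounding errors.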

Explanation

Readers of about the same age as the author probably learned about matrices in high school, and the identity they saw there is precisely the Cayley-Hamilton theorem. (Apparently it was not officially part of the curriculum, much like L'Hôpital's rule [2].)

For a $2 \times 2$ matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, the following holds:

$$ A^{2} - (a + d)A + (ad - bc)I = O $$
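To see where this identity comes from, here is a short worked example. The characteristic polynomial in the $2 \times 2$ case is

$$ f(t) = \det(A - tI) = (a - t)(d - t) - bc = t^{2} - (a + d)t + (ad - bc) $$

so $f(A) = O$ is exactly the identity above. For instance, taking $a = 1, b = 2, c = 3, d = 4$ (values chosen only for illustration) gives $a + d = 5$ and $ad - bc = -2$, and indeed

$$ A^{2} - 5A - 2I = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix} - \begin{bmatrix} 5 & 10 \\ 15 & 20 \end{bmatrix} - \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} $$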

Proof

What we need to show is that $f(T)(\mathbf{v}) = \mathbf{0}$ holds for every $\mathbf{v} \in V$. Since $T$ is a linear transformation, so is $f(T)$, and the case $\mathbf{v} = \mathbf{0}$ is trivial. Assume $\mathbf{v} \ne \mathbf{0}$.

Let $W$ be the $T$-cyclic subspace generated by $\mathbf{v}$, and let $k = \dim(W)$.

Lemma on cyclic subspaces

  1. $\left\{ \mathbf{v}, T\mathbf{v}, \dots, T^{k-1}\mathbf{v} \right\}$ is a basis of $W$.

  2. If $a_{0}\mathbf{v} + a_{1}T\mathbf{v} + \cdots + a_{k-1}T^{k-1}\mathbf{v} + T^{k}\mathbf{v} = \mathbf{0}$, then the characteristic polynomial of the restriction map $T|_{W}$ is $(-1)^{k}\left( a_{0} + a_{1}t + \cdots + a_{k-1}t^{k-1} + t^{k} \right)$.

By part 1 of the lemma, $T^{k}\mathbf{v} \in W$ is a linear combination of the basis vectors $\mathbf{v}, T\mathbf{v}, \dots, T^{k-1}\mathbf{v}$, so there exist constants $a_{0}, a_{1}, \dots, a_{k-1}$ that satisfy the following:

$$ \begin{equation} a_{0}\mathbf{v} + a_{1}T\mathbf{v} + \cdots + a_{k-1}T^{k-1}\mathbf{v} + T^{k}\mathbf{v} = \mathbf{0} \end{equation} $$

Then, by part 2 of the lemma, the characteristic polynomial $g(t)$ of the restriction map $T|_{W}$ is as follows:

$$ \begin{equation} g(t) = (-1)^{k}\left( a_{0} + a_{1}t + \cdots + a_{k-1}t^{k-1} + t^{k} \right) \end{equation} $$

Hence, by $(1)$ and $(2)$, we obtain the following:

$$ g(T)(\mathbf{v}) = (-1)^{k}\left( a_{0}I + a_{1}T + \cdots + a_{k-1}T^{k-1} + T^{k} \right)(\mathbf{v}) = \mathbf{0} $$
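This step of the proof can be traced numerically. The sketch below is an illustration rather than part of the original argument: it generates $\mathbf{v}, T\mathbf{v}, T^{2}\mathbf{v}, \dots$ until the next vector falls into the span of the previous ones, solves for the constants $a_{0}, \dots, a_{k-1}$ with exact `Fraction` arithmetic, and confirms that the combination in the parentheses annihilates $\mathbf{v}$. The function names `mat_vec`, `coefficients_in_span`, `cyclic_relation` and the sample matrix and vector are hypothetical.

```python
from fractions import Fraction

def mat_vec(A, v):
    """Apply the matrix A (list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def coefficients_in_span(basis, target):
    """Return c with sum(c[i] * basis[i]) == target, or None if target is not in the span.

    Assumes the vectors in `basis` are linearly independent.
    """
    n, k = len(target), len(basis)
    # Augmented matrix: columns are the basis vectors, last column is the target.
    M = [[Fraction(basis[j][r]) for j in range(k)] + [Fraction(target[r])] for r in range(n)]
    row = 0
    for col in range(k):
        pivot_row = next((r for r in range(row, n) if M[r][col] != 0), None)
        if pivot_row is None:
            continue
        M[row], M[pivot_row] = M[pivot_row], M[row]
        pivot = M[row][col]
        M[row] = [x / pivot for x in M[row]]
        for r in range(n):
            if r != row and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[row])]
        row += 1
    # A leftover row of the form 0 = nonzero means the target is outside the span.
    if any(M[r][k] != 0 for r in range(row, n)):
        return None
    return [M[i][k] for i in range(k)]

def cyclic_relation(A, v):
    """Return (basis of the cyclic subspace W, [a0, ..., a_{k-1}]) as in equation (1)."""
    basis = [v]
    while True:
        nxt = mat_vec(A, basis[-1])
        c = coefficients_in_span(basis, nxt)
        if c is not None:
            # T^k v = sum c_i T^i v, so a_i = -c_i in the relation a_0 v + ... + T^k v = 0.
            return basis, [-x for x in c]
        basis.append(nxt)

# Sample data (chosen for illustration): here W is a proper subspace, k = 2.
A = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 2]]
v = [0, 1, 0]

basis, a = cyclic_relation(A, v)
Tk_v = mat_vec(A, basis[-1])
# a_0 v + a_1 T v + ... + a_{k-1} T^{k-1} v + T^k v, computed coordinate by coordinate.
residual = [sum(a[i] * basis[i][r] for i in range(len(basis))) + Tk_v[r]
            for r in range(len(v))]
print(len(basis), a, residual)
```

For this sample, $k = 2$, $a_{0} = 1$, $a_{1} = -2$, so $g(t) = 1 - 2t + t^{2}$ and the residual is the zero vector.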

Lemma on invariant subspaces

If $W$ is a $T$-invariant subspace, then the characteristic polynomial of $T|_{W}$ divides the characteristic polynomial of $T$.

A $T$-cyclic subspace is $T$-invariant, so by the above lemma $g(t)$ divides the characteristic polynomial $f(t)$ of $T$. Therefore $f(t) = q(t)g(t)$ holds for some polynomial $q(t)$. Thus,

$$ f(T)(\mathbf{v}) = q(T)g(T)(\mathbf{v}) = q(T)\left( g(T)(\mathbf{v}) \right) = q(T)(\mathbf{0}) = \mathbf{0} $$
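As a concrete illustration of the whole argument (the matrix and vector are chosen only for this example), take

$$ A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}, \qquad \mathbf{v} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \qquad A\mathbf{v} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \qquad A^{2}\mathbf{v} = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} $$

Here $\mathbf{v}$ and $A\mathbf{v}$ are linearly independent while $A^{2}\mathbf{v} = 2A\mathbf{v} - \mathbf{v}$, so $k = 2$, $a_{0} = 1$, $a_{1} = -2$, and

$$ g(t) = (-1)^{2}\left( 1 - 2t + t^{2} \right) = (1 - t)^{2}, \qquad f(t) = \det(A - tI) = (1 - t)^{2}(2 - t) = q(t)g(t), \quad q(t) = 2 - t $$

Consequently $f(A)\mathbf{v} = (2I - A)\left( g(A)\mathbf{v} \right) = (2I - A)\mathbf{0} = \mathbf{0}$, as the proof predicts.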