Cayley-Hamilton Theorem
Theorem [1]
Let $T : V \to V$ be a linear transformation on a finite-dimensional vector space $V$. Let $f(t)$ be the characteristic polynomial of $T$. Then, the following holds:
$$ f(T) = T_{0} $$
Here, $T_{0}$ is the zero transformation. In other words, a linear transformation satisfies its own characteristic polynomial. Stated in terms of matrices, the theorem reads as follows.
Corollary
Let $A$ be a square matrix and let $f(t)$ be its characteristic polynomial. Then $A$ satisfies its own characteristic equation:
$$ f(A) = O $$
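The corollary can be checked numerically on a small example. The following is a minimal sketch, assuming NumPy is available; the helper name `char_poly_at_matrix` and the sample matrix are illustrative choices, not part of the original text. Note that `np.poly` returns the coefficients of $\det(tI - A)$, which differs from $\det(A - tI)$ only by the factor $(-1)^{n}$, so both vanish at $A$ together.

```python
import numpy as np

def char_poly_at_matrix(A):
    """Evaluate p(A), where p(t) = det(tI - A), using Horner's method."""
    coeffs = np.poly(A)              # monic coefficients, highest degree first
    n = A.shape[0]
    result = np.zeros((n, n))
    for c in coeffs:
        # Horner step: result <- result * A + c * I
        result = result @ A + c * np.eye(n)
    return result

# Hypothetical sample matrix (symmetric, so its eigenvalues are real).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

residual = char_poly_at_matrix(A)
print(np.allclose(residual, 0))      # zero matrix up to floating-point error
```

A floating-point check like this is only evidence for one matrix, not a proof; the proof below handles the general case.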
Explanation
Readers of about the same age as the author probably learned about matrices in high school, and what they saw then was precisely this Cayley-Hamilton theorem. (Apparently it was not officially part of the curriculum, much like L'Hôpital's rule. [2])
For a $2 \times 2$ matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, the following holds: $$ A^{2} -(a + d)A + (ad - bc)I = O $$
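As a quick sanity check of the $2 \times 2$ formula, take the sample matrix below (chosen only for illustration), for which $a + d = 5$ and $ad - bc = -2$:
$$ A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \qquad A^{2} = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix} $$
$$ A^{2} - 5A - 2I = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix} - \begin{bmatrix} 5 & 10 \\ 15 & 20 \end{bmatrix} - \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = O $$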
Proof
What we need to show is that $f(T)(\mathbf{v}) = \mathbf{0}$ for every $\mathbf{v} \in V$. Since $f(T)$ is a linear transformation, the case $\mathbf{v} = \mathbf{0}$ is trivial, so assume $\mathbf{v} \ne \mathbf{0}$.
Let $W$ be the $T$-cyclic subspace generated by $\mathbf{v}$, and let $k = \dim(W)$.
Lemma 1. $\left\{ \mathbf{v}, T\mathbf{v}, \dots, T^{k-1}\mathbf{v} \right\}$ is a basis of $W$.
Lemma 2. If $a_{0}\mathbf{v} + a_{1}T \mathbf{v} + \cdots + a_{k-1}T^{k-1} \mathbf{v} + T^{k}\mathbf{v} = \mathbf{0}$, then the characteristic polynomial of the restriction map $T|_{W}$ is $$ (-1)^{k}\left( a_{0} + a_{1}t + \cdots +a_{k-1}t^{k-1} + t^{k} \right) $$
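Since $W$ is $T$-invariant, $T^{k}\mathbf{v} \in W$. Hence, by Lemma 1, $T^{k}\mathbf{v}$ is a linear combination of the basis vectors, so there exist constants $a_{0}, a_{1}, \dots, a_{k-1}$ that satisfy the following: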
$$ \begin{equation} a_{0}\mathbf{v} + a_{1}T\mathbf{v} + \cdots + a_{k-1}T^{k-1}\mathbf{v} + T^{k}\mathbf{v} = \mathbf{0} \end{equation} $$
Then, by Lemma 2, the characteristic polynomial of the restriction map $T|_{W}$ is as follows:
$$ \begin{equation} g(t) = (-1)^{k}\left( a_{0} + a_{1}t + \cdots +a_{k-1}t^{k-1} + t^{k} \right) \end{equation} $$
Hence, by $(1)$ and $(2)$, we obtain the following:
$$ g(T)(\mathbf{v}) = (-1)^{k}\left( a_{0}I + a_{1}T + \cdots +a_{k-1}T^{k-1} + T^{k} \right)(\mathbf{v}) = \mathbf{0} $$
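The construction so far (build $\mathbf{v}, T\mathbf{v}, T^{2}\mathbf{v}, \dots$ until the vectors become dependent, then read off the coefficients in relation $(1)$) can be sketched numerically. This is a minimal sketch assuming NumPy; the function name `cyclic_relation`, the tolerance, and the sample matrix and vector are hypothetical choices made for illustration, and rank decisions in floating point are only approximate.

```python
import numpy as np

def cyclic_relation(A, v, tol=1e-9):
    """Return (k, a) with a[0]*v + ... + a[k-1]*A^(k-1) v + A^k v = 0,
    where k is the dimension of the A-cyclic subspace generated by v."""
    n = A.shape[0]
    basis = [v]
    # Extend v, Av, A^2 v, ... while the vectors stay linearly independent.
    while len(basis) < n:
        candidate = A @ basis[-1]
        K = np.column_stack(basis + [candidate])
        if np.linalg.matrix_rank(K, tol=tol) < K.shape[1]:
            break
        basis.append(candidate)
    k = len(basis)
    K = np.column_stack(basis)
    target = A @ basis[-1]                      # this is A^k v
    # Write A^k v = c_0 v + ... + c_{k-1} A^(k-1) v, then a_i = -c_i.
    c, *_ = np.linalg.lstsq(K, target, rcond=None)
    return k, -c

# Hypothetical example: v does not generate the whole space, so k < 3.
A = np.diag([2.0, 2.0, 3.0])
v = np.array([1.0, 0.0, 1.0])
k, a = cyclic_relation(A, v)

# Left-hand side of relation (1), evaluated for this example.
powers = [np.linalg.matrix_power(A, i) @ v for i in range(k + 1)]
residual = sum(c * p for c, p in zip(a, powers[:k])) + powers[k]
print(k, a, np.allclose(residual, 0))
```

In this example the relation is $6\mathbf{v} - 5A\mathbf{v} + A^{2}\mathbf{v} = \mathbf{0}$, so $k = 2$ and the monic polynomial is $t^{2} - 5t + 6$.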
Lemma 3. If $W$ is a $T$-invariant subspace of $V$, then the characteristic polynomial of $T|_{W}$ divides the characteristic polynomial of $T$.
By the above Lemma, $g(t)$ divides the characteristic polynomial $f(t)$ of $T$. Therefore, for some polynomial $q(t)$, $f(t) = q(t)g(t)$ holds. Thus,
$$ f(T)(\mathbf{v}) = q(T)g(T)(\mathbf{v}) = q(T)\left( g(T)(\mathbf{v}) \right) = q(T)(\mathbf{0}) = \mathbf{0} $$
■
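The divisibility step of the proof can also be illustrated on a concrete case. The sketch below assumes NumPy and uses a hypothetical example: $A = \operatorname{diag}(2, 2, 3)$ and $\mathbf{v} = (1, 0, 1)$, for which $A^{2}\mathbf{v} = 5A\mathbf{v} - 6\mathbf{v}$, so the monic polynomial attached to the cyclic subspace is $t^{2} - 5t + 6$. Both polynomials are taken in monic form here; the sign factors $(-1)^{n}$ and $(-1)^{k}$ used in the text do not affect divisibility.

```python
import numpy as np

A = np.diag([2.0, 2.0, 3.0])

# Monic characteristic polynomial of A: det(tI - A) = t^3 - 7t^2 + 16t - 12.
f = np.poly(A)

# Monic polynomial of the restriction to the cyclic subspace generated by
# v = (1, 0, 1): t^2 - 5t + 6 (derived by hand in the lead-in above).
g = np.array([1.0, -5.0, 6.0])

# Polynomial long division f = q * g + r; divisibility means r = 0.
q, r = np.polydiv(f, g)
print(q, r)
```

Here the quotient is $q(t) = t - 2$ with zero remainder, matching $f(t) = q(t)g(t)$ in the proof.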
[1] Stephen H. Friedberg, Linear Algebra (4th Edition, 2002), p. 317
[2] https://namu.wiki/w/%EC%BC%80%EC%9D%BC%EB%A6%AC-%ED%95%B4%EB%B0%80%ED%84%B4%20%EC%A0%95%EB%A6%AC#s-2