Generalized Eigenvector
Introduction[^1]
The matrix eigenvalue problem is, for a given matrix $A \in M_{n \times n}(\mathbb{C})$, to find a vector $\mathbf{0} \ne \mathbf{v} \in \mathbb{C}^{n}$ and a scalar $\lambda \in \mathbb{C}$ that satisfy the following.
$$ A \mathbf{v} = \lambda \mathbf{v} \iff (A - \lambda I) \mathbf{v} = \mathbf{0} $$
Here $\lambda$ is called an eigenvalue of $A$, and $\mathbf{v}$ is called an eigenvector of $A$. If $A$ has $n$ linearly independent eigenvectors, it is diagonalizable: it can be factored as $A = PDP^{-1}$ with $D$ diagonal. However, an arbitrary matrix does not always have $n$ linearly independent eigenvectors. In such cases one seeks an appropriate alternative and attempts another form of decomposition; when $A$ is not square, the singular value decomposition is one example. Here we consider the case where $A$ is square but has fewer than $n$ linearly independent eigenvectors.
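As a quick numerical check, the number of linearly independent eigenvectors for an eigenvalue $\lambda$ (its geometric multiplicity, $\dim \ker (A - \lambda I)$) can be computed as $n - \operatorname{rank}(A - \lambda I)$. A minimal sketch with NumPy; the helper name and the example matrix are illustrative, not from the text above:

```python
import numpy as np

def num_independent_eigenvectors(A: np.ndarray, eigvals) -> int:
    """Sum of geometric multiplicities: dim ker(A - lam*I) = n - rank(A - lam*I)."""
    n = A.shape[0]
    return int(sum(n - np.linalg.matrix_rank(A - lam * np.eye(n)) for lam in eigvals))

# A defective example: a 2x2 Jordan block. Its only eigenvalue lam = 1 has
# algebraic multiplicity 2 but just one independent eigenvector.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
print(num_independent_eigenvectors(A, eigvals=[1.0]))  # 1 < n = 2: not diagonalizable
```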
As the above equation shows, an eigenvector is also an element of the kernel $\ker (A - \lambda I)$. That $A$ is not diagonalizable means the dimension of $\ker (A - \lambda I)$ is not large enough (i.e., smaller than $n$). However, the kernel grows monotonically when the same transformation is applied repeatedly. (See monotone growth.)
$$ \ker(A - \lambda I) \subset \ker(A - \lambda I)^{2} \subset \cdots \subset \ker(A - \lambda I)^{k} \subset \cdots $$
Thus, for each natural number $k$, the set of vectors $\mathbf{v}$ satisfying the equation below contains all of the ordinary eigenvectors, making it a natural generalization.
$$ (A - \lambda I)^{k} \mathbf{v} = \mathbf{0} $$
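The growth of these kernels can be observed directly by computing $\dim \ker \big( (A - \lambda I)^{k} \big) = n - \operatorname{rank}\big( (A - \lambda I)^{k} \big)$ for increasing $k$. A sketch with NumPy, using a $3 \times 3$ Jordan block as an illustrative example (the matrix is an assumption for demonstration):

```python
import numpy as np

# Illustrative example: a 3x3 Jordan block with eigenvalue lam = 2.
lam = 2.0
N = np.array([[lam, 1.0, 0.0],
              [0.0, lam, 1.0],
              [0.0, 0.0, lam]])
n = N.shape[0]

# dim ker((N - lam I)^k) = n - rank((N - lam I)^k); the chain stabilizes by k = n.
dims = [int(n - np.linalg.matrix_rank(np.linalg.matrix_power(N - lam * np.eye(n), k)))
        for k in range(1, n + 1)]
print(dims)  # [1, 2, 3]: each power admits one more generalized eigenvector
```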
Definition
Let a matrix $A \in M_{n \times n}(\mathbb{C})$ be given. For an arbitrary scalar $\lambda \in \mathbb{C}$ and natural number $k \in \mathbb{N}$, a nonzero vector $\mathbf{v} \in \mathbb{C}^{n}$ satisfying the equation below is called a generalized eigenvector of $A$.
$$ (A - \lambda I)^{k} \mathbf{v} = \mathbf{0} \tag{1} $$
Explanation
Since the notion only makes sense for vectors $\mathbf{v} \ne \mathbf{0}$, the matrix $(A - \lambda I)$ must be singular: if it were invertible, so would $(A - \lambda I)^{k}$ be, and $\mathbf{v} = \mathbf{0}$ would be the only solution of $(1)$. Therefore the $\lambda$ in $(1)$ is the same as an ordinary eigenvalue; values of $\lambda$ satisfying $(1)$ are not given a separate name such as "generalized eigenvalue." However, as one can infer from the definition, generalized eigenvectors that are not ordinary eigenvectors do exist.
For example, consider the $2 \times 2$ matrix $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$. Since the characteristic polynomial is $(\lambda - 1)^{2}$, $A$ has the single eigenvalue $\lambda = 1$ with algebraic multiplicity $2$. As the computation below shows, there is only one linearly independent eigenvector, so $A$ is not diagonalizable. Call it $\mathbf{x}_{1} = \begin{bmatrix} x_{1} & y_{1} \end{bmatrix}^{\mathsf{T}}$.
$$ \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_{1} \\ y_{1} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \implies y_{1} = 0 \implies \mathbf{x}_{1} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} $$
Now consider the case $k = 2$. Solving the equation below yields a vector $\mathbf{x}_{2}$ that is linearly independent of $\mathbf{x}_{1}$. Write it as $\mathbf{x}_{2} = \begin{bmatrix} x_{2} & y_{2} \end{bmatrix}^{\mathsf{T}}$.
$$ (A - \lambda I)^{2} \mathbf{x} = \mathbf{0} \implies \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}\left( \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_{2} \\ y_{2} \end{bmatrix} \right) = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $$
The vector inside the parentheses must lie in $\ker (A - \lambda I)$, which is spanned by the eigenvector $\mathbf{x}_{1}$, so we may set it equal to $\mathbf{x}_{1}$ and obtain the following.
$$ \begin{align*} && \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_{2} \\ y_{2} \end{bmatrix} &= \begin{bmatrix} 1 \\ 0 \end{bmatrix} \\ \implies&& \begin{bmatrix} y_{2} \\ 0 \end{bmatrix} &= \begin{bmatrix} 1 \\ 0 \end{bmatrix} \\ \implies&& \mathbf{x}_{2} &= \begin{bmatrix} c \\ 1 \end{bmatrix} \end{align*} $$
Here $c$ is an arbitrary constant. Looking back at the equations for finding the eigenvector $\mathbf{x}_{1}$ and the generalized eigenvector $\mathbf{x}_{2}$, we see that $\mathbf{x}_{2}$ is a vector that becomes $\mathbf{x}_{1}$ after applying $(A - \lambda I)$.
$$ \begin{align*} && (A - \lambda I) \mathbf{x}_{1} &= \mathbf{0} \\ && (A - \lambda I)^{2} \mathbf{x}_{2} = (A - \lambda I) \left[ (A - \lambda I) \mathbf{x}_{2} \right] &= \mathbf{0} \\ \implies && (A - \lambda I) \mathbf{x}_{2} &= \mathbf{x}_{1} \end{align*} $$
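The computation above can be verified numerically. A minimal check of the worked example, choosing the arbitrary constant as $c = 0$:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
I = np.eye(2)
x1 = np.array([1.0, 0.0])   # ordinary eigenvector for lam = 1
x2 = np.array([0.0, 1.0])   # generalized eigenvector, taking c = 0

# (A - I) x1 = 0, (A - I) x2 = x1, and hence (A - I)^2 x2 = 0
assert np.allclose((A - I) @ x1, 0)
assert np.allclose((A - I) @ x2, x1)
assert np.allclose(np.linalg.matrix_power(A - I, 2) @ x2, 0)
print("all checks pass")
```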
Thus, there are exactly $2$ linearly independent generalized eigenvectors corresponding to the eigenvalue $\lambda = 1$ (counting the ordinary eigenvector). It is no coincidence that this equals the algebraic multiplicity of $\lambda = 1$. The geometric multiplicity of an eigenvalue is always less than or equal to its algebraic multiplicity: the geometric multiplicity is the number of linearly independent ordinary eigenvectors corresponding to the eigenvalue, while the algebraic multiplicity is the number of linearly independent generalized eigenvectors corresponding to it.
Generalized eigenspace
The eigenspace $E_{\lambda}$ for the eigenvalue $\lambda$ is the space spanned by the eigenvectors corresponding to $\lambda$. As an extension, the generalized eigenspace $W_{\lambda}$ is defined as the space spanned by the generalized eigenvectors corresponding to $\lambda$ as follows. For a matrix $A \in M_{n \times n}(\mathbb{C})$ and its eigenvalue $\lambda$,
$$ W_{\lambda} = \left\{ \mathbf{v} \in \mathbb{C}^{n} : (A - \lambda I)^{k} \mathbf{v} = \mathbf{0} \text{ for some } k \in \mathbb{N} \right\} $$
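Since the kernel chain stabilizes by $k = n$, the generalized eigenspace can be computed as $W_{\lambda} = \ker\big( (A - \lambda I)^{n} \big)$. A sketch, assuming NumPy only, that extracts a null-space basis from the SVD (the function name is illustrative):

```python
import numpy as np

def generalized_eigenspace(A: np.ndarray, lam: float) -> np.ndarray:
    """Orthonormal basis of W_lam = ker((A - lam I)^n), via the SVD.

    The kernel chain stabilizes by k = n, so the n-th power suffices.
    """
    n = A.shape[0]
    M = np.linalg.matrix_power(A - lam * np.eye(n), n)
    _, s, Vh = np.linalg.svd(M)
    tol = max(M.shape) * np.finfo(float).eps * (s[0] if s.size else 0.0)
    rank = int(np.sum(s > tol))
    return Vh[rank:].T  # rows of Vh past the rank span the null space

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
W = generalized_eigenspace(A, lam=1.0)
print(W.shape[1])  # 2: dim W_1 equals the algebraic multiplicity of lam = 1
```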
[^1]: Brian C. Hall, *Lie Groups, Lie Algebras, and Representations* (2nd ed.), pp. 411–413.
