正定値行列の固有値と二次形式の最大値 📂線形代数

正定値行列の固有値と二次形式の最大値

定理

正定値行列 $A \in \mathbb{R}^{p \times p}$ の固有対 $\left\{ \left( \lambda_{k} , e_{k} \right) \right\}_{k=1}^{n}$ が $\lambda_{1} \ge \cdots \ge \lambda_{n} \ge 0$ の順に並んでいるとする。単位球^{unit Sphere} 上での二次形式 $\mathbf{x}^{T} A \mathbf{x}$ の最大値と最小値は次の通りである。 $\begin{align*} \max_{\left\| \mathbf{x} \right\| = 1} \mathbf{x}^{T} A \mathbf{x} =& \lambda_{1} & \text{, attained when } \mathbf{x} = e_{1} \\ \min_{\left\| \mathbf{x} \right\| = 1} \mathbf{x}^{T} A \mathbf{x} =& \lambda_{p} & \text{, attained when } \mathbf{x} = e_{p} \end{align*}$ 一方で $k = 2, \cdots , p-1$ について次が成り立つ。 $\max_{\substack{\left\| \mathbf{x} \right\| = 1 \\ \mathbf{x} \perp e_{1} , \cdots , e_{k-1} }} \mathbf{x}^{T} A \mathbf{x} = \lambda_{k} \quad \text{, attained when } \mathbf{x} = e_{k}$

$X^{T}$ は行列 $X$ の転置行列だ。
$\left\| \cdot \right\|$ はユークリッドノルムである。
$\perp$ はベクトルの直交を意味する。

説明

与えられた定理は、 $\left\| \mathbf{x} \right\| = 1$ を満たすベクトル、つまり単位球上の $\mathbf{x}$ に対する二次形式が固有値とどのような関係にあるのかを説明している。特に、最後の公式は、解空間で $(k-1)$ 番目までの固有ベクトルを除外した場合、二次形式の最大値が具体的に $\lambda_{k}$ と一致することを示している。

証明 ¹

特定の $\mathbf{x} \ne 0$ に対して、 $\mathbf{x}_{1} = \mathbf{x} / \sqrt{ \mathbf{x}^{T} \mathbf{x} }$ とおくと $\left\| \mathbf{x}_{1} \right\| = 1$ となり、 ${{ \mathbf{x}^{T} A \mathbf{x} } \over { \mathbf{x}^{T} \mathbf{x} }} = \mathbf{x}_{1}^{T} A \mathbf{x}_{1}$ したがって、証明では $\left\| \mathbf{x} \right\| = 1$ という条件をそのまま使うのではなく、 ${{ \mathbf{x}^{T} B \mathbf{x} } \over { \mathbf{x}^{T} \mathbf{x} }}$ を直接扱うことにしよう。

パート1. $\max \mathbf{x}^{T} A \mathbf{x} = \lambda_{1}$

正定値行列の平方根行列: 正定値行列 $A$ の固有対 $\left\{ \left( \lambda_{k} , e_{k} \right) \right\}_{k=1}^{n}$ が $\lambda_{1} > \cdots > \lambda_{n} > 0$ の順で並んでいるとする。直行行列 $P = \begin{bmatrix} e_{1} & \cdots & e_{n} \end{bmatrix} \in \mathbb{R}^{n \times n}$ と対角行列 $\Lambda = \diag \left( \lambda_{1} , \cdots , \lambda_{n} \right)$ に対して、 $A$ の平方根行列 $\sqrt{A}$ は次の通りである。 $\sqrt{A} = P \sqrt{\Lambda} P^{T} = \sum_{k=1}^{n} \sqrt{\lambda_{k}} e_{k} e_{k}^{T}$

上記の定義で言及されたように、直行行列 $P$ と対角行列 $\Lambda$ が定義されているとすると、ベクトル $\mathbf{y} \in \mathbb{R}^{p}$ を $\mathbf{y} = \left( y_{1} , \cdots , y_{p} \right) := P^{T} \mathbf{x}$ のように設定する。これにより、 $\mathbf{x} \ne \mathbf{0} \implies \mathbf{y} \ne \mathbf{0}$ となり、したがって単位行列 $I \in \mathbb{R}^{p \times p}$ に対して次が成り立つ。 $\begin{align*} & {{ \mathbf{x}^{T} A \mathbf{x} } \over { \mathbf{x}^{T} \mathbf{x} }} \\ =& {{ \mathbf{x}^{T} \sqrt{A} \sqrt{A} \mathbf{x} } \over { \mathbf{x}^{T} I \mathbf{x} }} \\ =& {{ \mathbf{x}^{T} P \sqrt{\Lambda} P^{T} P \sqrt{\Lambda} P^{T} \mathbf{x} } \over { \mathbf{x}^{T} P P^{T} \mathbf{x} }} \\ =& {{ \mathbf{y}^{T} \sqrt{\Lambda} I \sqrt{\Lambda} \mathbf{y} } \over { \mathbf{y}^{T} \mathbf{y} }} \\ =& {{ \mathbf{y}^{T} \Lambda \mathbf{y} } \over { \mathbf{y}^{T} \mathbf{y} }} \\ =& {{ \sum_{i=1}^{p} \lambda_{i} y_{i}^{2} } \over { \sum_{i=1}^{p} y_{i}^{2} }} \\ \le& \lambda_{1} {{ \sum_{i=1}^{p} y_{i}^{2} } \over { \sum_{i=1}^{p} y_{i}^{2} }} & \cdots \star \\ =& \lambda_{1} \end{align*}$ $P = \begin{bmatrix} e_{1} & \cdots & e_{n} \end{bmatrix}$ であり、異なる $i \ne j$ に対しては $e_{i}^{T} e_{j} = 0$ 、インデックスが同じ場合は $e_{i}^{T} e_{i} = 1$ なので、もし $\mathbf{x} = e_{1}$ を選ぶと、 $\mathbf{y} = P^{T} e_{1} = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = e_{1}$ となり、したがって、 $\begin{align*} & {{ \mathbf{x}^{T} A \mathbf{x} } \over { \mathbf{x}^{T} \mathbf{x} }} \\ =& {{ \mathbf{y}^{T} \Lambda \mathbf{y} } \over { \mathbf{y}^{T} \mathbf{y} }} \\ =& {{ e_{1}^{T} \Lambda e_{1} } \over { e_{1}^{T} e_{1} }} \\ =& \lambda_{1} \end{align*}$ となる。したがって、 $\max_{\left\| \mathbf{x} \right\| = 1} \mathbf{x}^{T} A \mathbf{x} = \lambda_{1} \quad \text{, attained when } \mathbf{x} = e_{1}$ 同様の方法で最小値も次のように求めることができる。 $\star$ と示された展開で、 $\lambda_{1}$ が $\lambda_{p}$ になり、不等式の向きが逆になる。 $\min_{\left\| \mathbf{x} \right\| = 1} \mathbf{x}^{T} A \mathbf{x} = \lambda_{p} \quad \text{, attained when } \mathbf{x} = e_{p}$

パート2. $\max_{\mathbf{x} \perp e_{1} , \cdots , e_{k-1} } \mathbf{x}^{T} A \mathbf{x} = \lambda_{k}$

$\mathbf{y}$ が $\mathbf{y} = P^{T} \mathbf{x}$ として定義され、 $P$ が直行行列であるため、 $P \mathbf{y} = \mathbf{x}$ となる。この $\mathbf{x}$ は固有ベクトル $e_{1} , \cdots , e_{p}$ によって次のように表すことができる。 $\begin{align*} \mathbf{x} =& P \mathbf{y} \\ =& y_{1} e_{1} + y_{2} e_{2} + \cdots + y_{p} e_{p} \end{align*}$ $\mathbf{x} \perp e_{1} , \cdots , e_{k-1}$ とすると、 $i < k$ のすべての $e_{i}$ に対する内積は、 $\begin{align*} 0 =& e_{i}^{T} \mathbf{x} \\ =& y_{1} e_{i}^{T} e_{1} + y_{2} e_{i}^{T} e_{2} + \cdots + y_{p} e_{i}^{T} e_{p} \\ =& y_{i} e_{i}^{T} e_{i} \\ =& y_{i} \end{align*}$ となる。これにより、パート1で出てきた $\star$ を再度見ると、 ${{ \mathbf{x}^{T} A \mathbf{x} } \over { \mathbf{x}^{T} \mathbf{x} }} = {{ \sum_{i=k}^{p} \lambda_{i} y_{i}^{2} } \over { \sum_{i=k}^{p} y_{i}^{2} }} \le \lambda_{k} {{ \sum_{i=k}^{p} y_{i}^{2} } \over { \sum_{i=k}^{p} y_{i}^{2} }}$ $y_{k} = 1$ と $y_{k+1} = \cdots = y_{p} = 0$ を設定した場合、つまり $\mathbf{x} = \sum_{i=1}^{p} y_{i} e_{i} = e_{k}$ の時、最大値が得られる。

■

Johnson. (2013). Applied Multivariate Statistical Analysis(6th Edition): p118~119. ↩︎

正定値行列の固有値と二次形式の最大値

定理

説明

証明 1

証明 ¹