Matrix Algebra: Projections

Definition

A projection $P \in \mathbb{C}^{m \times m}$ is called an orthogonal projection if it satisfies $\mathcal{C} (P) ^{\perp} = \mathcal{N} (P)$.

Explanation

By the projection property $\mathbb{C}^{m} = \mathcal{C} (P) \oplus \mathcal{N} (P)$, the matrix $P$ splits $\mathbb{C}^{m}$ into exactly two subspaces, $\mathcal{C} (P)$ and $\mathcal{N} (P)$.

That this decomposition satisfies $\mathcal{N} (P) = \mathcal{C} (P) ^{\perp}$ means that the null space $\mathcal{N} (P)$ of the linear transformation $P$ is the orthogonal complement of the column space $\mathcal{C} (P)$. The decomposition is therefore an orthogonal one, which justifies the name orthogonal projection.

On the other hand, a projection $P$ is an orthogonal projection if and only if $P$ is a Hermitian matrix.

The proof is somewhat long and messy, so when studying it is reasonable to accept the result as a fact first.
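The difference between an orthogonal and a non-orthogonal (oblique) projection can be seen numerically. The following is a minimal sketch using NumPy; the two matrices `P_orth` and `P_obl` and the helper names are illustrative choices, not part of the original text. It relies on the fact that for a projection $\mathcal{N} (P) = \mathcal{C} (I - P)$, so $\mathcal{C} (P) \perp \mathcal{N} (P)$ can be tested through $P^{\ast} (I - P) = O$.

```python
import numpy as np

# Orthogonal projection onto span{(1, 1)}: P = a a^* / (a^* a).
a = np.array([[1.0], [1.0]])
P_orth = (a @ a.conj().T) / (a.conj().T @ a)

# Oblique projection onto span{(1, 0)} along span{(1, -1)}.
P_obl = np.array([[1.0, 1.0],
                  [0.0, 0.0]])


def is_projection(P):
    """Idempotency check: P @ P equals P."""
    return np.allclose(P @ P, P)


def is_hermitian(P):
    """Check that P equals its conjugate transpose."""
    return np.allclose(P, P.conj().T)


def range_perp_null(P):
    """Check C(P) ⟂ C(I - P); for a projection, C(I - P) = N(P)."""
    I = np.eye(P.shape[0])
    return np.allclose(P.conj().T @ (I - P), 0)


for name, P in [("P_orth", P_orth), ("P_obl", P_obl)]:
    print(name, is_projection(P), is_hermitian(P), range_perp_null(P))
```

Both matrices are idempotent, but only the Hermitian one has a column space orthogonal to its null space, which is what the theorem below asserts in general.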

Theorem

For a projection $P \in \mathbb{C}^{m \times m}$,

$$ \mathcal{C} (P) ^{\perp} = \mathcal{N} (P) \iff P = P^{\ast} $$

Proof

$(\Longrightarrow)$

Let $\dim \mathcal{C} (P) = r$ and choose an orthonormal basis $\left\{ \mathbf{q}_{1} , \cdots , \mathbf{q}_{m} \right\}$ of $\mathbb{C}^{m}$ such that $\left\{ \mathbf{q}_{1} , \cdots , \mathbf{q}_{r} \right\}$ is an orthonormal basis of $\mathcal{C} (P)$. Since each $\mathbf{q}_{i}$ with $i \le r$ lies in $\mathcal{C} (P)$, there exists some $\mathbf{v}$ that satisfies $\mathbf{q}_{i} = P \mathbf{v}$, and multiplying this equation on the left by $P$ gives

$$ P \mathbf{q}_{i} = PP \mathbf{v} = P \mathbf{v} = \mathbf{q}_{i} $$

Meanwhile, since $\mathbb{C}^{m} = \mathcal{C} (P) \oplus \mathcal{N} (P)$ and $\mathcal{N} (P) = \mathcal{C} (P) ^{\perp}$ by assumption, $\left\{ \mathbf{q}_{r +1} , \cdots , \mathbf{q}_{m} \right\}$ is an orthonormal basis of $\mathcal{N} (P)$. When the matrix $Q := \begin{bmatrix} \mathbf{q}_{1} & \cdots & \mathbf{q}_{r} & \mathbf{q}_{r+1} & \cdots & \mathbf{q}_{m} \end{bmatrix}$ is constructed from the vectors of $\left\{ \mathbf{q}_{1} , \cdots , \mathbf{q}_{m} \right\}$, $Q$ is a unitary matrix, and computing $PQ$ gives

$$ \begin{align*} PQ =& P\begin{bmatrix} \mathbf{q}_{1} & \cdots & \mathbf{q}_{r} & \mathbf{q}_{r+1} & \cdots & \mathbf{q}_{m} \end{bmatrix} \\ =& \begin{bmatrix} P \mathbf{q}_{1} & \cdots & P \mathbf{q}_{r} & P \mathbf{q}_{r+1} & \cdots & P \mathbf{q}_{m} \end{bmatrix} \\ =& \begin{bmatrix} \mathbf{q}_{1} & \cdots & \mathbf{q}_{r} & \mathbb{0} & \cdots & \mathbb{0} \end{bmatrix} \end{align*} $$

For convenience, let $\widehat{Q} := \begin{bmatrix} \mathbf{q}_{1} & \cdots & \mathbf{q}_{r} \end{bmatrix}$, so that the equation can be written as $PQ = \begin{bmatrix} \widehat{Q} & O \end{bmatrix}$. Multiplying this equation on the left by $Q^{\ast}$ gives

$$ \begin{align*} Q^{\ast} P Q =& \begin{bmatrix} \widehat{Q}^{\ast} \\ \mathbf{q}_{r+1}^{\ast} \\ \vdots \\ \mathbf{q}_{m}^{\ast} \end{bmatrix} \begin{bmatrix} \widehat{Q} & O \end{bmatrix} \\ =& \begin{bmatrix} \widehat{Q}^{\ast} \widehat{Q} & O \\ O & O \end{bmatrix} \\ =& \begin{bmatrix} I_{r} & O \\ O & O \end{bmatrix} \end{align*} $$

Solving for $P$, using that $Q$ is unitary, gives

$$ P = Q \begin{bmatrix} I_{r} & O \\ O & O \end{bmatrix} Q^{\ast} $$

and

$$ P^{\ast} = \left( Q \begin{bmatrix} I_{r} & O \\ O & O \end{bmatrix} Q^{\ast} \right)^{\ast} = Q \begin{bmatrix} I_{r} & O \\ O & O \end{bmatrix} Q^{\ast} = P $$

Therefore, $P$ is a Hermitian matrix.
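The construction in this direction of the proof can be replayed numerically. The sketch below assumes NumPy; the sizes `m = 5`, `r = 2` and the use of a QR factorization of a random complex matrix to obtain a unitary $Q$ are illustrative assumptions, not part of the proof itself.

```python
import numpy as np

rng = np.random.default_rng(0)
m, r = 5, 2

# A random complex matrix is almost surely full rank, so the Q factor
# of its QR factorization is an m x m unitary matrix.
A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
Q, _ = np.linalg.qr(A)

# Block matrix [[I_r, O], [O, O]].
D = np.zeros((m, m))
D[:r, :r] = np.eye(r)

# P = Q [[I_r, O], [O, O]] Q^*, as in the proof.
P = Q @ D @ Q.conj().T
Q_hat = Q[:, :r]

print(np.allclose(P @ P, P))                    # P is a projection
print(np.allclose(P, P.conj().T))               # P is Hermitian
print(np.allclose(P @ Q_hat, Q_hat))            # P q_i = q_i for i <= r
print(np.allclose(P @ Q[:, r:], 0))             # P q_j = 0 for j > r
print(np.allclose(Q.conj().T @ P @ Q, D))       # Q^* P Q = [[I_r, O], [O, O]]
```

Note that $P$ equals $\widehat{Q} \widehat{Q}^{\ast}$ here, since the zero block removes the contribution of the last $m - r$ columns of $Q$.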

$(\Longleftarrow)$

From $\mathbb{C}^{m} = \mathcal{C} (P) \oplus \mathcal{N} (P)$ it follows that $\mathcal{N} (P) = \mathcal{C} (I-P)$. Computing the inner product of the two vectors $P \mathbf{x} \in \mathcal{C} (P)$ and $(I - P) \mathbf{y} \in \mathcal{C} (I - P)$ gives

$$ \begin{align*} ( P \mathbf{x} )^{\ast} (I - P) \mathbf{y} =& \mathbf{x}^{\ast} P^{\ast} ( I - P ) \mathbf{y} \\ =& \mathbf{x}^{\ast} P ( I - P ) \mathbf{y} \\ =& \mathbf{x}^{\ast} ( P - P^2 ) \mathbf{y} \\ =& \mathbf{x}^{\ast} ( P - P ) \mathbf{y} \\ =& 0 \end{align*} $$

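Thus $\mathcal{C} (P)$ and $\mathcal{C} (I-P) = \mathcal{N} (P)$ are orthogonal, and since their dimensions add up to $m$,

$$ \mathcal{C} (P) = \mathcal{C} (I-P)^{\perp} = \mathcal{N} (P)^{\perp} $$

which is equivalent to $\mathcal{C} (P) ^{\perp} = \mathcal{N} (P)$.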
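The inner product computation above can also be checked numerically. This is a small sketch assuming NumPy; the Hermitian projection is built with the formula $P = A ( A^{\ast} A )^{-1} A^{\ast}$ for a random full-column-rank $A$, which is an illustrative choice and not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 4

# Hermitian projection onto the column space of a random complex m x 2
# matrix A (assumed to have full column rank): P = A (A^* A)^{-1} A^*.
A = rng.standard_normal((m, 2)) + 1j * rng.standard_normal((m, 2))
P = A @ np.linalg.solve(A.conj().T @ A, A.conj().T)

x = rng.standard_normal(m) + 1j * rng.standard_normal(m)
y = rng.standard_normal(m) + 1j * rng.standard_normal(m)

# np.vdot conjugates its first argument, so this is (Px)^* (I - P) y.
inner = np.vdot(P @ x, (np.eye(m) - P) @ y)
print(abs(inner))  # zero up to floating-point rounding
```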