

Inverse and Square Root of Positive Definite Matrices

Formulas [1]

Let the eigenpairs $\left\{ \left( \lambda_{k} , e_{k} \right) \right\}_{k=1}^{n}$ of a positive definite matrix $A$ be arranged in the order $\lambda_{1} > \cdots > \lambda_{n} > 0$. For the orthogonal matrix $P = \begin{bmatrix} e_{1} & \cdots & e_{n} \end{bmatrix} \in \mathbb{R}^{n \times n}$ and the diagonal matrix $\Lambda = \operatorname{diag} \left( \lambda_{1} , \cdots , \lambda_{n} \right)$, the inverse matrix $A^{-1}$ and the square root matrix $\sqrt{A}$ of $A$ are as follows:
$$
\begin{align*}
A^{-1} =& P \Lambda^{-1} P^{T} = \sum_{k=1}^{n} \frac{1}{\lambda_{k}} e_{k} e_{k}^{T}
\\ \sqrt{A} =& P \sqrt{\Lambda} P^{T} = \sum_{k=1}^{n} \sqrt{\lambda_{k}} e_{k} e_{k}^{T}
\end{align*}
$$
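Numerically, both formulas fall out of a single eigendecomposition. Below is a minimal NumPy sketch, assuming a small hypothetical symmetric positive definite test matrix; note that `np.linalg.eigh` returns eigenvalues in ascending rather than descending order, which does not affect the sums.

```python
import numpy as np

# Hypothetical small symmetric positive definite test matrix (illustration only)
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])

# eigh returns eigenvalues in ascending order and orthonormal eigenvectors
# as the columns of P; the ordering does not matter for the sums below
lam, P = np.linalg.eigh(A)

# A^{-1} = P diag(1/lambda_k) P^T
A_inv = P @ np.diag(1.0 / lam) @ P.T

# sqrt(A) = P diag(sqrt(lambda_k)) P^T
A_sqrt = P @ np.diag(np.sqrt(lam)) @ P.T

# Check the defining properties
assert np.allclose(A_inv @ A, np.eye(2))  # A^{-1} A = I
assert np.allclose(A_sqrt @ A_sqrt, A)    # sqrt(A) sqrt(A) = A
```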


Derivation

Spectral Decomposition:

A matrix $A$ being Hermitian is equivalent to $A$ being unitarily diagonalizable:
$$
A = A^{\ast} \iff A = Q \Lambda Q^{\ast}
$$

In particular, in statistics, covariance matrices are often positive definite, and positive definite matrices are Hermitian. Beyond covariance matrices, for a design matrix $X$, the product $X^{T} X$ is a symmetric matrix; in particular, when $X \in \mathbb{R}^{m \times n}$ is real, $X^{T} X$ is again Hermitian. Under these conditions, by the spectral theorem, $A$ admits a unitary matrix $Q$ whose columns are the orthonormal eigenvectors $e_{1} , \cdots , e_{n}$, and can be restated as follows:
$$
\begin{align*}
A =& Q \Lambda Q^{\ast}
\\ =& Q \begin{bmatrix} \lambda_{1} & 0 & \cdots & 0 \\ 0 & \lambda_{2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_{n} \end{bmatrix} \begin{bmatrix} e_{1}^{\ast} \\ e_{2}^{\ast} \\ \vdots \\ e_{n}^{\ast} \end{bmatrix}
\\ =& \begin{bmatrix} e_{1} & e_{2} & \cdots & e_{n} \end{bmatrix} \begin{bmatrix} \lambda_{1} e_{1}^{\ast} \\ \lambda_{2} e_{2}^{\ast} \\ \vdots \\ \lambda_{n} e_{n}^{\ast} \end{bmatrix}
\\ =& \lambda_{1} e_{1} e_{1}^{\ast} + \lambda_{2} e_{2} e_{2}^{\ast} + \cdots + \lambda_{n} e_{n} e_{n}^{\ast}
\\ =& \sum_{k=1}^{n} \lambda_{k} e_{k} e_{k}^{\ast}
\end{align*}
$$
Since $A$ is real and symmetric, $Q$ may be taken real, so that $Q = P$ and $Q^{\ast} = P^{T}$.
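The rank-one expansion at the end can be checked directly; a minimal sketch, reusing the hypothetical test matrix from above:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])  # hypothetical positive definite example
lam, P = np.linalg.eigh(A)

# Rebuild A as the sum of rank-one terms lambda_k e_k e_k^T
A_rebuilt = sum(lam[k] * np.outer(P[:, k], P[:, k]) for k in range(len(lam)))
assert np.allclose(A_rebuilt, A)
```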

Since $\Lambda$ is a diagonal matrix, $\Lambda^{-1}$ and $\sqrt{\Lambda}$ are obtained entrywise and there is nothing special to derive. Since $P$ is an orthogonal matrix, $P^{-1} = P^{T}$, and therefore
$$
\begin{align*}
A^{-1} =& \left( P \Lambda P^{T} \right)^{-1}
\\ =& P^{-T} \Lambda^{-1} P^{-1}
\\ =& P \Lambda^{-1} P^{T}
\end{align*}
$$
and the following verification, using $P^{T} P = I$ for the identity matrix $I$, shows that $P \sqrt{\Lambda} P^{T}$ is indeed $\sqrt{A}$:
$$
\begin{align*}
& \left( P \sqrt{\Lambda} P^{T} \right) \left( P \sqrt{\Lambda} P^{T} \right)
\\ =& P \sqrt{\Lambda} P^{T} P \sqrt{\Lambda} P^{T}
\\ =& P \sqrt{\Lambda} I \sqrt{\Lambda} P^{T}
\\ =& P \sqrt{\Lambda} \sqrt{\Lambda} P^{T}
\\ =& P \Lambda P^{T}
\\ =& A
\end{align*}
$$
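As a cross-check, the matrices built from the eigendecomposition can be compared against general-purpose library routines; a sketch assuming SciPy is available (`scipy.linalg.sqrtm` computes a matrix square root by other means):

```python
import numpy as np
from scipy.linalg import sqrtm

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])  # hypothetical positive definite example

lam, P = np.linalg.eigh(A)
A_inv = P @ np.diag(1.0 / lam) @ P.T
A_sqrt = P @ np.diag(np.sqrt(lam)) @ P.T

# Agreement with general-purpose routines, up to floating-point error
assert np.allclose(A_inv, np.linalg.inv(A))
assert np.allclose(A_sqrt, sqrtm(A))
```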


  1. Johnson. (2013). Applied Multivariate Statistical Analysis (6th Edition): p104.