
Covariance Matrix

Definition 1

For a $p$-dimensional random vector $\mathbf{X} = \left( X_{1} , \cdots , X_{p} \right)$, the **covariance matrix** $\operatorname{Cov} \left( \mathbf{X} \right)$ is defined entrywise as follows.

$$ \left( \operatorname{Cov} \left( \mathbf{X} \right) \right)_{ij} := \operatorname{Cov} \left( X_{i} , X_{j} \right) $$


Explanation

Written out in full, the definition reads as follows.

$$ \operatorname{Cov} \left( \mathbf{X} \right) := \begin{pmatrix} \operatorname{Var} \left( X_{1} \right) & \operatorname{Cov} \left( X_{1} , X_{2} \right) & \cdots & \operatorname{Cov} \left( X_{1} , X_{p} \right) \\ \operatorname{Cov} \left( X_{2} , X_{1} \right) & \operatorname{Var} \left( X_{2} \right) & \cdots & \operatorname{Cov} \left( X_{2} , X_{p} \right) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{Cov} \left( X_{p} , X_{1} \right) & \operatorname{Cov} \left( X_{p} , X_{2} \right) & \cdots & \operatorname{Var} \left( X_{p} \right) \end{pmatrix} $$

All covariance matrices are positive semi-definite: for every vector $\mathbf{x} \in \mathbb{R}^{p}$, the following holds. This is immediate from $\mathbf{x}^{T} \operatorname{Cov} \left( \mathbf{X} \right) \mathbf{x} = \operatorname{Var} \left( \mathbf{x}^{T} \mathbf{X} \right)$, since a variance is never negative.

$$ 0 \le \mathbf{x}^{T} \operatorname{Cov} \left( \mathbf{X} \right) \mathbf{x} $$
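The definition and the positive semi-definiteness claim can both be checked numerically. The sketch below (plain Python, with made-up sample data) builds the empirical covariance matrix entry by entry from the definition $\left( \operatorname{Cov} \left( \mathbf{X} \right) \right)_{ij} = E \left[ \left( X_{i} - \mu_{i} \right) \left( X_{j} - \mu_{j} \right) \right]$, then verifies $0 \le \mathbf{x}^{T} \operatorname{Cov} \left( \mathbf{X} \right) \mathbf{x}$ for many random directions $\mathbf{x}$:

```python
import random

# Made-up sample of a p = 2 dimensional random vector. The empirical
# distribution is a genuine probability distribution, so its covariance
# matrix must be positive semi-definite.
data = [(1.0, 2.0), (2.0, 1.5), (3.0, 4.0), (4.0, 3.5), (5.0, 6.0)]
n, p = len(data), 2

mean = [sum(row[j] for row in data) / n for j in range(p)]

def cov(i, j):
    # (Cov(X))_{ij} = E[(X_i - mu_i)(X_j - mu_j)] under the empirical distribution
    return sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in data) / n

C = [[cov(i, j) for j in range(p)] for i in range(p)]

# Check 0 <= x^T C x for many random vectors x.
random.seed(0)
for _ in range(200):
    x = [random.uniform(-1.0, 1.0) for _ in range(p)]
    q = sum(x[i] * C[i][j] * x[j] for i in range(p) for j in range(p))
    assert q >= 0.0
```

The matrix is symmetric by construction, since $\operatorname{Cov} \left( X_{i} , X_{j} \right) = \operatorname{Cov} \left( X_{j} , X_{i} \right)$.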

Theorems

  • [1]: If $\mathbf{\mu} \in \mathbb{R}^{p}$ is given by $\mathbf{\mu} := \left( E X_{1} , \cdots , E X_{p} \right)$, then $$ \operatorname{Cov} (\mathbf{X}) = E \left[ \mathbf{X} \mathbf{X}^{T} \right] - \mathbf{\mu} \mathbf{\mu}^{T} $$
  • [2]: If a matrix of constants $A \in \mathbb{R}^{k \times p}$ is given by $(A)_{ij} := a_{ij}$, then $$ \operatorname{Cov} ( A \mathbf{X}) = A \operatorname{Cov} \left( \mathbf{X} \right) A^{T} $$

Proof

[1]

$$ \begin{align*} \operatorname{Cov} \left( \mathbf{X} \right) =& E \left[ \left( \mathbf{X} - \mathbf{\mu} \right) \left( \mathbf{X} - \mathbf{\mu} \right)^{T} \right] \\ =& E \left[ \mathbf{X} \mathbf{X}^{T} - \mathbf{\mu} \mathbf{X}^{T} - \mathbf{X} \mathbf{\mu}^{T} + \mathbf{\mu} \mathbf{\mu}^{T} \right] \\ =& E \left[ \mathbf{X} \mathbf{X}^{T} \right] - \mathbf{\mu} E \left[ \mathbf{X}^{T} \right] - E \left[ \mathbf{X} \right] \mathbf{\mu}^{T} + E \left[ \mathbf{\mu} \mathbf{\mu}^{T} \right] \\ =& E \left[ \mathbf{X} \mathbf{X}^{T} \right] - \mathbf{\mu} \mathbf{\mu}^{T} \end{align*} $$
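As a sanity check on [1], one can take a small finite distribution (made up here, with explicit probabilities) and compare the direct covariance $E \left[ \left( \mathbf{X} - \mathbf{\mu} \right) \left( \mathbf{X} - \mathbf{\mu} \right)^{T} \right]$ against $E \left[ \mathbf{X} \mathbf{X}^{T} \right] - \mathbf{\mu} \mathbf{\mu}^{T}$ entrywise:

```python
# Finite distribution: outcomes of a p = 2 random vector with probabilities.
support = [((0.0, 1.0), 0.2), ((1.0, 3.0), 0.5), ((2.0, 0.0), 0.3)]
p = 2

# mu_i = E[X_i]
mu = [sum(pr * x[i] for x, pr in support) for i in range(p)]

# (E[X X^T])_{ij} = E[X_i X_j]
Exx = [[sum(pr * x[i] * x[j] for x, pr in support) for j in range(p)]
       for i in range(p)]

# Direct covariance: E[(X_i - mu_i)(X_j - mu_j)]
C = [[sum(pr * (x[i] - mu[i]) * (x[j] - mu[j]) for x, pr in support)
      for j in range(p)] for i in range(p)]

# Theorem [1]: Cov(X) = E[X X^T] - mu mu^T, entry by entry
for i in range(p):
    for j in range(p):
        assert abs(C[i][j] - (Exx[i][j] - mu[i] * mu[j])) < 1e-12
```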

[2] 2

$$ \begin{align*} \operatorname{Cov} \left( A \mathbf{X} \right) =& E \left[ \left( A\mathbf{X} - A\mathbf{\mu} \right) \left( A\mathbf{X} - A\mathbf{\mu} \right)^{T} \right] \\ =& E \left[ A\left(\mathbf{X} -\mathbf{\mu} \right) \left( \mathbf{X} - \mathbf{\mu} \right)^{T} A^{T} \right] \\ =& A E \left[ \left(\mathbf{X} -\mathbf{\mu} \right) \left( \mathbf{X} - \mathbf{\mu} \right)^{T}\right] A^{T} \\ =& A \operatorname{Cov}\left( \mathbf{X} \right) A^{T} \end{align*} $$
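Theorem [2] can likewise be verified numerically. The sketch below pushes a made-up finite distribution through an arbitrary constant matrix $A \in \mathbb{R}^{3 \times 2}$, so that $A \mathbf{X}$ is a 3-dimensional random vector, and compares $\operatorname{Cov} \left( A \mathbf{X} \right)$ with $A \operatorname{Cov} \left( \mathbf{X} \right) A^{T}$:

```python
# Finite distribution of a p = 2 random vector, and a k x p constant matrix A.
support = [((0.0, 1.0), 0.2), ((1.0, 3.0), 0.5), ((2.0, 0.0), 0.3)]
p = 2
A = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]  # arbitrary constants, k = 3
k = len(A)

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def cov_matrix(supp, dim):
    # Covariance matrix of a finite distribution, straight from the definition.
    mu = [sum(pr * x[i] for x, pr in supp) for i in range(dim)]
    return [[sum(pr * (x[i] - mu[i]) * (x[j] - mu[j]) for x, pr in supp)
             for j in range(dim)] for i in range(dim)]

# Push each outcome x through A: the distribution of AX.
support_AX = [(tuple(matvec(A, list(x))), pr) for x, pr in support]

C = cov_matrix(support, p)        # Cov(X)
C_AX = cov_matrix(support_AX, k)  # Cov(AX), computed directly

# (A Cov(X) A^T)_{ij} = sum_{a,b} A_{ia} C_{ab} A_{jb}
ACA = [[sum(A[i][a] * C[a][b] * A[j][b] for a in range(p) for b in range(p))
        for j in range(k)] for i in range(k)]

for i in range(k):
    for j in range(k):
        assert abs(C_AX[i][j] - ACA[i][j]) < 1e-12
```

Note that $A$ here is not square: the theorem needs only conformable dimensions, and the resulting $k \times k$ covariance matrix is again symmetric and positive semi-definite.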


  1. Hogg et al. (2013). Introduction to Mathematical Statistics (7th Edition): p126. ↩︎

  2. https://stats.stackexchange.com/a/106207/172321 ↩︎