
Independence and Zero Correlation are Equivalent in Multivariate Normal Distribution 📂Probability Distribution


Theorem 1

$$
\begin{align*}
\mathbf{X} =& \begin{bmatrix} \mathbf{X}_{1} \\ \mathbf{X}_{2} \end{bmatrix} & : \Omega \to \mathbb{R}^{n}
\\ \mu =& \begin{bmatrix} \mu_{1} \\ \mu_{2} \end{bmatrix} & \in \mathbb{R}^{n}
\\ \Sigma =& \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} & \in \mathbb{R}^{n \times n}
\end{align*}
$$

Suppose a random vector $\mathbf{X} \sim N_{n} \left( \mu , \Sigma \right)$ following a multivariate normal distribution is given, with $\mathbf{X}$, $\mu$, and $\Sigma$ partitioned in block form as above. Then the following holds.

$$
\mathbf{X}_{1} \perp \mathbf{X}_{2} \iff \Sigma_{12} = \Sigma_{21} = O
$$


  - $\perp$ denotes the independence of random variables.
  - $O$ denotes a zero matrix.

Description

$\Sigma_{12} = \Sigma_{21} = O$ means that the covariance between the two random vectors $\mathbf{X}_{1}$ and $\mathbf{X}_{2}$ is $0$.

In general, zero correlation does not imply independence; what makes the two equivalent here is that $\mathbf{X}_{1}$ and $\mathbf{X}_{2}$ jointly follow a multivariate normal distribution. It is a fact that everyone knows, but surprisingly few have actually proved it themselves.
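The first half of this claim can be seen numerically: below is a minimal sketch (the choice $Y = X^{2}$, the seed, and the sample size are all illustrative) in which $X$ and $Y$ are uncorrelated, yet $Y$ is a deterministic function of $X$ and hence anything but independent of it.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)
y = x**2  # a deterministic function of x, so clearly dependent on x

# Cov(X, Y) = E[X^3] = 0 for X ~ N(0, 1): the sample covariance is near zero
cov_xy = np.cov(x, y)[0, 1]
print(cov_xy)

# The dependence lives beyond second moments:
# Cov(X^2, Y) = Var(X^2) = 2 is far from zero
cov_x2y = np.cov(x**2, y)[0, 1]
print(cov_x2y)
```

Since $(X, Y)$ here is not jointly normal, the theorem does not apply, and zero correlation tells us nothing about independence.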

Proof

$\left( \implies \right)$

Let $i$ denote an index belonging to $\mathbf{X}_{1}$ and $j$ an index belonging to $\mathbf{X}_{2}$, so that $X_{i} \perp X_{j}$ for all such $i \ne j$. Then
$$
\begin{align*}
& \operatorname{Cov} \left( X_{i} , X_{j} \right)
\\ =& E \left( X_{i} - \mu_{i} \right) \left( X_{j} - \mu_{j} \right)
\\ =& E \left( X_{i} - \mu_{i} \right) E \left( X_{j} - \mu_{j} \right)
\\ =& 0 \cdot 0
\\ =& 0
\end{align*}
$$
so every entry of $\Sigma_{12}$ and $\Sigma_{21}$ is $0$, that is, $\Sigma_{12} = \Sigma_{21} = O$.


$\left( \impliedby \right)$

Assume $\Sigma_{12} = \Sigma_{12}^{T} = \Sigma_{21} = O$.

Marginal random vector of a multivariate normal distribution: If $\mathbf{X} \sim N_{n} \left( \mu, \Sigma \right)$, then its marginal random vector $\mathbf{X}_{1}$ follows the multivariate normal distribution $N_{m} \left( \mu_{1} , \Sigma_{11} \right)$.

$\mathbf{X}_{1}$ and $\mathbf{X}_{2}$, being marginal random vectors of $\mathbf{X}$, follow the multivariate normal distributions $N_{m} \left( \mu_{1} , \Sigma_{11} \right)$ and $N_{n-m} \left( \mu_{2} , \Sigma_{22} \right)$, respectively. Thus their moment generating functions $M_{\mathbf{X}_{1}}$ and $M_{\mathbf{X}_{2}}$ are as follows, for $\mathbf{t}_{1} \in \mathbb{R}^{m}$ and $\mathbf{t}_{2} \in \mathbb{R}^{n-m}$.
$$
\begin{align*}
M_{\mathbf{X}_{1}} \left( \mathbf{t}_{1} \right) =& \exp \left[ \mathbf{t}_{1}^{T} \mu_{1} + {{ 1 } \over { 2 }} \mathbf{t}_{1}^{T} \Sigma_{11} \mathbf{t}_{1} \right]
\\ M_{\mathbf{X}_{2}} \left( \mathbf{t}_{2} \right) =& \exp \left[ \mathbf{t}_{2}^{T} \mu_{2} + {{ 1 } \over { 2 }} \mathbf{t}_{2}^{T} \Sigma_{22} \mathbf{t}_{2} \right]
\end{align*}
$$
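The marginal distributions stated above can be checked empirically; the following sketch (dimensions, seed, and parameter values are all illustrative) samples from a block-diagonal multivariate normal and verifies that the first marginal block has mean $\mu_{1}$ and covariance $\Sigma_{11}$.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative block-diagonal setup with n = 4, m = 2
mu = np.array([1.0, -1.0, 0.5, 2.0])
S11 = np.array([[2.0, 0.7], [0.7, 1.0]])
S22 = np.array([[1.5, -0.4], [-0.4, 0.8]])
Sigma = np.block([[S11, np.zeros((2, 2))],
                  [np.zeros((2, 2)), S22]])

X = rng.multivariate_normal(mu, Sigma, size=200_000)
X1 = X[:, :2]  # the marginal random vector X_1

# X_1 ~ N(mu_1, Sigma_11): sample mean and covariance match mu[:2] and S11
print(X1.mean(axis=0))
print(np.cov(X1.T))
```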

Moment generating function of a multivariate normal distribution: The moment generating function of $\mathbf{X} \sim N_{p} \left( \mu , \Sigma \right)$ is as follows.
$$
M_{\mathbf{X}} \left( \mathbf{t} \right) = \exp \left( \mathbf{t}^{T} \mu + {{ 1 } \over { 2 }} \mathbf{t}^{T} \Sigma \mathbf{t} \right) \qquad , \mathbf{t} \in \mathbb{R}^{p}
$$

With $\mathbf{t} = \begin{bmatrix} \mathbf{t}_{1} \\ \mathbf{t}_{2} \end{bmatrix} \in \mathbb{R}^{n}$, the moment generating function of $\mathbf{X}$ factors into the product of the moment generating functions $M_{\mathbf{X}_{1}}$ and $M_{\mathbf{X}_{2}}$ of $\mathbf{X}_{1}$ and $\mathbf{X}_{2}$.
$$
\begin{align*}
& M_{\mathbf{X}} \left( \mathbf{t} \right)
\\ =& \exp \left[ \mathbf{t}^{T} \mu + {{ 1 } \over { 2 }} \mathbf{t}^{T} \Sigma \mathbf{t} \right]
\\ =& \exp \left[ \mathbf{t}_{1}^{T} \mu_{1} + \mathbf{t}_{2}^{T} \mu_{2} + {{ 1 } \over { 2 }} \left( \mathbf{t}_{1}^{T} \Sigma_{11} \mathbf{t}_{1} + \mathbf{t}_{1}^{T} \Sigma_{12} \mathbf{t}_{2} + \mathbf{t}_{2}^{T} \Sigma_{21} \mathbf{t}_{1} + \mathbf{t}_{2}^{T} \Sigma_{22} \mathbf{t}_{2} \right) \right]
\\ =& \exp \left[ \mathbf{t}_{1}^{T} \mu_{1} + \mathbf{t}_{2}^{T} \mu_{2} + {{ 1 } \over { 2 }} \left( \mathbf{t}_{1}^{T} \Sigma_{11} \mathbf{t}_{1} + 0 + 0 + \mathbf{t}_{2}^{T} \Sigma_{22} \mathbf{t}_{2} \right) \right]
\\ =& \exp \left[ \mathbf{t}_{1}^{T} \mu_{1} + {{ 1 } \over { 2 }} \mathbf{t}_{1}^{T} \Sigma_{11} \mathbf{t}_{1} \right] \exp \left[ \mathbf{t}_{2}^{T} \mu_{2} + {{ 1 } \over { 2 }} \mathbf{t}_{2}^{T} \Sigma_{22} \mathbf{t}_{2} \right]
\\ =& M_{\mathbf{X}_{1}} \left( \mathbf{t}_{1} \right) M_{\mathbf{X}_{2}} \left( \mathbf{t}_{2} \right)
\end{align*}
$$
Since the joint moment generating function is the product of the marginal moment generating functions, $\mathbf{X}_{1}$ and $\mathbf{X}_{2}$ are independent.

■
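The factorization $M_{\mathbf{X}} \left( \mathbf{t} \right) = M_{\mathbf{X}_{1}} \left( \mathbf{t}_{1} \right) M_{\mathbf{X}_{2}} \left( \mathbf{t}_{2} \right)$ can also be checked numerically. The sketch below (block sizes, seed, and the evaluation point $\mathbf{t}$ are arbitrary choices) evaluates the closed-form multivariate normal MGF for a block-diagonal $\Sigma$ and compares the joint value with the product of the two marginal values.

```python
import numpy as np

rng = np.random.default_rng(1)

# Build a block-diagonal covariance: Sigma_12 = Sigma_21 = O (2 + 3 split)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((3, 3))
S11 = A @ A.T + 2 * np.eye(2)   # positive definite 2x2 block
S22 = B @ B.T + 2 * np.eye(3)   # positive definite 3x3 block
Sigma = np.block([[S11, np.zeros((2, 3))],
                  [np.zeros((3, 2)), S22]])
mu = rng.standard_normal(5)

def mgf(t, m, S):
    # Closed-form MGF of N(m, S) evaluated at t
    return np.exp(t @ m + 0.5 * t @ S @ t)

t = 0.3 * rng.standard_normal(5)
t1, t2 = t[:2], t[2:]

joint = mgf(t, mu, Sigma)
product = mgf(t1, mu[:2], S11) * mgf(t2, mu[2:], S22)
print(joint, product)  # the cross terms vanish, so these agree
```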


  1. Hogg et al. (2013). Introduction to Mathematical Statistics (7th Edition): p184-185. ↩︎