
Independence and Zero Correlation are Equivalent in Multivariate Normal Distribution

Theorem 1

$$ \begin{align*} \mathbf{X} =& \begin{bmatrix} \mathbf{X}_{1} \\ \mathbf{X}_{2} \end{bmatrix} & : \Omega \to \mathbb{R}^{n} \\ \mu =& \begin{bmatrix} \mu_{1} \\ \mu_{2} \end{bmatrix} & \in \mathbb{R}^{n} \\ \Sigma =& \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} & \in \mathbb{R}^{n \times n} \end{align*} $$ Suppose a random vector $\mathbf{X} \sim N_{n} \left( \mu , \Sigma \right)$ following a multivariate normal distribution is given, with $\mathbf{X}$, $\mu$, and $\Sigma$ partitioned in block form as above, where $\mathbf{X}_{1}$ and $\mu_{1}$ are $m$-dimensional and $\mathbf{X}_{2}$ and $\mu_{2}$ are $(n-m)$-dimensional. Then the following holds. $$ \mathbf{X}_{1} \perp \mathbf{X}_{2} \iff \Sigma_{12} = \Sigma_{21} = O $$


  • $\perp$ represents the independence of random variables.
  • $O$ represents a zero matrix.

Description

$\Sigma_{12} = \Sigma_{21} = O$ means that every cross-covariance between the components of the two random vectors $\mathbf{X}_{1}$ and $\mathbf{X}_{2}$ is $0$.

In general, zero correlation does not imply independence. The condition that makes the two equivalent is joint normality, that is, the whole vector $\mathbf{X}$ must follow a multivariate normal distribution; it is not enough for $\mathbf{X}_{1}$ and $\mathbf{X}_{2}$ to each be marginally normal. It is a fact that everyone knows, but surprisingly, not many have actually proved it themselves.
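To see why joint normality matters, here is a minimal numerical sketch (assuming NumPy is available) of the classic counterexample: $X \sim N(0,1)$ and $Y = SX$, where $S$ is an independent random sign. Both are marginally standard normal and uncorrelated, yet clearly dependent, because the pair is not jointly normal.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# X ~ N(0, 1) and Y = S * X with S an independent random sign.
# Both marginals are standard normal and Cov(X, Y) = E[S]E[X^2] = 0,
# yet |Y| = |X| always, so X and Y are far from independent.
x = rng.standard_normal(n)
s = rng.choice([-1.0, 1.0], size=n)
y = s * x

print("Cov(X, Y)     ≈", np.cov(x, y)[0, 1])        # near 0
print("Cov(X^2, Y^2) ≈", np.cov(x**2, y**2)[0, 1])  # far from 0 -> dependent
```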

Proof

$\left( \implies \right)$

Suppose $\mathbf{X}_{1} \perp \mathbf{X}_{2}$. Let $i$ denote an index belonging to $\mathbf{X}_{1}$ and $j$ an index belonging to $\mathbf{X}_{2}$, so that $X_{i} \perp X_{j}$ for all such $i \ne j$. Then $$ \begin{align*} & \operatorname{Cov} \left( X_{i} , X_{j} \right) \\ =& E \left[ \left( X_{i} - \mu_{i} \right) \left( X_{j} - \mu_{j} \right) \right] \\ =& E \left( X_{i} - \mu_{i} \right) E \left( X_{j} - \mu_{j} \right) \\ =& 0 \cdot 0 \\ =& 0 \end{align*} $$ where the second equality holds because the expectation of a product of independent random variables factors. Since every entry of $\Sigma_{12}$ and $\Sigma_{21}$ is such a covariance, $\Sigma_{12} = \Sigma_{21} = O$.
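A quick empirical check of this step (a sketch, with arbitrarily chosen means and covariances): sampling two blocks independently, every entry of the sample cross-covariance matrix should land near the zero matrix $O$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Draw X1 and X2 independently; the expectation of the product then
# factors, so E[(X_i - mu_i)(X_j - mu_j)] should be near 0 for every
# index i in X1 and j in X2.
x1 = rng.multivariate_normal(mean=[1.0, -2.0], cov=[[2.0, 0.5], [0.5, 1.0]], size=n)
x2 = rng.multivariate_normal(mean=[0.0, 3.0], cov=[[1.0, -0.3], [-0.3, 4.0]], size=n)

cross = (x1 - x1.mean(axis=0)).T @ (x2 - x2.mean(axis=0)) / (n - 1)
print("empirical Sigma_12 ≈\n", cross)  # all entries close to the zero matrix O
```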


$\left( \impliedby \right)$

Assume $\Sigma_{12} = O$; since $\Sigma$ is symmetric, $\Sigma_{21} = \Sigma_{12}^{T} = O$ as well.

Marginal random vector of a multivariate normal distribution: If $\mathbf{X} \sim N_{n} \left( \mu, \Sigma \right)$, then the marginal random vector $\mathbf{X}_{1}$ follows the multivariate normal distribution $N_{m} \left( \mu_{1} , \Sigma_{11} \right)$.

$\mathbf{X}_{1}$ and $\mathbf{X}_{2}$, being marginal random vectors of $\mathbf{X}$, follow the multivariate normal distributions $N_{m} \left( \mu_{1} , \Sigma_{11} \right)$ and $N_{n-m} \left( \mu_{2} , \Sigma_{22} \right)$, respectively. Thus their moment generating functions $M_{\mathbf{X}_{1}}$ and $M_{\mathbf{X}_{2}}$ are as follows, for $\mathbf{t}_{1} \in \mathbb{R}^{m}$ and $\mathbf{t}_{2} \in \mathbb{R}^{n-m}$. $$ \begin{align*} M_{\mathbf{X}_{1}} \left( \mathbf{t}_{1} \right) =& \exp \left[ \mathbf{t}_{1}^{T} \mu_{1} + {{ 1 } \over { 2 }} \mathbf{t}_{1}^{T} \Sigma_{11} \mathbf{t}_{1} \right] \\ M_{\mathbf{X}_{2}} \left( \mathbf{t}_{2} \right) =& \exp \left[ \mathbf{t}_{2}^{T} \mu_{2} + {{ 1 } \over { 2 }} \mathbf{t}_{2}^{T} \Sigma_{22} \mathbf{t}_{2} \right] \end{align*} $$

Moment generating function of a multivariate normal distribution: The moment generating function of $X \sim N_{p} \left( \mu , \Sigma \right)$ is as follows. $$ M_{X} \left( \mathbf{t} \right) = \exp \left( \mathbf{t}^{T} \mu + {{ 1 } \over { 2 }} \mathbf{t}^{T} \Sigma \mathbf{t} \right) \qquad , \mathbf{t} \in \mathbb{R}^{p} $$
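This closed-form expression can be sanity-checked against a Monte Carlo estimate of $E \left[ \exp \left( \mathbf{t}^{T} X \right) \right]$; the sketch below uses an arbitrary small $\mathbf{t}$ and example parameters chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
mu = np.array([0.5, -1.0])
sigma = np.array([[1.0, 0.3], [0.3, 2.0]])
t = np.array([0.2, -0.1])  # keep t small so the Monte Carlo estimate is stable

# Monte Carlo estimate of E[exp(t^T X)] versus the closed-form MGF.
x = rng.multivariate_normal(mu, sigma, size=n)
mc = np.exp(x @ t).mean()
closed = np.exp(t @ mu + 0.5 * t @ sigma @ t)
print(mc, closed)  # the two values should agree to a few decimal places
```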

With $\mathbf{t} = \begin{bmatrix} \mathbf{t}_{1} \\ \mathbf{t}_{2} \end{bmatrix} \in \mathbb{R}^{n}$, the moment generating function of $\mathbf{X}$ factors into the product of the moment generating functions $M_{\mathbf{X}_{1}}$ and $M_{\mathbf{X}_{2}}$. $$ \begin{align*} & M_{\mathbf{X}} \left( \mathbf{t} \right) \\ =& \exp \left[ \mathbf{t}^{T} \mu + {{ 1 } \over { 2 }} \mathbf{t}^{T} \Sigma \mathbf{t} \right] \\ =& \exp \left[ \mathbf{t}_{1}^{T} \mu_{1} + \mathbf{t}_{2}^{T} \mu_{2} + {{ 1 } \over { 2 }} \left( \mathbf{t}_{1}^{T} \Sigma_{11} \mathbf{t}_{1} + \mathbf{t}_{1}^{T} \Sigma_{12} \mathbf{t}_{2} + \mathbf{t}_{2}^{T} \Sigma_{21} \mathbf{t}_{1} + \mathbf{t}_{2}^{T} \Sigma_{22} \mathbf{t}_{2} \right) \right] \\ =& \exp \left[ \mathbf{t}_{1}^{T} \mu_{1} + \mathbf{t}_{2}^{T} \mu_{2} + {{ 1 } \over { 2 }} \left( \mathbf{t}_{1}^{T} \Sigma_{11} \mathbf{t}_{1} + 0 + 0 + \mathbf{t}_{2}^{T} \Sigma_{22} \mathbf{t}_{2} \right) \right] \\ =& \exp \left[ \mathbf{t}_{1}^{T} \mu_{1} + {{ 1 } \over { 2 }} \mathbf{t}_{1}^{T} \Sigma_{11} \mathbf{t}_{1} \right] \exp \left[ \mathbf{t}_{2}^{T} \mu_{2} + {{ 1 } \over { 2 }} \mathbf{t}_{2}^{T} \Sigma_{22} \mathbf{t}_{2} \right] \\ =& M_{\mathbf{X}_{1}} \left( \mathbf{t}_{1} \right) M_{\mathbf{X}_{2}} \left( \mathbf{t}_{2} \right) \end{align*} $$ Since the joint moment generating function factors into the product of the marginal moment generating functions, $\mathbf{X}_{1} \perp \mathbf{X}_{2}$ follows by the uniqueness of moment generating functions. ■
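As an illustration of the conclusion (a sketch with an arbitrary block-diagonal $\Sigma$): under joint normality with $\Sigma_{12} = \Sigma_{21} = O$, the blocks are fully independent, so even nonlinear transforms such as squares remain uncorrelated, unlike in the counterexample shown in the Description.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000

# X = (X1, X2) jointly normal with block-diagonal covariance:
# X1 = first two coordinates, X2 = third coordinate, Sigma_12 = O.
mu = np.array([1.0, 0.0, -1.0])
sigma = np.array([[2.0, 0.7, 0.0],
                  [0.7, 1.0, 0.0],
                  [0.0, 0.0, 3.0]])
x = rng.multivariate_normal(mu, sigma, size=n)
x1, x2 = x[:, :2], x[:, 2]

# Full independence: even squared components stay uncorrelated.
print("Cov(X1[0]^2, X2^2) ≈", np.cov(x1[:, 0]**2, x2**2)[0, 1])  # near 0
```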


  1. Hogg et al. (2013). Introduction to Mathematical Statistics (7th Edition): p184~185. ↩︎