Multivariate Normal Distribution

Definition

Given a population mean vector $\mu \in \mathbb{R}^{p}$ and a symmetric positive definite covariance matrix $\Sigma \in \mathbb{R}^{p \times p}$, the multivariate distribution $N_{p} \left( \mu , \Sigma \right)$ with the following probability density function is called the multivariate normal distribution.

$$
f (\mathbf{x}) = \left( (2\pi)^{p} \det \Sigma \right)^{-1/2} \exp \left[ - {{ 1 } \over { 2 }} \left( \mathbf{x} - \mu \right)^{T} \Sigma^{-1} \left( \mathbf{x} - \mu \right) \right] \qquad , \mathbf{x} \in \mathbb{R}^{p}
$$

  • $\mathbf{x}^{T}$ denotes the transpose of $\mathbf{x}$.
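The density can be evaluated numerically along the following lines. This is a minimal sketch assuming NumPy is available; the function names `mvn_logpdf` and `mvn_pdf` are hypothetical and chosen only for illustration. It works on the log scale and uses a linear solve rather than an explicit inverse, which is a common choice for numerical stability.

```python
import numpy as np

def mvn_logpdf(x, mu, sigma):
    """Log-density of N_p(mu, sigma) at the point x (hypothetical helper)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    sigma = np.atleast_2d(np.asarray(sigma, dtype=float))
    p = mu.shape[0]
    # slogdet returns (sign, log|det|); a non-positive sign rules out
    # positive definiteness (a necessary check, not a complete one).
    sign, logdet = np.linalg.slogdet(sigma)
    if sign <= 0:
        raise ValueError("sigma must be positive definite")
    diff = x - mu
    # Quadratic form (x - mu)^T sigma^{-1} (x - mu) via a linear solve.
    quad = diff @ np.linalg.solve(sigma, diff)
    return -0.5 * (p * np.log(2.0 * np.pi) + logdet + quad)

def mvn_pdf(x, mu, sigma):
    """Density of N_p(mu, sigma) at the point x."""
    return float(np.exp(mvn_logpdf(x, mu, sigma)))

# Example: standard bivariate normal evaluated at the origin.
density_at_origin = mvn_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2))
```

At the origin of a standard bivariate normal the quadratic form vanishes and $\det \Sigma = 1$, so the value reduces to $1 / (2\pi)$.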

Theorem

$$
\begin{align*} \mathbf{X} =& \begin{bmatrix} \mathbf{X}_{1} \\ \mathbf{X}_{2} \end{bmatrix} & : \Omega \to \mathbb{R}^{n} \\ \mu =& \begin{bmatrix} \mu_{1} \\ \mu_{2} \end{bmatrix} & \in \mathbb{R}^{n} \\ \Sigma =& \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} & \in \mathbb{R}^{n \times n} \end{align*}
$$

In the theorem statements below, unless stated otherwise, $\mathbf{X}$, $\mu$, and $\Sigma$ denote the block-partitioned vector and matrices above.

Linear Transformation of Multivariate Normal Distribution

For a matrix $A \in \mathbb{R}^{m \times n}$ and a vector $\mathbf{b} \in \mathbb{R}^{m}$, the linear transformation $\mathbf{Y} = A \mathbf{X} + \mathbf{b}$ of a random vector $\mathbf{X} \sim N_{n} \left( \mu , \Sigma \right)$ following a multivariate normal distribution still follows a multivariate normal distribution, namely $N_{m} \left( A \mu + \mathbf{b} , A \Sigma A^{T} \right)$.
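The parameters of the transformed distribution can be computed directly from $A$, $\mathbf{b}$, $\mu$, and $\Sigma$. The sketch below assumes NumPy; the helper name `affine_transform_params` and the example values are hypothetical, not taken from the text.

```python
import numpy as np

def affine_transform_params(A, b, mu, sigma):
    """Mean and covariance of Y = A X + b when X ~ N(mu, sigma)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    mean_y = A @ mu + b        # A mu + b
    cov_y = A @ sigma @ A.T    # A Sigma A^T
    return mean_y, cov_y

# Illustrative values (assumptions): a 2x2 map applied to a bivariate normal.
mu = np.array([1.0, 2.0])
sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
A = np.array([[1.0, 1.0],
              [1.0, -1.0]])
b = np.array([0.0, 1.0])

mean_y, cov_y = affine_transform_params(A, b, mu, sigma)
```

One way to sanity-check the result is to draw samples of $\mathbf{X}$, apply the map, and compare the sample mean and sample covariance with `mean_y` and `cov_y`.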

Independence and Zero Correlation are Equivalent in Multivariate Normal Distribution

Consider a random vector $\mathbf{X} \sim N_{n} \left( \mu , \Sigma \right)$ following a multivariate normal distribution. Then the following holds:

$$
\mathbf{X}_{1} \perp \mathbf{X}_{2} \iff \Sigma_{12} = \Sigma_{21} = O
$$
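A brief sketch of why zero covariance is sufficient, assuming $\Sigma$ is positive definite so that the density exists; $f_{1}$ and $f_{2}$ are notation introduced here for the densities of $N(\mu_{1}, \Sigma_{11})$ and $N(\mu_{2}, \Sigma_{22})$. When $\Sigma_{12} = \Sigma_{21}^{T} = O$, the covariance matrix is block diagonal, so its determinant and inverse split by block and the joint density factorizes:

$$
\begin{align*} \det \Sigma =& \det \Sigma_{11} \det \Sigma_{22} \\ \left( \mathbf{x} - \mu \right)^{T} \Sigma^{-1} \left( \mathbf{x} - \mu \right) =& \left( \mathbf{x}_{1} - \mu_{1} \right)^{T} \Sigma_{11}^{-1} \left( \mathbf{x}_{1} - \mu_{1} \right) + \left( \mathbf{x}_{2} - \mu_{2} \right)^{T} \Sigma_{22}^{-1} \left( \mathbf{x}_{2} - \mu_{2} \right) \\ f ( \mathbf{x} ) =& f_{1} ( \mathbf{x}_{1} ) f_{2} ( \mathbf{x}_{2} ) \end{align*}
$$

The converse direction holds for any distribution with finite second moments: independent random vectors always have zero cross-covariance.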

Conditional Mean and Variance of Multivariate Normal Distribution

Consider a random vector $\mathbf{X} \sim N_{n} \left( \mu , \Sigma \right)$ following a multivariate normal distribution. Then the conditional random vector $\mathbf{X}_{1} | \mathbf{X}_{2} : \Omega \to \mathbb{R}^{m}$ still follows a multivariate normal distribution, with the following population mean vector and population covariance matrix:

$$
\mathbf{X}_{1} | \mathbf{X}_{2} \sim N_{m} \left( \mu_{1} + \Sigma_{12} \Sigma_{22}^{-1} \left( \mathbf{X}_{2} - \mu_{2} \right) , \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} \right)
$$
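The conditional mean and covariance can be computed from the blocks of $\mu$ and $\Sigma$ as sketched below. This assumes NumPy and an invertible $\Sigma_{22}$; the function name `conditional_mvn` and the index arguments `idx1`, `idx2` (which coordinates belong to $\mathbf{X}_{1}$ and $\mathbf{X}_{2}$) are hypothetical.

```python
import numpy as np

def conditional_mvn(mu, sigma, idx1, idx2, x2):
    """Mean and covariance of X1 | X2 = x2 for X ~ N(mu, sigma)."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    idx1 = np.asarray(idx1)
    idx2 = np.asarray(idx2)
    x2 = np.atleast_1d(np.asarray(x2, dtype=float))

    mu1, mu2 = mu[idx1], mu[idx2]
    s11 = sigma[np.ix_(idx1, idx1)]
    s12 = sigma[np.ix_(idx1, idx2)]
    s21 = sigma[np.ix_(idx2, idx1)]
    s22 = sigma[np.ix_(idx2, idx2)]

    # gain = Sigma_12 Sigma_22^{-1}, obtained from a linear solve
    # instead of forming the inverse of Sigma_22 explicitly.
    gain = np.linalg.solve(s22.T, s12.T).T

    cond_mean = mu1 + gain @ (x2 - mu2)
    cond_cov = s11 - gain @ s21
    return cond_mean, cond_cov

# Illustrative values (assumptions): condition the first coordinate on the second.
cond_mean, cond_cov = conditional_mvn(
    mu=[0.0, 0.0],
    sigma=[[2.0, 1.0], [1.0, 1.0]],
    idx1=[0], idx2=[1], x2=[1.0],
)
```

Note that the conditional covariance does not depend on the observed value `x2`; only the conditional mean does.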

Multivariate Normality of Regression Coefficient Vector

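Under the linear regression model $\mathbf{y} = X \beta + \varepsilon$ with normally distributed errors $\varepsilon \sim N \left( \mathbf{0} , \sigma^{2} I \right)$ and a design matrix $X$ of full column rank $1+p$, the least squares estimator $\hat{\beta} = \left( X^{T} X \right)^{-1} X^{T} \mathbf{y}$ of the regression coefficients follows the multivariate normal distribution below:

$$
\hat{\beta} \sim N_{1+p} \left( \beta , \sigma^{2} \left( X^{T} X \right)^{-1} \right)
$$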
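The estimator and its covariance matrix can be computed as follows. This is a sketch under stated assumptions: the usual linear model with independent normal errors of known variance `sigma2`, a design matrix with full column rank whose first column is the intercept, and NumPy available. The function name `ols_with_covariance` and the data are hypothetical.

```python
import numpy as np

def ols_with_covariance(X, y, sigma2):
    """Least squares estimate and its covariance sigma2 * (X^T X)^{-1}."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    xtx = X.T @ X
    # beta_hat solves the normal equations (X^T X) beta = X^T y.
    beta_hat = np.linalg.solve(xtx, X.T @ y)
    cov_beta = sigma2 * np.linalg.inv(xtx)
    return beta_hat, cov_beta

# Illustrative design (assumption): intercept column plus one regressor.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])   # lies exactly on y = 1 + 2x
beta_hat, cov_beta = ols_with_covariance(X, y, sigma2=2.0)
```

In practice $\sigma^{2}$ is usually unknown and replaced by an estimate, in which case the exact normal statement above no longer applies as written.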

Moment Generating Function

The moment generating function of $\mathbf{X} \sim N_{p} \left( \mu , \Sigma \right)$ is as follows:

$$
M_{\mathbf{X}} \left( \mathbf{t} \right) = \exp \left( \mathbf{t}^{T} \mu + {{ 1 } \over { 2 }} \mathbf{t}^{T} \Sigma \mathbf{t} \right) \qquad , \mathbf{t} \in \mathbb{R}^{p}
$$
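The moment generating function is a direct evaluation of a linear term plus a quadratic form. A minimal sketch, assuming NumPy; `mvn_mgf` is a hypothetical name.

```python
import numpy as np

def mvn_mgf(t, mu, sigma):
    """Moment generating function of N_p(mu, sigma) evaluated at t."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    sigma = np.atleast_2d(np.asarray(sigma, dtype=float))
    # exp(t^T mu + (1/2) t^T Sigma t)
    return float(np.exp(t @ mu + 0.5 * (t @ sigma @ t)))

# Any moment generating function equals 1 at t = 0.
mgf_at_zero = mvn_mgf([0.0, 0.0], [1.0, 2.0], [[2.0, 0.5], [0.5, 1.0]])
```

For $p = 1$ the same function reduces to the univariate form $\exp \left( t \mu + \sigma^{2} t^{2} / 2 \right)$.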

Entropy

The entropy of the multivariate normal distribution $N_{p}(\mu, \Sigma)$ is as follows:

$$
H = \dfrac{1}{2}\ln \left[ (2 \pi e)^{p} \left| \Sigma \right| \right] = \dfrac{1}{2}\ln (\det (2\pi e \Sigma))
$$

Here $\left| \Sigma \right|$ is the determinant of the covariance matrix.
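The entropy depends only on the covariance matrix, not on the mean. A minimal sketch, assuming NumPy and natural logarithms (so the result is in nats); `mvn_entropy` is a hypothetical name. It uses the log-determinant to avoid overflow or underflow in the determinant itself.

```python
import numpy as np

def mvn_entropy(sigma):
    """Differential entropy (in nats) of N_p(mu, sigma)."""
    sigma = np.atleast_2d(np.asarray(sigma, dtype=float))
    p = sigma.shape[0]
    sign, logdet = np.linalg.slogdet(sigma)
    if sign <= 0:
        raise ValueError("sigma must be positive definite")
    # (1/2) ln[(2 pi e)^p |Sigma|] = (1/2) [p (1 + ln 2 pi) + ln |Sigma|]
    return 0.5 * (p * (1.0 + np.log(2.0 * np.pi)) + logdet)

# Entropy of the standard univariate normal, (1/2) ln(2 pi e).
h_standard = mvn_entropy([[1.0]])
```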

Relative Entropy

The relative entropy between two $k$-dimensional multivariate normal distributions $N(\mu, \Sigma)$ and $N(\boldsymbol{\mu_{1}}, \Sigma_{1})$ is given by the following:

$$
\begin{array}{l} D_{\text{KL}}\big( N(\mu, \Sigma) \| N(\boldsymbol{\mu_{1}}, \Sigma_{1}) \big) \\[1em] = \dfrac{1}{2} \left[ \log \left( \dfrac{|\Sigma_{1}|}{|\Sigma|} \right) + \operatorname{Tr}(\Sigma_{1}^{-1}\Sigma) + (\mu - \boldsymbol{\mu_{1}})^{\mathsf{T}} \Sigma_{1}^{-1} (\mu - \boldsymbol{\mu_{1}}) - k \right] \end{array}
$$

Here $\log$ is the natural logarithm and $k$ is the common dimension of the two distributions.
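The relative entropy can be evaluated numerically as sketched below, assuming NumPy and natural logarithms. The function name `mvn_kl` is hypothetical. The sketch implements the standard closed form, in which the log-determinant term is $\ln |\Sigma_{1}| - \ln |\Sigma|$, so that the divergence of a distribution from itself is zero.

```python
import numpy as np

def mvn_kl(mu0, sigma0, mu1, sigma1):
    """KL divergence D_KL(N(mu0, sigma0) || N(mu1, sigma1)) in nats."""
    mu0 = np.atleast_1d(np.asarray(mu0, dtype=float))
    mu1 = np.atleast_1d(np.asarray(mu1, dtype=float))
    sigma0 = np.atleast_2d(np.asarray(sigma0, dtype=float))
    sigma1 = np.atleast_2d(np.asarray(sigma1, dtype=float))
    k = mu0.shape[0]

    # Log-determinants of both covariance matrices.
    _, logdet0 = np.linalg.slogdet(sigma0)
    _, logdet1 = np.linalg.slogdet(sigma1)

    # Tr(sigma1^{-1} sigma0) and the Mahalanobis-type term, via linear solves.
    trace_term = np.trace(np.linalg.solve(sigma1, sigma0))
    diff = mu0 - mu1
    quad_term = diff @ np.linalg.solve(sigma1, diff)

    return 0.5 * (logdet1 - logdet0 + trace_term + quad_term - k)

# Unit-variance univariate normals whose means differ by 1.
kl_shift = mvn_kl([0.0], [[1.0]], [1.0], [[1.0]])
```

For two unit-variance univariate normals the log-determinant and trace terms cancel against $k$, leaving half the squared mean difference.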

See Also

  • Univariate Normal Distribution: When $p = 1$, $\mu \in \mathbb{R}^{1}$, and $\Sigma \in \mathbb{R}^{1 \times 1}$, the above probability density function becomes exactly the probability density function of the univariate normal distribution.