Multivariate Normal Distribution
Definition
Given a population mean vector $\mathbf{\mu} \in \mathbb{R}^{p}$ and a symmetric positive definite covariance matrix $\Sigma \in \mathbb{R}^{p \times p}$, the multivariate distribution $N_{p} \left( \mu , \Sigma \right)$ with the following probability density function is called the multivariate normal distribution.
$$ f (\textbf{x}) = \left( (2\pi)^{p} \det \Sigma \right)^{-1/2} \exp \left[ - {{ 1 } \over { 2 }} \left( \textbf{x} - \mathbf{\mu} \right)^{T} \Sigma^{-1} \left( \textbf{x} - \mathbf{\mu} \right) \right] \qquad , \textbf{x} \in \mathbb{R}^{p} $$
- $\mathbf{x}^{T}$ denotes the transpose of $\mathbf{x}$.
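The density can be evaluated numerically straight from the formula. The following is a minimal sketch assuming NumPy; the function name `mvn_pdf` is hypothetical and chosen only for illustration.

```python
import numpy as np

def mvn_pdf(x, mu, sigma):
    """Density of N_p(mu, sigma) at the point x, following the definition."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    p = mu.shape[0]
    diff = x - mu
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu); solving a linear system
    # avoids forming the inverse explicitly.
    quad = diff @ np.linalg.solve(sigma, diff)
    # Normalizing constant ((2 pi)^p det Sigma)^{-1/2}.
    norm_const = ((2.0 * np.pi) ** p * np.linalg.det(sigma)) ** -0.5
    return float(norm_const * np.exp(-0.5 * quad))

# Standard bivariate normal at the origin: the density equals 1 / (2 * pi).
value_at_origin = mvn_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2))
```

For large $p$ the determinant can overflow or underflow, so a log-density based on `np.linalg.slogdet` is usually the more robust choice.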
Theorem
$$ \begin{align*} \mathbf{X} =& \begin{bmatrix} \mathbf{X}_{1} \\ \mathbf{X}_{2} \end{bmatrix} & : \Omega \to \mathbb{R}^{n} \\ \mu =& \begin{bmatrix} \mu_{1} \\ \mu_{2} \end{bmatrix} & \in \mathbb{R}^{n} \\ \Sigma =& \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} & \in \mathbb{R}^{n \times n} \end{align*} $$
In the theorems below, unless stated otherwise, $\mathbf{X}$, $\mu$, and $\Sigma$ denote the block forms given above.
Linear Transformation of Multivariate Normal Distribution
For a matrix $A \in \mathbb{R}^{m \times n}$ and a vector $\mathbf{b} \in \mathbb{R}^{m}$, if the random vector $\mathbf{X} \sim N_{n} \left( \mu , \Sigma \right)$ follows a multivariate normal distribution, then its linear transformation $\mathbf{Y} = A \mathbf{X} + \mathbf{b}$ also follows a multivariate normal distribution, namely $N_{m} \left( A \mu + \mathbf{b} , A \Sigma A^{T} \right)$.
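As a small numerical illustration of this theorem, the parameters of $\mathbf{Y}$ can be computed directly from those of $\mathbf{X}$. This is a sketch assuming NumPy; the function name `transform_mvn` and the example values are made up for illustration.

```python
import numpy as np

def transform_mvn(mu, sigma, A, b):
    """Mean vector and covariance matrix of Y = A X + b for X ~ N(mu, sigma)."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    return A @ mu + b, A @ sigma @ A.T

# Illustrative parameters with n = 3 and m = 2.
mu = np.array([1.0, 2.0, 3.0])
sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, -1.0]])
b = np.array([0.0, 1.0])

mean_y, cov_y = transform_mvn(mu, sigma, A, b)
# mean_y is [3.0, 0.0] and cov_y is [[4.0, 1.2], [1.2, 1.9]].
```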
Independence and Zero Correlation are Equivalent in Multivariate Normal Distribution
Consider a random vector $\mathbf{X} \sim N_{n} \left( \mu , \Sigma \right)$ following a multivariate normal distribution. Then, the following holds: $$ \mathbf{X}_{1} \perp \mathbf{X}_{2} \iff \Sigma_{12} = \Sigma_{21} = O $$
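One direction can be read off the density. As a sketch of the ($\impliedby$) direction, suppose $\Sigma_{12} = \Sigma_{21} = O$. Then $\Sigma$ is block diagonal, so its determinant and inverse split by blocks:
$$ \det \Sigma = \det \Sigma_{11} \det \Sigma_{22} \qquad , \Sigma^{-1} = \begin{bmatrix} \Sigma_{11}^{-1} & O \\ O & \Sigma_{22}^{-1} \end{bmatrix} $$
The quadratic form in the exponent therefore becomes a sum over the two blocks, and the joint density factorizes into the product of the two marginal normal densities:
$$ \begin{align*} \left( \mathbf{x} - \mu \right)^{T} \Sigma^{-1} \left( \mathbf{x} - \mu \right) =& \left( \mathbf{x}_{1} - \mu_{1} \right)^{T} \Sigma_{11}^{-1} \left( \mathbf{x}_{1} - \mu_{1} \right) + \left( \mathbf{x}_{2} - \mu_{2} \right)^{T} \Sigma_{22}^{-1} \left( \mathbf{x}_{2} - \mu_{2} \right) \\ f \left( \mathbf{x} \right) =& f_{1} \left( \mathbf{x}_{1} \right) f_{2} \left( \mathbf{x}_{2} \right) \end{align*} $$
Here $f_{1}$ and $f_{2}$ (notation introduced only for this sketch) are the densities of $N \left( \mu_{1} , \Sigma_{11} \right)$ and $N \left( \mu_{2} , \Sigma_{22} \right)$. The ($\implies$) direction does not need normality, since independence always implies zero covariance.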
Conditional Mean and Variance of Multivariate Normal Distribution
Consider a random vector $\mathbf{X} \sim N_{n} \left( \mu , \Sigma \right)$ following a multivariate normal distribution, where $\mathbf{X}_{1}$ has $m$ components. Then, the conditional random vector $\mathbf{X}_{1} | \mathbf{X}_{2} : \Omega \to \mathbb{R}^{m}$ still follows a multivariate normal distribution, with the following population mean vector and population covariance matrix:
$$ \mathbf{X}_{1} | \mathbf{X}_{2} \sim N_{m} \left( \mu_{1} + \Sigma_{12} \Sigma_{22}^{-1} \left( \mathbf{X}_{2} - \mu_{2} \right) , \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} \right) $$
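The conditional parameters can be computed from the blocks of $\mu$ and $\Sigma$. The following is a minimal sketch assuming NumPy and assuming that $\mathbf{X}_{1}$ consists of the first `m` components; the function name `conditional_mvn` and the numbers are hypothetical.

```python
import numpy as np

def conditional_mvn(mu, sigma, m, x2):
    """Mean and covariance of X1 | X2 = x2, with X1 the first m components."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    mu1, mu2 = mu[:m], mu[m:]
    s11, s12 = sigma[:m, :m], sigma[:m, m:]
    s21, s22 = sigma[m:, :m], sigma[m:, m:]
    # gain = Sigma_12 Sigma_22^{-1}, obtained by solving the transposed system
    # rather than inverting Sigma_22.
    gain = np.linalg.solve(s22.T, s12.T).T
    cond_mean = mu1 + gain @ (x2 - mu2)
    cond_cov = s11 - gain @ s21
    return cond_mean, cond_cov

# Bivariate example: condition the first coordinate on the second being 3.
cond_mean, cond_cov = conditional_mvn(
    mu=[0.0, 1.0],
    sigma=[[1.0, 0.8], [0.8, 2.0]],
    m=1,
    x2=[3.0],
)
# cond_mean is [0.8] and cond_cov is [[0.68]].
```

Note that the conditional covariance does not depend on the observed value of $\mathbf{X}_{2}$; only the conditional mean does.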
Multivariate Normality of Regression Coefficient Vector
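In the multiple regression model $\mathbf{y} = X \beta + \varepsilon$ with a design matrix $X \in \mathbb{R}^{n \times (1+p)}$ of full column rank and residuals $\varepsilon \sim N_{n} \left( \mathbf{0} , \sigma^{2} I_{n} \right)$, the least squares estimator $\hat{\beta} = \left( X^{T} X \right)^{-1} X^{T} \mathbf{y}$ of the regression coefficients follows the following multivariate normal distribution: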
$$ \hat{\beta} \sim N_{1+p} \left( \beta , \sigma^{2} \left( X^{T} X \right)^{-1} \right) $$
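This result can be seen as an instance of the linear transformation theorem. Assuming the usual linear model $\mathbf{y} = X \beta + \varepsilon$ with $\varepsilon \sim N_{n} \left( \mathbf{0} , \sigma^{2} I_{n} \right)$, the least squares estimator is $\hat{\beta} = A \mathbf{y}$ with $A = \left( X^{T} X \right)^{-1} X^{T}$. The sketch below, assuming NumPy and a made-up design matrix, checks numerically that $A \left( X \beta \right) = \beta$ and $A \left( \sigma^{2} I_{n} \right) A^{T} = \sigma^{2} \left( X^{T} X \right)^{-1}$.

```python
import numpy as np

# Hypothetical design matrix: an intercept column plus p = 2 predictors, n = 5.
X = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [1.0, 3.0, 5.0],
              [1.0, 4.0, 2.0]])
beta = np.array([1.0, -2.0, 0.5])   # illustrative true coefficients
sigma2 = 0.25                        # illustrative residual variance

# beta_hat = A y with A = (X^T X)^{-1} X^T, so beta_hat is linear in y.
A = np.linalg.solve(X.T @ X, X.T)

# y ~ N_n(X beta, sigma2 * I_n); apply the linear transformation theorem.
mean_beta_hat = A @ (X @ beta)
cov_beta_hat = A @ (sigma2 * np.eye(X.shape[0])) @ A.T

# Closed form stated in the theorem.
expected_cov = sigma2 * np.linalg.inv(X.T @ X)
```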
Moment Generating Function
The moment generating function of $\mathbf{X} \sim N_{p} \left( \mu , \Sigma \right)$ is as follows:
$$ M_{\mathbf{X}} \left( \mathbf{t} \right) = \exp \left( \mathbf{t}^{T} \mu + {{ 1 } \over { 2 }} \mathbf{t}^{T} \Sigma \mathbf{t} \right) \qquad , \mathbf{t} \in \mathbb{R}^{p} $$
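As a worked use of this formula, the linear transformation theorem can be sketched through the moment generating function. Take $\mathbf{Y} = A \mathbf{X} + \mathbf{b}$ with $A \in \mathbb{R}^{m \times p}$ and $\mathbf{b} \in \mathbb{R}^{m}$ (here the dimension of $\mathbf{X}$ is written $p$ to match this section). For $\mathbf{t} \in \mathbb{R}^{m}$:
$$ \begin{align*} M_{\mathbf{Y}} \left( \mathbf{t} \right) =& E \left[ \exp \left( \mathbf{t}^{T} \left( A \mathbf{X} + \mathbf{b} \right) \right) \right] \\ =& \exp \left( \mathbf{t}^{T} \mathbf{b} \right) E \left[ \exp \left( \left( A^{T} \mathbf{t} \right)^{T} \mathbf{X} \right) \right] \\ =& \exp \left( \mathbf{t}^{T} \mathbf{b} \right) \exp \left( \left( A^{T} \mathbf{t} \right)^{T} \mu + {{ 1 } \over { 2 }} \left( A^{T} \mathbf{t} \right)^{T} \Sigma \left( A^{T} \mathbf{t} \right) \right) \\ =& \exp \left( \mathbf{t}^{T} \left( A \mu + \mathbf{b} \right) + {{ 1 } \over { 2 }} \mathbf{t}^{T} \left( A \Sigma A^{T} \right) \mathbf{t} \right) \end{align*} $$
The last line has the form of the moment generating function of a multivariate normal distribution with mean vector $A \mu + \mathbf{b}$ and covariance matrix $A \Sigma A^{T}$.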
Entropy
The entropy of the multivariate normal distribution $N_{p}(\mu, \Sigma)$ is as follows:
$$ H = \dfrac{1}{2}\ln \left[ (2 \pi e)^{p} \left| \Sigma \right| \right] = \dfrac{1}{2}\ln (\det (2\pi e \Sigma)) $$
$\left| \Sigma \right|$ is the determinant of the covariance matrix.
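The entropy depends only on the dimension and on the determinant of $\Sigma$, not on $\mu$. The following is a minimal sketch assuming NumPy; the function name `mvn_entropy` is hypothetical, and the result is in nats because the natural logarithm is used.

```python
import numpy as np

def mvn_entropy(sigma):
    """Differential entropy (in nats) of N_p(mu, sigma); mu does not matter."""
    sigma = np.asarray(sigma, dtype=float)
    p = sigma.shape[0]
    # slogdet returns the sign and the log of |det|, which is numerically
    # safer than taking log(det(sigma)) for large p.
    sign, logdet = np.linalg.slogdet(sigma)
    if sign <= 0:
        raise ValueError("sigma must have a positive determinant")
    return 0.5 * (p * np.log(2.0 * np.pi * np.e) + logdet)

# Standard bivariate normal: the entropy equals ln(2 * pi * e).
entropy_std_2d = mvn_entropy(np.eye(2))
```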
Relative Entropy
The relative entropy (Kullback-Leibler divergence) between two $k$-dimensional multivariate normal distributions $N(\mu, \Sigma)$ and $N(\boldsymbol{\mu_{1}}, \Sigma_{1})$ is given by the following:
$$ \begin{array}{l} D_{\text{KL}}\big( N(\mu, \Sigma) \| N(\boldsymbol{\mu_{1}}, \Sigma_{1}) \big) \\[1em] = \dfrac{1}{2} \left[ \log \left( \dfrac{|\Sigma_{1}|}{|\Sigma|} \right) + \operatorname{Tr}(\Sigma_{1}^{-1}\Sigma) + (\mu - \boldsymbol{\mu_{1}})^{\mathsf{T}} \Sigma_{1}^{-1} (\mu - \boldsymbol{\mu_{1}}) - k \right] \end{array} $$
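A numerical sketch of the closed form for the relative entropy between two normal distributions follows, assuming NumPy. The function name `kl_mvn` is hypothetical. The log-determinant term is taken as $\log |\Sigma_{1}| - \log |\Sigma|$, which makes the divergence zero when the two distributions coincide and reproduces the univariate value $\frac{1}{2} \left( c - 1 - \log c \right)$ for $N(0, c)$ against $N(0, 1)$.

```python
import numpy as np

def kl_mvn(mu0, sigma0, mu1, sigma1):
    """KL( N(mu0, sigma0) || N(mu1, sigma1) ) in nats."""
    mu0 = np.asarray(mu0, dtype=float)
    mu1 = np.asarray(mu1, dtype=float)
    sigma0 = np.asarray(sigma0, dtype=float)
    sigma1 = np.asarray(sigma1, dtype=float)
    k = mu0.shape[0]                      # common dimension
    diff = mu0 - mu1
    _, logdet0 = np.linalg.slogdet(sigma0)
    _, logdet1 = np.linalg.slogdet(sigma1)
    # Tr(sigma1^{-1} sigma0) without forming the inverse explicitly.
    trace_term = np.trace(np.linalg.solve(sigma1, sigma0))
    # (mu0 - mu1)^T sigma1^{-1} (mu0 - mu1).
    quad_term = diff @ np.linalg.solve(sigma1, diff)
    return 0.5 * (logdet1 - logdet0 + trace_term + quad_term - k)

# Identical distributions have zero divergence.
kl_same = kl_mvn([0.0, 0.0], np.eye(2), [0.0, 0.0], np.eye(2))
# Unit-variance univariate normals whose means differ by 1: divergence 0.5.
kl_shift = kl_mvn([1.0], [[1.0]], [0.0], [[1.0]])
```

The divergence is not symmetric in its two arguments, so swapping the distributions generally changes the value.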
See Also
- Univariate Normal Distribution: When $p = 1$, $\mu \in \mathbb{R}^{1}$, and $\Sigma \in \mathbb{R}^{1 \times 1}$, the above probability density function becomes exactly the probability density function of the univariate normal distribution.
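To make the reduction explicit, write $\Sigma = \sigma^{2}$ for the single variance (the symbol $\sigma^{2}$ is introduced here only for this check). Then $\det \Sigma = \sigma^{2}$ and $\Sigma^{-1} = 1 / \sigma^{2}$, so the density in the definition becomes
$$ f (x) = \left( 2 \pi \sigma^{2} \right)^{-1/2} \exp \left[ - {{ 1 } \over { 2 }} \left( x - \mu \right) {{ 1 } \over { \sigma^{2} }} \left( x - \mu \right) \right] = {{ 1 } \over { \sqrt{2 \pi} \sigma }} \exp \left[ - {{ \left( x - \mu \right)^{2} } \over { 2 \sigma^{2} }} \right] \qquad , x \in \mathbb{R} $$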