Entropy of Normal Distribution 📂Probability Distribution

Entropy of Normal Distribution

Theorem

The entropy of the normal distribution $N(\mu, \sigma^{2})$ (when using natural logarithms) is as follows.

$H = \dfrac{1}{2} \ln (2\pi e \sigma^{2}) = \ln \sqrt{2\pi e \sigma^{2}}$

The entropy of the multivariate normal distribution $N_{p}(\boldsymbol{\mu}, \Sigma)$ is as follows.

$H = \dfrac{1}{2}\ln \left[ (2 \pi e)^{p} \left| \Sigma \right| \right] = \dfrac{1}{2}\ln (\det (2\pi e \Sigma))$

$\left| \Sigma \right|$ is the determinant of the covariance matrix.

Description

The mean $\mu$ does not affect the entropy. The entropy of the standard normal distribution $N(0,1)$ when using natural logarithms is approximately $H = \ln \sqrt{2\pi e } \approx 1.4189385332046727$ . Even if a log base $2$ is used, the form of the formula remains the same, and its value is,

$H = \log_{2} \sqrt{2\pi e } \approx 2.047095585180641$

Proof

Univariate Normal Distribution

To show this, we use that the integral of $p(x) = \dfrac{1}{\sqrt{2\pi\sigma^{2}}}\exp\left( - \dfrac{(x-\mu)^{2}}{2\sigma^{2}} \right)$ is $1$ .

$\begin{align*} H &= - \int_{-\infty}^{\infty} p(x) \ln p(x) dx \\ &= - \int_{-\infty}^{\infty} p(x) \ln \left[ \dfrac{1}{\sqrt{2\pi\sigma^{2}}}\exp\left( - \dfrac{(x-\mu)^{2}}{2\sigma^{2}} \right) \right] dx \\ &= - \int_{-\infty}^{\infty} p(x) \ln \dfrac{1}{\sqrt{2\pi\sigma^{2}}} dx - \int_{-\infty}^{\infty} p(x) \ln \exp\left( - \dfrac{(x-\mu)^{2}}{2\sigma^{2}} \right) dx \\ &= -\ln \dfrac{1}{\sqrt{2\pi\sigma^{2}}} + \int_{-\infty}^{\infty} p(x) \dfrac{(x-\mu)^{2}}{2\sigma^{2}} dx \\ &= \ln \sqrt{2\pi\sigma^{2}} + \dfrac{1}{2\sigma^{2}}\int_{-\infty}^{\infty} p(x) (x-\mu)^{2} dx \\ &= \ln \sqrt{2\pi\sigma^{2}} + \dfrac{1}{2\sigma^{2}} E[(X-\mu)^{2}] \\ &= \ln \sqrt{2\pi\sigma^{2}} + \dfrac{1}{2\sigma^{2}}\sigma^{2} \\ &= \ln \sqrt{2\pi\sigma^{2}} + \dfrac{1}{2} \\ &= \ln \sqrt{2\pi\sigma^{2}} + \ln \sqrt{e} \\ &= \ln \sqrt{2\pi e \sigma^{2}} \end{align*}$

■

Multivariate Normal Distribution

Since the probability density function of the multivariate normal distribution is $p(\mathbf{x}) = \dfrac{1}{\sqrt{(2\pi)^{p} \left| \Sigma \right|}} \exp \left( -\dfrac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^{T} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right)$ ,

$\begin{align*} H(p) &= -\int p(\mathbf{x}) \ln(p(\mathbf{x}))d \mathbf{x} \\ &= -\int p(\mathbf{x}) \ln \left[ \dfrac{1}{\sqrt{(2\pi)^{p} \left| \Sigma \right|}} \exp \left( -\dfrac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^{T} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right) \right] \\ &= -\int p(\mathbf{x}) \ln\left( \dfrac{1}{\sqrt{(2\pi)^{p} \left| \Sigma \right|}} \right)d \mathbf{x} + \dfrac{1}{2}\int p(\mathbf{x}) (\mathbf{x} - \boldsymbol{\mu})^{T} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})d \mathbf{x} \\ &= -\ln\left( \dfrac{1}{\sqrt{(2\pi)^{p} \left| \Sigma \right|}} \right)\int p(\mathbf{x}) d \mathbf{x} + \dfrac{1}{2} E \left[ (\mathbf{x} - \boldsymbol{\mu})^{T} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right] \\ &= -\ln\left( \dfrac{1}{\sqrt{(2\pi)^{p} \left| \Sigma \right|}} \right) + \dfrac{1}{2} E \left[ (\mathbf{x} - \boldsymbol{\mu})^{T} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right] \end{align*}$

The second term is calculated as follows.

$\begin{align*} E \left[ (\mathbf{x} - \boldsymbol{\mu})^{T} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right] &= E \left[ \tr \left( (\mathbf{x} - \boldsymbol{\mu})^{T} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right) \right] \\ &= E \left[ \tr \left( \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) (\mathbf{x} - \boldsymbol{\mu})^{T} \right) \right] \\ &= \tr \left[ E \left( \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) (\mathbf{x} - \boldsymbol{\mu})^{T} \right) \right] \\ &= \tr \left[ \Sigma^{-1} E \left( (\mathbf{x} - \boldsymbol{\mu}) (\mathbf{x} - \boldsymbol{\mu})^{T} \right) \right] \\ &= \tr \left[ \Sigma^{-1} \Sigma \right] \\ &= \tr \left[ I_{p\times p} \right] \\ &= p \end{align*}$

The first equality is because of $1 \times 1$ matrix $A$ for $A = \tr(A)$ ,
The second equality is due to the cyclic property of trace,
The third equality is because expectation and trace are exchangeable,
The fourth equality is due to the properties of expectation of a matrix,
The fifth equality is due to the definition of the covariance matrix.

Therefore, the entropy is as follows.

$\begin{align*} H(p) &= -\ln\left( \dfrac{1}{\sqrt{(2\pi)^{p} \left| \Sigma \right|}} \right) + \dfrac{1}{2} E \left[ (\mathbf{x} - \boldsymbol{\mu})^{T} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right] \\ &= \dfrac{1}{2} \ln \left[ (2\pi)^{p} \left| \Sigma \right| \right] + \dfrac{1}{2}p \\ &= \dfrac{1}{2} \ln \left[ (2\pi)^{p} \left| \Sigma \right| \right] + \dfrac{1}{2}\ln e^{p} \\ &= \dfrac{1}{2} \ln \left[ (2\pi e)^{p} \left| \Sigma \right| \right] \end{align*}$

■