Normal Distribution
Definition
A continuous probability distribution $N \left( \mu,\sigma^{2} \right)$ with the following probability density function, given mean $\mu \in \mathbb{R}$ and variance $\sigma^{2} > 0$, is called the Normal Distribution.
$$ f(x) = {{ 1 } \over { \sqrt{2 \pi} \sigma }} \exp \left[ - {{ 1 } \over { 2 }} \left( {{ x - \mu } \over { \sigma }} \right)^{2} \right] \qquad, x \in \mathbb{R} $$
In particular, the Standard Normal Distribution $N \left( 0,1^{2} \right)$ is defined by the following probability density function.
$$ f(z) = {{ 1 } \over { \sqrt{2 \pi} }} \exp \left[ - {{ z^{2} } \over { 2 }} \right] $$
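As a quick numerical check, the density above can be coded directly and compared against the pdf function from the Distributions.jl package (the same package used in the Code section below). This is only a minimal sketch; the parameter values are arbitrary.
using Distributions

# Density of N(μ, σ²), written straight from the definition above
f(x, μ, σ) = exp(-((x - μ) / σ)^2 / 2) / (sqrt(2π) * σ)

μ, σ = 1.0, 2.0                     # arbitrary example parameters
for x in (-1.0, 0.0, 2.5)
    # should agree with Distributions.jl up to floating-point error
    println(f(x, μ, σ), "  ", pdf(Normal(μ, σ), x))
end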
Description
Another name for the normal distribution is the Gaussian Distribution. Historically, it became widely known when Gauss introduced it in his 1809 work on the method of least squares. Though it is not certain that Gauss was the first to grasp the essence of the normal distribution, he certainly deserves the nickname associated with it.
In 1794, at the mere age of seventeen, Gauss hit upon a method for determining the true value from measurements encountered in daily life or research. Counting his steps along a frequently traveled path, he collected data and plotted it on a graph, obtaining a bell-shaped curve. This discovery was made in an era before the concept of a histogram even existed. Gauss himself assumed that the normal distribution and the method of least squares were already widely known and commonly used techniques. Truly an overwhelming display of genius. Incidentally, the Gaussian integral appears in many calculations involving the normal distribution.
Since then, the normal distribution has been studied extensively and has become an indispensable tool across the sciences. It is so familiar that laypeople often believe statistics amounts to assuming the data follow a normal distribution, computing the mean and variance, and drawing a conclusion. While it would be disappointing to enter a statistics degree with that kind of undervaluation in mind, perhaps that level of explanation is sufficient for non-specialists; it is a testament to how important and powerful the normal distribution is.
Basic Properties
Moment Generating Function
- [1]: $$m(t) = \exp \left( \mu t + {{ \sigma^{2} t^{2} } \over { 2 }} \right) \qquad , t \in \mathbb{R}$$
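The formula in [1] can be verified by simulation: for a fixed $t$, the average of $\exp(tX)$ over many draws of $X \sim N \left( \mu, \sigma^{2} \right)$ should approximate $m(t)$. Below is a minimal sketch; the sample size and parameter values are arbitrary.
using Distributions, Statistics, Random

Random.seed!(1)
μ, σ, t = 0.5, 1.5, 0.3                       # arbitrary example values
X = rand(Normal(μ, σ), 10^6)

mgf_mc     = mean(exp.(t .* X))               # Monte Carlo estimate of E[exp(tX)]
mgf_closed = exp(μ * t + σ^2 * t^2 / 2)       # closed form from [1]
println(mgf_mc, "  ", mgf_closed)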
Mean and Variance
- [2] : If $X \sim N\left( \mu , \sigma^{2} \right)$ then $$ \begin{align*} E(X) =& \mu \\ \operatorname{Var} (X) =& \sigma^{2} \end{align*} $$
Sufficient Statistics and Maximum Likelihood Estimator
- [3] : Suppose a random sample $\mathbf{X} := \left( X_{1} , \cdots , X_{n} \right) \sim N \left( \mu , \sigma^{2} \right)$ is drawn from the normal distribution.
The sufficient statistic $T$ and maximum likelihood estimator $\left( \hat{\mu}, \widehat{\sigma^{2}} \right)$ are as follows. $$ \begin{align*} T =& \left( \sum_{k} X_{k}, \sum_{k} X_{k}^{2} \right) \\ \left( \hat{\mu}, \widehat{\sigma^{2}} \right) =& \left( {{ 1 } \over { n }} \sum_{k} X_{k}, {{ 1 } \over { n }} \sum_{k} \left( X_{k} - \overline{X} \right)^{2} \right) \end{align*} $$
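As a minimal numerical sketch of [3], the statistics can be computed from a simulated sample; note that $\widehat{\sigma^{2}}$ divides by $n$, not $n - 1$. The sample size and true parameters below are arbitrary, and fit_mle from Distributions.jl should give the same estimates (reporting $\sigma$ rather than $\sigma^{2}$).
using Distributions, Statistics, Random

Random.seed!(1)
n = 10_000
X = rand(Normal(2.0, 3.0), n)               # arbitrary true parameters μ = 2, σ = 3

T = (sum(X), sum(X .^ 2))                   # sufficient statistic
mu_hat   = mean(X)                          # MLE of μ
sig2_hat = sum((X .- mu_hat) .^ 2) / n      # MLE of σ² (divisor n, not n - 1)
println((mu_hat, sig2_hat), "  ", fit_mle(Normal, X))   # fit_mle returns σ, not σ²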
Entropy
- [4] : When choosing the natural logarithm, the entropy of the normal distribution is as follows. $$ H = \ln \sqrt{2\pi e \sigma^{2}} $$
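The entropy in [4] can also be checked against the entropy function of Distributions.jl, which uses the natural logarithm; a minimal sketch with an arbitrary $\sigma$:
using Distributions

σ = 2.5                                     # arbitrary example value
H_closed = log(sqrt(2π * ℯ * σ^2))          # formula from [4]
H_pkg    = entropy(Normal(0.0, σ))          # Distributions.jl, in nats
println(H_closed, "  ", H_pkg)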
Theorems
The importance of the normal distribution hardly needs lengthy explanation; simply listing the following results makes the point:
Central Limit Theorem
- [a]: If $\left\{ X_{k} \right\}_{k=1}^{n}$ are iid random variables with mean $\mu$ and finite variance $\sigma^{2}$, then $$ \sqrt{n} {{ \overline{X}_n - \mu } \over {\sigma}} \overset{D}{\to} N (0,1) $$
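To illustrate [a], one can simulate standardized sample means from a clearly non-normal distribution, say Exponential(1) (whose mean and variance both equal 1), and compare empirical quantiles with those of $N(0,1)$. A minimal sketch; the distribution, sample size, and number of repetitions are arbitrary choices.
using Distributions, Statistics, Random

Random.seed!(1)
n, reps = 100, 10^5
μ, σ = 1.0, 1.0                             # mean and standard deviation of Exponential(1)

# standardized sample means √n (X̄ - μ) / σ
Z = [sqrt(n) * (mean(rand(Exponential(1.0), n)) - μ) / σ for _ in 1:reps]

for p in (0.05, 0.5, 0.95)
    println(quantile(Z, p), "  ", quantile(Normal(), p))   # empirical vs. N(0,1) quantiles
end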
Relationship with Chi-Squared Distribution
- [b]: If $X \sim N(\mu,\sigma ^2)$ then $$ V=\left( { X - \mu \over \sigma} \right) ^2 \sim \chi ^2 (1) $$
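A quick Monte Carlo sketch of [b]: the empirical CDF of $V$ should match that of $\chi^{2}(1)$ at any point (the parameters and evaluation points below are arbitrary).
using Distributions, Statistics, Random

Random.seed!(1)
μ, σ = -1.0, 2.0                            # arbitrary parameters
V = ((rand(Normal(μ, σ), 10^6) .- μ) ./ σ) .^ 2

for q in (0.5, 1.0, 3.0)
    println(mean(V .<= q), "  ", cdf(Chisq(1), q))   # empirical vs. χ²(1) CDF
end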
Derivation of Standard Normal Distribution as Limiting Distribution of Binomial Distribution
- [c]: If $X_i \sim B(1,p)$ and $Y_n = X_1 + X_2 + \cdots + X_n$, then $Y_n \sim B(n,p)$ and $$ { { Y_n - np } \over {\sqrt{ np(1-p) } } }\overset{D}{\to} N(0,1) $$
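As a sketch of [c], for a moderately large $n$ the standardized binomial CDF is already well approximated by the standard normal CDF; the values of $n$, $p$, and the evaluation points are arbitrary.
using Distributions

n, p = 200, 0.3                             # arbitrary example values
Y = Binomial(n, p)

for y in (45, 60, 75)
    z = (y - n * p) / sqrt(n * p * (1 - p))          # standardization from [c]
    println(cdf(Y, y), "  ", cdf(Normal(), z))       # exact vs. normal approximation
end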
Derivation of Standard Normal Distribution as Limiting Distribution of Poisson Distribution
- [d]: If $X_{n} \sim \text{Poi} \left( n \right)$ and $\displaystyle Y_{n} := {{ X_{n} - n } \over { \sqrt{n} }}$, then $$ Y_{n} \overset{D}{\to} N(0,1) $$
Derivation of Standard Normal Distribution as Limiting Distribution of Student’s t-Distribution
- [e]: If $T_n \sim t(n)$, then $$ T_n \ \overset{D}{\to} N(0,1) $$
Derivation of t-Distribution from Normal and Chi-Squared Distributions
- [f]: Given two random variables $W,V$ are independent and $W \sim N(0,1)$, $V \sim \chi^{2} (r)$, then $$ T = { {W} \over {\sqrt{V/r} } } \sim t(r) $$
Proofs
Strategy: Start by deriving the moment generating function of the standard normal distribution by making the exponent a complete square to enable the use of Gaussian integration, then obtain the moment generating function of the normal distribution through substitution.
Gaussian Integration: $$ \int_{-\infty}^{\infty} e^{-x^2} dx= \sqrt{\pi} $$
[1]
Given $\displaystyle Z := {{ X - \mu } \over { \sigma }} \sim N(0,1)$, its moment generating function is
$$ \begin{align*} m_{Z}(t) =& \int_{-\infty}^{\infty} \exp (tz) {{ 1 } \over { \sqrt{2 \pi} }} \exp \left[ - {{ 1 } \over { 2 }} z^{2} \right] dz \\ =& {{ 1 } \over { \sqrt{\pi} }} \int_{-\infty}^{\infty} {{ 1 } \over { \sqrt{2} }} \exp \left[ - {{ 1 } \over { 2 }} z^{2} + tz \right] dz \\ =& {{ 1 } \over { \sqrt{\pi} }} \int_{-\infty}^{\infty} {{ 1 } \over { \sqrt{2} }} \exp \left[ - {{ 1 } \over { 2 }} \left( z - t \right)^{2} + {{ t^{2} } \over { 2 }} \right] dz \\ =& {{ 1 } \over { \sqrt{\pi} }} \int_{-\infty}^{\infty} {{ 1 } \over { \sqrt{2} }} \exp \left[ - {{ 1 } \over { 2 }} \left( z - t \right)^{2} \right] \exp \left[ {{ t^{2} } \over { 2 }} \right] dz \\ =& \exp \left[ {{ t^{2} } \over { 2 }} \right] {{ 1 } \over { \sqrt{\pi} }} \int_{-\infty}^{\infty} {{ 1 } \over { \sqrt{2} }} \exp \left[ - w^{2} \right] \sqrt{2} dw \\ =& \exp \left[ {{ t^{2} } \over { 2 }} \right] \end{align*} $$
Then, the moment generating function for $X \sim N \left( \mu , \sigma^{2} \right)$ is
$$ \begin{align*} m_{X}(t) =& E \left[ \exp ( t X ) \right] \\ =& E \left[ \exp \left( t (\sigma Z + \mu) \right) \right] \\ =& \exp(\mu t) E \left[ \exp \left( t \sigma Z \right) \right] \\ =& \exp(\mu t) \exp \left( {{ t^{2} \sigma^{2} } \over { 2 }} \right) \\ =& \exp \left( \mu t + {{ \sigma^{2} t^{2} } \over { 2 }} \right) \end{align*} $$
■
[2]
Direct deduction using the moment generating function.
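Differentiating the moment generating function from [1] and evaluating at $t = 0$:
$$ \begin{align*} E(X) =& m'(0) = \left. \left( \mu + \sigma^{2} t \right) \exp \left( \mu t + {{ \sigma^{2} t^{2} } \over { 2 }} \right) \right|_{t=0} = \mu \\ E \left( X^{2} \right) =& m''(0) = \left. \left[ \sigma^{2} + \left( \mu + \sigma^{2} t \right)^{2} \right] \exp \left( \mu t + {{ \sigma^{2} t^{2} } \over { 2 }} \right) \right|_{t=0} = \sigma^{2} + \mu^{2} \end{align*} $$
so that $\operatorname{Var}(X) = E \left( X^{2} \right) - \left[ E(X) \right]^{2} = \sigma^{2}$.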
■
[3]
■
[4]
■
[a]
Application of the moment method.
■
[b]
■
[c]
Proven using the central limit theorem.
■
[d]
Proven using the moment generating function.
■
[e]
■
[f]
Conceptually simple but computationally involved; direct deduction using the probability density function.
■
Code
Below is Julia code that plots the probability density functions of the Cauchy distribution, the t-distribution, and the standard normal distribution.
@time using LaTeXStrings
@time using Distributions
@time using Plots

cd(@__DIR__)

# evaluation grid for the densities
x = -4:0.1:4

# Cauchy density: heavy tails, no finite mean or variance
plot(x, pdf.(Cauchy(), x),
    color = :red,
    label = "Cauchy", size = (400, 300))

# t-distribution with 3 degrees of freedom
plot!(x, pdf.(TDist(3), x),
    color = :orange,
    label = "t(3)", size = (400, 300))

# t-distribution with 30 degrees of freedom, already close to the normal
plot!(x, pdf.(TDist(30), x),
    color = :black, linestyle = :dash,
    label = "t(30)", size = (400, 300))

# standard normal density for comparison
plot!(x, pdf.(Normal(), x),
    color = :black,
    label = "Standard Normal", size = (400, 300))

xlims!(-4, 5); ylims!(0, 0.5); title!(L"\mathrm{pdf\,of\, t}(\nu)")
png("pdf")