Sufficient Statistics and Maximum Likelihood Estimators of a Normal Distribution

Theorem

A given random sample $\mathbf{X} := \left( X_{1} , \cdots , X_{n} \right) \sim N \left( \mu , \sigma^{2} \right)$ follows a normal distribution.

The sufficient statistic $T$ and maximum likelihood estimator $\left( \hat{\mu}, \widehat{\sigma^{2}} \right)$ for $\left( \mu, \sigma^{2} \right)$ are as follows:
$$
\begin{align*} T =& \left( \sum_{k} X_{k}, \sum_{k} X_{k}^{2} \right) \\ \left( \hat{\mu}, \widehat{\sigma^{2}} \right) =& \left( {{ 1 } \over { n }} \sum_{k} X_{k}, {{ 1 } \over { n }} \sum_{k} \left( X_{k} - \overline{X} \right)^{2} \right) \end{align*}
$$

Proof

Sufficient Statistic

$$
\begin{align*} f \left( \mathbf{x} ; \mu, \sigma^{2} \right) =& \prod_{k=1}^{n} f \left( x_{k} ; \mu, \sigma^{2} \right) \\ =& \prod_{k=1}^{n} {{ 1 } \over { \sqrt{2 \pi} \sigma }} \exp \left[ - {{ 1 } \over { 2 }} \left( {{ x_{k} - \mu } \over { \sigma }} \right)^{2} \right] \\ =& {{ 1 } \over { \sqrt{2 \pi}^{n} \sigma^{n} }} \exp \left[ - {{ 1 } \over { 2 \sigma^{2} }} \sum_{k=1}^{n} x_{k}^{2} \right] \exp \left[ {{ 1 } \over { \sigma^{2} }} \sum_{k=1}^{n} \mu x_{k} \right] \exp \left[ - {{ 1 } \over { 2 \sigma^{2} }} n \mu^{2} \right] \\ \overset{\mu}{=}& \exp \left[ {{ \mu } \over { \sigma^{2} }} \sum_{k=1}^{n} x_{k} - {{ 1 } \over { 2 \sigma^{2} }} n \mu^{2} \right] \cdot {{ 1 } \over { \sqrt{2 \pi}^{n} \sigma^{n} }} \exp \left[ - {{ 1 } \over { 2 \sigma^{2} }} \sum_{k=1}^{n} x_{k}^{2} \right] \\ \overset{\sigma}{=}& {{ 1 } \over { \sqrt{2 \pi}^{n} \sigma^{n} }} \exp \left[ - {{ 1 } \over { 2 \sigma^{2} }} \sum_{k=1}^{n} x_{k}^{2} \right] \exp \left[ {{ 1 } \over { \sigma^{2} }} \sum_{k=1}^{n} \mu x_{k} \right] \exp \left[ - {{ 1 } \over { 2 \sigma^{2} }} n \mu^{2} \right] \cdot 1 \end{align*}
$$

Neyman Factorization Theorem: Consider a random sample $X_{1} , \cdots , X_{n}$ with the same probability mass/density function $f \left( x ; \theta \right)$ for a parameter $\theta \in \Theta$. A statistic $Y = u_{1} \left( X_{1} , \cdots , X_{n} \right)$ is a sufficient statistic for $\theta$ if there exist two non-negative functions $k_{1} , k_{2} \ge 0$ that satisfy the following:
$$ f \left( x_{1} ; \theta \right) \cdots f \left( x_{n} ; \theta \right) = k_{1} \left[ u_{1} \left( x_{1} , \cdots , x_{n} \right) ; \theta \right] k_{2} \left( x_{1} , \cdots , x_{n} \right) $$
Note that $k_{2}$ must not depend on $\theta$.

According to the Neyman Factorization Theorem, with $k_{2} = 1$ in the factorization above, $T := \left( \sum_{k} X_{k}, \sum_{k} X_{k}^{2} \right)$ is a sufficient statistic for $\left( \mu, \sigma^{2} \right)$.
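Concretely, sufficiency means the pair $T = \left( \sum_{k} X_{k}, \sum_{k} X_{k}^{2} \right)$ carries everything the sample says about $\left( \mu, \sigma^{2} \right)$; in particular, both maximum likelihood estimators below can be recovered from $T$ alone. A minimal numerical sketch (hypothetical sample values, standard library only):

```python
import random

random.seed(0)
# hypothetical sample from N(mu=2, sigma^2=9)
x = [random.gauss(2, 3) for _ in range(1000)]
n = len(x)

# sufficient statistic T = (sum of x_k, sum of x_k^2)
T1 = sum(x)
T2 = sum(xk * xk for xk in x)

# both MLEs are functions of T alone
mu_hat = T1 / n
sigma2_hat = T2 / n - mu_hat ** 2  # algebraically (1/n) * sum (x_k - xbar)^2

# direct computation from the raw sample agrees
direct = sum((xk - mu_hat) ** 2 for xk in x) / n
print(abs(sigma2_hat - direct) < 1e-8)  # → True
```

The identity used in the comment, $\frac{1}{n} \sum_{k} \left( x_{k} - \overline{x} \right)^{2} = \frac{1}{n} \sum_{k} x_{k}^{2} - \overline{x}^{2}$, is exactly the reason the raw sums suffice.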

Maximum Likelihood Estimator

$$
\begin{align*} \log L \left( \mu, \sigma^{2} ; \mathbf{x} \right) =& \log f \left( \mathbf{x} ; \mu, \sigma^{2} \right) \\ =& - n \log \left( \sigma \sqrt{2 \pi} \right) - {{ 1 } \over { 2 \sigma^{2} }} \sum_{k=1}^{n} x_{k}^{2} + {{ 1 } \over { \sigma^{2} }} \sum_{k=1}^{n} \mu x_{k} - {{ 1 } \over { 2 \sigma^{2} }} n \mu^{2} \end{align*}
$$
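As a sanity check, the expanded form of the log-likelihood can be compared numerically with the direct sum of normal log-densities $\sum_{k} \log f \left( x_{k} ; \mu, \sigma^{2} \right)$; a short sketch with hypothetical parameter values:

```python
import math
import random

random.seed(1)
mu, sigma = 1.5, 2.0
x = [random.gauss(mu, sigma) for _ in range(50)]
n = len(x)

# direct log-likelihood: sum of normal log-densities
direct = sum(-math.log(sigma * math.sqrt(2 * math.pi))
             - 0.5 * ((xk - mu) / sigma) ** 2 for xk in x)

# expanded form from the derivation above
expanded = (-n * math.log(sigma * math.sqrt(2 * math.pi))
            - sum(xk ** 2 for xk in x) / (2 * sigma ** 2)
            + mu * sum(x) / sigma ** 2
            - n * mu ** 2 / (2 * sigma ** 2))

print(abs(direct - expanded) < 1e-9)  # → True
```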

The log-likelihood function of the random sample is as shown above. For the likelihood function to reach its maximum value, the partial derivatives with respect to $\mu$ and $\sigma$ must be $0$. First, setting the partial derivative with respect to $\mu$ to $0$,
$$
\begin{align*} & 0 = {{ 1 } \over { \sigma^{2} }} \sum_{k=1}^{n} x_{k} - {{ 1 } \over { \sigma^{2} }} n \mu \\ \implies & \mu = {{ 1 } \over { n }} \sum_{k=1}^{n} x_{k} \end{align*}
$$

thus, regardless of $\sigma$, $\hat{\mu} = \sum_{k=1}^{n} X_{k} / n$ holds. Next, setting the partial derivative with respect to $\sigma$ to $0$,
$$
\begin{align*} & 0 = - {{ n } \over { \sigma }} + {{ 1 } \over { \sigma^{3} }} \sum_{k=1}^{n} x_{k}^{2} - {{ 2 } \over { \sigma^{3} }} \sum_{k=1}^{n} \mu x_{k} + {{ 1 } \over { \sigma^{3} }} n \mu^{2} \\ \implies & n \sigma^{2} = \sum_{k=1}^{n} x_{k}^{2} - 2 \sum_{k=1}^{n} \mu x_{k} + n \mu^{2} \\ \implies & \sigma^{2} = {{ 1 } \over { n }} \sum_{k=1}^{n} \left( x_{k} - \mu \right)^{2} \end{align*}
$$

thus, substituting $\hat{\mu} = \sum_{k=1}^{n} X_{k} / n = \overline{X}$ for $\mu$, $\widehat{\sigma^{2}} = \sum_{k=1}^{n} \left( X_{k} - \overline{X} \right)^{2} / n$ holds.
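Strictly speaking, vanishing partial derivatives only identify a critical point; since this is the unique critical point and the log-likelihood tends to $-\infty$ at the parameter boundary, it is indeed the maximum. This can be spot-checked numerically by confirming the closed-form estimates beat nearby parameter values (hypothetical sample, standard library only):

```python
import math
import random

random.seed(2)
x = [random.gauss(0.7, 1.3) for _ in range(500)]
n = len(x)

def loglik(mu, sigma2):
    """Normal log-likelihood in the (mu, sigma^2) parameterization."""
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - sum((xk - mu) ** 2 for xk in x) / (2 * sigma2))

# closed-form MLEs from the derivation above
mu_hat = sum(x) / n
s2_hat = sum((xk - mu_hat) ** 2 for xk in x) / n

best = loglik(mu_hat, s2_hat)
# every nearby perturbation of the estimates scores strictly lower
for dm in (-0.05, 0.05):
    for ds in (-0.05, 0.05):
        assert loglik(mu_hat + dm, s2_hat + ds) < best
print("closed-form MLE beats all nearby perturbations")
```

The grid of perturbations is of course no proof; it merely illustrates that the critical point found analytically behaves as a maximum on this sample.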