
Sufficient Statistics and Maximum Likelihood Estimators for the Geometric Distribution

Theorem

Given a random sample $\mathbf{X} := \left( X_{1} , \cdots , X_{n} \right) \sim \text{Geo} \left( p \right)$ that follows a geometric distribution, the sufficient statistic $T$ and the maximum likelihood estimator $\hat{p}$ for $p$ are as follows.
$$
\begin{align*}
T =& \sum_{k=1}^{n} X_{k}
\\ \hat{p} =& {{ n } \over { \sum_{k=1}^{n} X_{k} }}
\end{align*}
$$

Proof

Sufficient Statistic

$$
\begin{align*}
f \left( \mathbf{x} ; p \right) =& \prod_{k=1}^{n} f \left( x_{k} ; p \right)
\\ =& \prod_{k=1}^{n} p \left( 1 - p \right)^{x_{k} - 1}
\\ =& p^{n} \left( 1 - p \right)^{\sum_{k} x_{k} - n}
\\ =& p^{n} \left( 1 - p \right)^{\sum_{k} x_{k} - n} \cdot 1
\end{align*}
$$

Neyman factorization theorem: Suppose a random sample $X_{1} , \cdots , X_{n}$ has the same probability mass/density function $f \left( x ; \theta \right)$ with respect to the parameter $\theta \in \Theta$. A statistic $Y = u_{1} \left( X_{1} , \cdots , X_{n} \right)$ is a sufficient statistic for $\theta$ if and only if there exist two non-negative functions $k_{1} , k_{2} \ge 0$ such that
$$
f \left( x_{1} ; \theta \right) \cdots f \left( x_{n} ; \theta \right) = k_{1} \left[ u_{1} \left( x_{1} , \cdots , x_{n} \right) ; \theta \right] k_{2} \left( x_{1} , \cdots , x_{n} \right)
$$
However, $k_{2}$ must not depend on $\theta$.

According to the Neyman factorization theorem, $T := \sum_{k} X_{k}$ is a sufficient statistic for $p$.
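To spell out how the theorem applies, one natural reading of the last line of the computation above identifies the two factors as
$$
k_{1} \left[ u_{1} \left( x_{1} , \cdots , x_{n} \right) ; p \right] = p^{n} \left( 1 - p \right)^{\sum_{k} x_{k} - n} , \qquad k_{2} \left( x_{1} , \cdots , x_{n} \right) = 1
$$
where $k_{2} = 1$ clearly does not depend on $p$.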

Maximum Likelihood Estimator

$$
\begin{align*}
\log L \left( p ; \mathbf{x} \right) =& \log f \left( \mathbf{x} ; p \right)
\\ =& \log p^{n} \left( 1 - p \right)^{\sum_{k} x_{k} - n}
\\ =& n \log p + \left( \sum_{k=1}^{n} x_{k} - n \right) \log \left( 1 - p \right)
\end{align*}
$$

The log-likelihood function of the random sample is as above, and the likelihood is maximized where its partial derivative with respect to $p$ equals $0$. Therefore,
$$
\begin{align*}
& 0 = n {{ 1 } \over { p }} - {{ 1 } \over { 1 - p }} \left( \sum_{k=1}^{n} x_{k} - n \right)
\\ \implies & {{ n } \over { p }} + {{ n } \over { 1 - p }} = {{ 1 } \over { 1 - p }} \sum_{k=1}^{n} x_{k}
\\ \implies & {{ n } \over { p(1-p) }} = {{ 1 } \over { 1 - p }} \sum_{k=1}^{n} x_{k}
\\ \implies & {{ 1 } \over { p }} = {{ 1 } \over { n }} \sum_{k=1}^{n} x_{k}
\end{align*}
$$
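As a quick check that this stationary point is indeed a maximum, a step the argument above leaves implicit, note that the second derivative of the log-likelihood is negative on $0 < p < 1$, since every $x_{k} \ge 1$ implies $\sum_{k} x_{k} - n \ge 0$.
$$
{{ \partial^{2} } \over { \partial p^{2} }} \log L \left( p ; \mathbf{x} \right) = - {{ n } \over { p^{2} }} - {{ \sum_{k=1}^{n} x_{k} - n } \over { \left( 1 - p \right)^{2} }} < 0
$$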

Thus, the maximum likelihood estimator $\hat{p}$ for $p$ is as follows.
$$
\hat{p} = {{ n } \over { \sum_{k=1}^{n} X_{k} }}
$$
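As a sanity check, here is a minimal simulation sketch (assuming NumPy, whose geometric sampler uses the same support $\{ 1, 2, \cdots \}$ as the pmf $p(1-p)^{x-1}$ above) comparing $\hat{p} = n / \sum_{k} X_{k}$ against the true $p$; the values of `p_true`, `n`, and the seed are arbitrary choices for illustration.

```python
import numpy as np

# Minimal simulation sketch: draw a geometric random sample with success
# probability p_true and compare the MLE n / sum(X_k) against p_true.
rng = np.random.default_rng(42)   # fixed seed, arbitrary choice

p_true = 0.3    # true success probability (hypothetical value)
n = 10_000      # sample size

# NumPy's geometric sampler is supported on {1, 2, ...}, matching the
# pmf p(1-p)^(x-1) used in the proof above.
x = rng.geometric(p_true, size=n)

T = x.sum()       # sufficient statistic T = sum of the observations
p_hat = n / T     # maximum likelihood estimator

print(f"true p = {p_true}, p_hat = {p_hat:.4f}")
```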