Sufficient Statistics and Maximum Likelihood Estimators for the Uniform Distribution
Theorem
Let’s assume we have a random sample $\mathbf{X} := (X_{1}, \cdots, X_{n}) \sim U(0, \theta)$ following a uniform distribution.
The sufficient statistic $T$ and maximum likelihood estimator $\hat{\theta}$ for $\theta$ are as follows:
$$
T = \hat{\theta} = \max_{k=1,\cdots,n} X_{k}
$$
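As a quick empirical sanity check (a sketch added here, not part of the original, assuming NumPy), drawing ever larger samples from $U(0, 2)$ shows $\max X_{k}$ creeping up toward $\theta = 2$ from below:

```python
import numpy as np

rng = np.random.default_rng(42)
theta = 2.0                              # true parameter of U(0, theta)
for n in (10, 100, 10_000):
    x = rng.uniform(0.0, theta, size=n)  # a random sample of size n
    print(n, x.max())                    # max X_k approaches theta from below
```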
Proof
Strategy: The sufficient statistic and maximum likelihood estimator of the uniform distribution are must-know statistics, their practicality aside, especially for homework, midterms, and final exams. They can be derived directly from the definitions, but at first this is not always straightforward.
Sufficient statistic and maximum likelihood estimator for location families: Suppose we have a random sample $X_{1}, \cdots, X_{n} \sim X$ from a location family with probability density function $f_{X}(x; \theta) = f_{X}(x - \theta)$. Depending on the condition, the sufficient statistic and maximum likelihood estimator are as follows (see the worked example after this list):
- For a support of $X$ that is bounded above, use $\max X_{k}$
- For a support of $X$ that is bounded below, use $\min X_{k}$
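As a quick illustration of the heuristic (an example added here for concreteness, not part of the original): take the shifted exponential distribution with density $f(x; \theta) = e^{-(x - \theta)} I_{[\theta, \infty)}(x)$, a location family whose support is bounded below. Its likelihood is
$$
L(\theta; \mathbf{x}) = \prod_{k=1}^{n} e^{-(x_{k} - \theta)} I_{[\theta, \infty)}(x_{k}) = e^{n \theta - \sum_{k} x_{k}} I_{(-\infty,\, \min x_{k}]}(\theta)
$$
which increases with $\theta$ and then drops to zero past $\min x_{k}$, so $\hat{\theta} = \min_{k} X_{k}$, exactly as the heuristic predicts.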
Strictly speaking, $U(0, \theta)$ is a scale family rather than a location family, but its support $[0, \theta]$ is bounded above by $\theta$, so the same heuristic points to $\max X_{k}$. While the sufficient statistic and maximum likelihood estimator can thus be guessed in advance, let’s derive them directly for an intuitive understanding.
Sufficient Statistic
Product of indicator functions:
$$
\prod_{i=1}^{n} I_{(-\infty, \theta]}(x_{i}) = I_{(-\infty, \theta]} \left( \max_{i \in [n]} x_{i} \right)
$$
since every $x_{i}$ is at most $\theta$ exactly when the largest of them is.
$$
\begin{aligned}
f(\mathbf{x}; \theta) &= \prod_{k=1}^{n} f(x_{k}; \theta) \\
&= \prod_{k=1}^{n} \frac{1}{\theta} I_{[0, \theta]}(x_{k}) \\
&= \frac{1}{\theta^{n}} I_{[0, \theta]} \left( \max x_{k} \right) \\
&= \frac{1}{\theta^{n}} I_{[0, \theta]} \left( \max x_{k} \right) \cdot 1
\end{aligned}
$$
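As a small numerical check of this factorization (an illustrative NumPy sketch, not part of the original proof), the term-by-term product of $U(0, \theta)$ densities matches $\theta^{-n} I_{[0, \theta]}(\max x_{k})$ for every $\theta$ tried:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 3.0, size=5)        # a fixed sample from U(0, 3)

def joint_density(x, theta):
    """Term-by-term product of U(0, theta) densities."""
    return np.prod(np.where((x >= 0) & (x <= theta), 1.0 / theta, 0.0))

def factored_form(x, theta):
    """theta^{-n} * I_[0, theta](max x_k), the factored expression."""
    return theta ** (-len(x)) * float(x.max() <= theta)

# The two expressions agree below, at, and above max x_k.
for theta in (1.0, x.max(), 3.0, 100.0):
    assert np.isclose(joint_density(x, theta), factored_form(x, theta))
```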
Neyman Factorization Theorem: Consider a random sample $X_{1}, \cdots, X_{n}$ with the same probability mass/density function $f(x; \theta)$ for parameter $\theta \in \Theta$. A statistic $Y = u_{1}(X_{1}, \cdots, X_{n})$ is a sufficient statistic for $\theta$ if there exist two non-negative functions $k_{1}, k_{2} \geq 0$ such that:
$$
f(x_{1}; \theta) \cdots f(x_{n}; \theta) = k_{1} \left[ u_{1}(x_{1}, \cdots, x_{n}); \theta \right] k_{2}(x_{1}, \cdots, x_{n})
$$
However, $k_{2}$ must not depend on $\theta$.
Based on the Neyman Factorization Theorem, with $k_{1} \left[ \max x_{k}; \theta \right] = \theta^{-n} I_{[0, \theta]}(\max x_{k})$ and $k_{2}(x_{1}, \cdots, x_{n}) = 1$, the statistic $T := \max X_{k}$ is a sufficient statistic for $\theta$.
Maximum Likelihood Estimator
$$
L(\theta; \mathbf{x}) = f(\mathbf{x}; \theta) = \frac{1}{\theta^{n}} I_{[0, \theta]} \left( \max x_{k} \right)
$$
The likelihood function of the random sample can be written down directly as shown; since it contains an indicator function, there is no question of maximizing it by taking a partial derivative.
Definition of Maximum Likelihood Estimator: An estimator $\hat{\theta} := \hat{\theta}(\mathbf{X})$ that satisfies the following is called the Maximum Likelihood Estimator, or MLE for short:
$$
\hat{\theta} = \operatorname{argmax} L(\theta; \mathbf{X})
$$
Based on the definition of the maximum likelihood estimator, there is no need to agonize over the likelihood function: it suffices to focus on $\hat{\theta} \geq \max X_{k}$, because $\hat{\theta} < \max X_{k}$ forces $L = 0$. Among the candidates $\theta \geq \max X_{k}$, the factor $\theta^{-n}$ is strictly decreasing, so nothing is gained by considering something like $\max X_{k} + 700$ when $\max X_{k}$ already attains the maximum. Hence $\hat{\theta} = \max X_{k}$.
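To see this numerically (a minimal NumPy sketch assumed here, not part of the original argument), evaluating $L(\theta; \mathbf{x}) = \theta^{-n} I_{[0, \theta]}(\max x_{k})$ on a fine grid puts the maximizer right at $\max x_{k}$:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 2.5, size=20)       # a sample from U(0, 2.5)

def likelihood(theta, x):
    """L(theta; x) = theta^{-n} * I(theta >= max x_k) for theta > 0."""
    return np.where(theta >= x.max(), theta ** (-float(len(x))), 0.0)

grid = np.linspace(0.01, 5.0, 100_000)
print(grid[np.argmax(likelihood(grid, x))])  # maximizer on the grid
print(x.max())                               # agrees up to the grid spacing
```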
■