Sufficient Statistics and Maximum Likelihood Estimators of the Poisson Distribution
Theorem
Given a random sample $\mathbf{X} := \left( X_{1} , \cdots , X_{n} \right) \sim \text{Poi} \left( \lambda \right)$ following a Poisson distribution,
the sufficient statistic $T$ and the maximum likelihood estimator $\hat{\lambda}$ for $\lambda$ are as follows: $$ \begin{align*} T =& \sum_{k=1}^{n} X_{k} \\ \hat{\lambda} =& {{ 1 } \over { n }} \sum_{k=1}^{n} X_{k} \end{align*} $$
Proof
Sufficient Statistic
$$ \begin{align*} f \left( \mathbf{x} ; \lambda \right) =& \prod_{k=1}^{n} f \left( x_{k} ; \lambda \right) \\ =& \prod_{k=1}^{n} {{ e^{-\lambda} \lambda^{x_{k}} } \over { x_{k} ! }} \\ =& {{ e^{-n \lambda} \lambda^{ \sum_{k} x_{k}} } \over { \prod_{k} x_{k} ! }} \\ =& e^{-n \lambda} \lambda^{ \sum_{k} x_{k}} \cdot {{ 1 } \over { \prod_{k} x_{k} ! }} \end{align*} $$
Neyman Factorization Theorem: Let a random sample $X_{1} , \cdots , X_{n}$ have the same probability mass/density function $f \left( x ; \theta \right)$ for parameter $\theta \in \Theta$. A statistic $Y = u_{1} \left( X_{1} , \cdots , X_{n} \right)$ is a sufficient statistic for $\theta$ if there exist two non-negative functions $k_{1} , k_{2} \ge 0$ satisfying the following condition: $$ f \left( x_{1} ; \theta \right) \cdots f \left( x_{n} ; \theta \right) = k_{1} \left[ u_{1} \left( x_{1} , \cdots , x_{n} \right) ; \theta \right] k_{2} \left( x_{1} , \cdots , x_{n} \right) $$ Note that $k_{2}$ must not depend on $\theta$.
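Matching the product computed above to the notation of the factorization theorem, one may take $$ f \left( \mathbf{x} ; \lambda \right) = \underbrace{e^{-n \lambda} \lambda^{ \sum_{k} x_{k}}}_{k_{1} \left[ \sum_{k} x_{k} ; \lambda \right]} \cdot \underbrace{{{ 1 } \over { \prod_{k} x_{k} ! }}}_{k_{2} \left( x_{1} , \cdots , x_{n} \right)} $$ where the second factor does not depend on $\lambda$.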
According to the Neyman Factorization Theorem, $T := \sum_{k} X_{k}$ is a sufficient statistic for $\lambda$.
Maximum Likelihood Estimator
$$ \begin{align*} \log L \left( \lambda ; \mathbf{x} \right) =& \log f \left( \mathbf{x} ; \lambda \right) \\ =& \log {{ e^{-n \lambda} \lambda^{ \sum_{k} x_{k}} } \over { \prod_{k} x_{k} ! }} \\ =& -n\lambda + \sum_{k=1}^{n} x_{k} \log \lambda - \log \prod_{k} x_{k} ! \end{align*} $$
The log-likelihood function of the random sample is as above. Since the logarithm is strictly increasing, maximizing the likelihood is equivalent to maximizing the log-likelihood, so setting the partial derivative with respect to $\lambda$ equal to $0$ gives $$ \begin{align*} & 0 = - n + \sum_{k=1}^{n} x_{k} {{ 1 } \over { \lambda }} \\ \implies & \lambda = {{ 1 } \over { n }} \sum_{k=1}^{n} x_{k} \end{align*} $$
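This critical point is indeed a maximum, since the second derivative is non-positive: $$ {{ \partial^{2} } \over { \partial \lambda^{2} }} \log L \left( \lambda ; \mathbf{x} \right) = - {{ 1 } \over { \lambda^{2} }} \sum_{k=1}^{n} x_{k} \le 0 $$ with strict inequality whenever $\sum_{k} x_{k} > 0$.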
Consequently, the maximum likelihood estimator $\hat{\lambda}$ for $\lambda$ is as follows: $$ \hat{\lambda} = {{ 1 } \over { n }} \sum_{k=1}^{n} X_{k} $$
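Note that the maximum likelihood estimator depends on the sample only through the sufficient statistic obtained above, that is, $\hat{\lambda} = T / n$.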
■
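As a quick numerical sanity check, the following minimal sketch (assuming NumPy and SciPy are available; the true rate $\lambda = 3$, the seed, and the sample size are arbitrary choices for illustration) maximizes the Poisson log-likelihood numerically and compares the result with the sample mean. The two values should agree up to the optimizer's tolerance, consistent with $\hat{\lambda} = {{ 1 } \over { n }} \sum_{k} X_{k}$.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

# Draw a Poisson random sample; rate 3.0, seed, and sample size are
# arbitrary choices made only for this illustration.
rng = np.random.default_rng(42)
x = rng.poisson(lam=3.0, size=1_000)

# Negative log-likelihood of the sample as a function of lambda
def neg_log_likelihood(lam):
    return -poisson.logpmf(x, mu=lam).sum()

# Numerically maximize the likelihood over a bounded interval
result = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 20.0), method="bounded")

print("numerical MLE :", result.x)   # should match the sample mean
print("sample mean   :", x.mean())   # (1/n) * sum of x_k
```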