Most Powerful Test Based on a Sufficient Statistic
Theorem
Hypothesis Testing: $$ \begin{align*} H_{0} :& \theta = \theta_{0} \\ H_{1} :& \theta = \theta_{1} \end{align*} $$
In this hypothesis test, let $T$ be a sufficient statistic for $\theta$, and write the probability density function or probability mass function of $T$ under $\theta_{0}$ and $\theta_{1}$ as $g \left( t | \theta_{0} \right)$ and $g \left( t | \theta_{1} \right)$. Then, for a rejection region $S$ and some constant $k \ge 0$, any hypothesis test based on $T$ is a most powerful test at level $\alpha$ if it satisfies the following three conditions; a worked example is given after the list:
- (i): If $g \left( t | \theta_{1} \right) > k g \left( t | \theta_{0} \right)$ then $t \in S$
- (ii): If $g \left( t | \theta_{1} \right) < k g \left( t | \theta_{0} \right)$ then $t \in S^{c}$
- (iii): $\alpha = P_{\theta_{0}} \left( T \in S \right)$
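As a concrete illustration (a standard textbook-style example, not taken from the cited page): let $X_{1}, \dots, X_{n} \overset{\text{iid}}{\sim} N \left( \theta, 1 \right)$ with $\theta_{1} > \theta_{0}$. The sample mean $T = \overline{X}$ is sufficient for $\theta$, and $g \left( t | \theta \right) \propto \exp \left( - n \left( t - \theta \right)^{2} / 2 \right)$, so $$ \frac{ g \left( t | \theta_{1} \right) }{ g \left( t | \theta_{0} \right) } = \exp \left( n \left( \theta_{1} - \theta_{0} \right) t - \frac{ n \left( \theta_{1}^{2} - \theta_{0}^{2} \right) }{2} \right) $$ is strictly increasing in $t$. Conditions (i) and (ii) therefore reduce to a rejection region of the form $S = \left\{ t : t > c \right\}$, and condition (iii) pins down $c = \theta_{0} + z_{\alpha} / \sqrt{n}$, since $\overline{X} \sim N \left( \theta_{0} , 1/n \right)$ under $H_{0}$.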
Explanation
This theorem is essentially a corollary of the Neyman-Pearson lemma. It not only plays a role in the proof of the Karlin-Rubin theorem but also shows that a sufficient statistic can be used to design a most powerful test conveniently.
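To make the last point tangible, here is a minimal numerical sketch of designing such a test from the sufficient statistic alone. The normal model, sample size, and all variable names below are assumptions made for illustration, not part of the cited text.

```python
import numpy as np
from scipy.stats import norm

# Assumed setting: X_1, ..., X_n iid N(theta, 1), testing
# H0: theta = theta0 against H1: theta = theta1 with theta1 > theta0.
theta0, theta1, n, alpha = 0.0, 0.5, 25, 0.05

# Sufficient statistic T = sample mean; under theta, T ~ N(theta, 1/n).
def g(t, theta):
    return norm.pdf(t, loc=theta, scale=1 / np.sqrt(n))

# Condition (iii): pick c so that P_{theta0}(T > c) = alpha.
c = norm.ppf(1 - alpha, loc=theta0, scale=1 / np.sqrt(n))

# Conditions (i)/(ii): g(t|theta1)/g(t|theta0) is increasing in t,
# so {t > c} equals {g(t|theta1) > k g(t|theta0)} with k set at t = c.
k = g(c, theta1) / g(c, theta0)

rng = np.random.default_rng(0)
x = rng.normal(theta1, 1.0, size=n)  # one sample drawn under H1
t = x.mean()
reject = g(t, theta1) > k * g(t, theta0)  # same event as t > c here
print(f"T = {t:.3f}, c = {c:.3f}, reject H0: {reject}")
print(f"power = {1 - norm.cdf(c, loc=theta1, scale=1 / np.sqrt(n)):.3f}")
```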
Proof 1
The rejection region for the original sample $\mathbf{X}$ is $R = \left\{ \mathbf{x} : T \left( \mathbf{x} \right) \in S \right\}$. By the Neyman factorization theorem, the probability density function or probability mass function of $\mathbf{X}$ can be written with a non-negative function $h \left( \mathbf{x} \right)$ as follows: $$ f \left( \mathbf{x} | \theta_{i} \right) = g \left( T \left( \mathbf{x} \right) | \theta_{i} \right) h \left( \mathbf{x} \right) \qquad , i = 0,1 $$ Suppose conditions (i) and (ii) of the theorem hold. If $f \left( \mathbf{x} | \theta_{1} \right) > k f \left( \mathbf{x} | \theta_{0} \right)$, the factorization gives $g \left( T \left( \mathbf{x} \right) | \theta_{1} \right) h \left( \mathbf{x} \right) > k g \left( T \left( \mathbf{x} \right) | \theta_{0} \right) h \left( \mathbf{x} \right) \ge 0$, which forces $h \left( \mathbf{x} \right) > 0$; dividing by $h \left( \mathbf{x} \right)$ yields $g \left( T \left( \mathbf{x} \right) | \theta_{1} \right) > k g \left( T \left( \mathbf{x} \right) | \theta_{0} \right)$, so $T \left( \mathbf{x} \right) \in S$ by (i), that is, $\mathbf{x} \in R$. Reversing the inequalities and using (ii) handles the complement. In summary: $$ \begin{align*} \mathbf{x} \in R \impliedby & f \left( \mathbf{x} | \theta_{1} \right) > k f \left( \mathbf{x} | \theta_{0} \right) \\ \mathbf{x} \in R^{c} \impliedby & f \left( \mathbf{x} | \theta_{1} \right) < k f \left( \mathbf{x} | \theta_{0} \right) \end{align*} $$ And by condition (iii), the following holds: $$ P_{\theta_{0}} \left( \mathbf{X} \in R \right) = P_{\theta_{0}} \left( T \left( \mathbf{X} \right) \in S \right) = \alpha $$
Neyman-Pearson Lemma: In the hypothesis test above, write the probability density function or probability mass function of $\mathbf{X}$ under $\theta_{0}, \theta_{1}$ as $f \left( \mathbf{x} | \theta_{0} \right), f \left( \mathbf{x} | \theta_{1} \right)$, and consider a rejection region $R$, a constant $k \ge 0$, and the conditions
- (i): If $f \left( \mathbf{x} | \theta_{1} \right) > k f \left( \mathbf{x} | \theta_{0} \right)$ then $\mathbf{x} \in R$
- (ii): If $f \left( \mathbf{x} | \theta_{1} \right) < k f \left( \mathbf{x} | \theta_{0} \right)$ then $\mathbf{x} \in R^{c}$
- (iii): $\alpha = P_{\theta_{0}} \left( \mathbf{X} \in R \right)$
then the following two statements hold:
- (Sufficiency) Any hypothesis test satisfying the three conditions above is a most powerful test at level $\alpha$.
- (Necessity) If a hypothesis test satisfying the three conditions with a constant $k > 0$ exists, then every most powerful test at level $\alpha$ is a test of size $\alpha$ and satisfies (i) and (ii), except possibly on a set $A \subset \Omega$ with $$ P_{\theta_{0}} \left( \mathbf{X} \in A \right) = P_{\theta_{1}} \left( \mathbf{X} \in A \right) = 0 $$
By the sufficiency part of the Neyman-Pearson lemma, the given hypothesis test is therefore a most powerful test at level $\alpha$.
■
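As a sanity check on the proof's conclusion (again under the assumed normal model from the earlier sketch, not from the cited text), a short simulation confirms that the rejection region built from the sufficient statistic coincides, sample by sample, with the Neyman-Pearson region built from the full likelihood ratio.

```python
import numpy as np
from scipy.stats import norm

theta0, theta1, n, alpha = 0.0, 0.5, 25, 0.05
c = norm.ppf(1 - alpha, loc=theta0, scale=1 / np.sqrt(n))

# Full-sample log likelihood ratio log f(x|theta1) - log f(x|theta0).
def log_lr(x):
    return (theta1 - theta0) * x.sum() - n * (theta1**2 - theta0**2) / 2

# Matching Neyman-Pearson threshold: the ratio's value when mean(x) = c.
log_k = (theta1 - theta0) * n * c - n * (theta1**2 - theta0**2) / 2

rng = np.random.default_rng(1)
trials, agree = 10_000, 0
for _ in range(trials):
    x = rng.normal(theta0, 1.0, size=n)
    agree += (x.mean() > c) == (log_lr(x) > log_k)
print(f"agreement over {trials} samples: {agree / trials:.4f}")  # expect 1.0
```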
Casella. (2001). Statistical Inference (2nd Edition): p389. ↩︎