
Rao-Blackwell Theorem Proof 📂Mathematical Statistics


Theorem 1 2

Suppose the parameter $\theta$ is given. If $T$ is a sufficient statistic for $\theta$ and $W$ is an unbiased estimator of $\tau(\theta)$, then defining $\phi(T) := E(W \mid T)$, the following holds for all $\theta$:
$$
\begin{align*} E_{\theta} \phi (T) =& \tau (\theta) \\ \operatorname{Var}_{\theta} \phi (T) \le& \operatorname{Var}_{\theta} W \end{align*}
$$
In other words, $\phi(T)$ is a uniformly better unbiased estimator of $\tau(\theta)$ than $W$.

Description

To put the Rao-Blackwell theorem in simple terms, it is a theorem that 'tells why sufficient statistics are useful': conditioning an unbiased estimator on a sufficient statistic produces a new unbiased estimator whose variance is no larger than the original's. In particular, if $T$ is a minimal sufficient statistic, the theorem is used to show that $\phi(T)$ is the best unbiased estimator.
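As a concrete illustration (my example, not taken from the cited texts): let $X_1, \dots, X_n$ be iid $\text{Bernoulli}(p)$, and take the crude unbiased estimator $W = X_1$ for $\tau(p) = p$. The sum $T = \sum_{i=1}^{n} X_i$ is sufficient for $p$, and conditioning gives
$$
\phi(T) = E\left( X_1 \mid T \right) = \frac{T}{n}
$$
so Rao-Blackwellization turns the single observation $X_1$ into the sample mean, cutting the variance from $p(1-p)$ down to $p(1-p)/n$.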

Proof

Since $T$ is a sufficient statistic by assumption, the conditional distribution of $W \mid T$ does not depend on $\theta$ by definition; hence $\phi(T) = E(W \mid T)$ likewise does not depend on $\theta$.

Properties of Conditional Expectation:
$$
E \left[ E ( X \mid Y ) \right] = E(X)
$$

By the properties of conditional expectation:
$$
\begin{align*} \tau (\theta) =& E_{\theta} W \\ =& E_{\theta} \left[ E ( W \mid T ) \right] \\ =& E_{\theta} \phi (T) \end{align*}
$$

Thus, $\phi(T)$ is an unbiased estimator of $\tau(\theta)$.

Properties of Conditional Variance:
$$
\operatorname{Var}(X) = E \left( \operatorname{Var}(X \mid Y) \right) + \operatorname{Var} \left( E(X \mid Y) \right)
$$

By the properties of conditional variance:
$$
\begin{align*} \operatorname{Var}_{\theta} W =& \operatorname{Var}_{\theta} \left[ E ( W \mid T ) \right] + E_{\theta} \left[ \operatorname{Var} ( W \mid T ) \right] \\ =& \operatorname{Var}_{\theta} \phi (T) + E_{\theta} \left[ \operatorname{Var} ( W \mid T ) \right] \\ \ge& \operatorname{Var}_{\theta} \phi (T) & \because \operatorname{Var} ( W \mid T ) \ge 0 \end{align*}
$$

Therefore, the variance of $\phi(T)$ is never larger than that of $W$.
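The inequality can be checked numerically. The sketch below (my own Monte Carlo experiment, not part of the original proof) uses the Bernoulli setting $W = X_1$, $T = \sum_i X_i$, $\phi(T) = T/n$ and compares the empirical variances of $W$ and $\phi(T)$ over many simulated samples:

```python
import random

# Hypothetical illustration: Rao-Blackwellization for X_1, ..., X_n iid
# Bernoulli(p).  The crude unbiased estimator is W = X_1; T = sum(X_i) is
# sufficient, and phi(T) = E(W | T) = T / n is the sample mean.

def sample_var(xs):
    """Unbiased sample variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def simulate(p=0.3, n=10, reps=50_000, seed=42):
    """Return the empirical (Var W, Var phi(T)) over `reps` simulated samples."""
    rng = random.Random(seed)
    w_vals, phi_vals = [], []
    for _ in range(reps):
        xs = [1 if rng.random() < p else 0 for _ in range(n)]
        w_vals.append(xs[0])           # W = X_1
        phi_vals.append(sum(xs) / n)   # phi(T) = T / n
    return sample_var(w_vals), sample_var(phi_vals)

var_w, var_phi = simulate()
# Theory predicts Var W = p(1-p) = 0.21 and Var phi(T) = p(1-p)/n = 0.021,
# so the Rao-Blackwell inequality Var phi(T) <= Var W holds with room to spare.
```

The gap between the two variances is exactly the $E_{\theta}[\operatorname{Var}(W \mid T)]$ term discarded in the derivation above; here it equals $p(1-p)(n-1)/n$.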


  1. Casella. (2001). Statistical Inference (2nd Edition): p342. ↩︎

  2. Hogg et al. (2013). Introduction to Mathematical Statistics (7th Edition): p397. ↩︎