

Neyman Factorization Theorem Proof

Theorem

Let’s say a random sample $X_{1} , \cdots , X_{n}$ has the same probability mass/density function $f \left( x ; \theta \right)$ for a parameter $\theta \in \Theta$. The statistic $Y_{1} = u_{1} \left( X_{1} , \cdots , X_{n} \right)$ is a sufficient statistic for $\theta$ if and only if there exist two non-negative functions $k_{1} , k_{2} \ge 0$ that satisfy the following.
$$ f \left( x_{1} ; \theta \right) \cdots f \left( x_{n} ; \theta \right) = k_{1} \left[ u_{1} \left( x_{1} , \cdots , x_{n} \right) ; \theta \right] k_{2} \left( x_{1} , \cdots , x_{n} \right) $$
Here, $k_{2}$ must not depend on $\theta$.
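For intuition, here is a standard illustration (added here; it is not part of the original statement): let $X_{1} , \cdots , X_{n}$ be a sample from an exponential distribution with density $f \left( x ; \theta \right) = \theta^{-1} e^{-x/\theta}$ for $x > 0$. The joint density factors at sight:
$$ \prod_{i=1}^{n} f \left( x_{i} ; \theta \right) = \underbrace{\theta^{-n} e^{- \sum_{i} x_{i} / \theta}}_{k_{1} \left( \sum_{i} x_{i} ; \theta \right)} \cdot \underbrace{\prod_{i=1}^{n} \mathbb{1} \left( x_{i} > 0 \right)}_{k_{2} \left( x_{1} , \cdots , x_{n} \right)} $$
so by the theorem, $Y_{1} = \sum_{i} X_{i}$ is sufficient for $\theta$.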

Proof

Definition of Sufficient Statistic: If, for some function $H \left( x_{1} , \cdots , x_{n} \right)$ that does not depend on $\theta \in \Theta$,
$$ {{ f \left( x_{1} ; \theta \right) \cdots f \left( x_{n} ; \theta \right) } \over { f_{Y_{1}} \left( u_{1} \left( x_{1} , \cdots, x_{n} \right) ; \theta \right) }} = H \left( x_{1} , \cdots , x_{n} \right) $$
then $Y_{1}$ is called a sufficient statistic for $\theta$.
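To see this definition in action, consider a classical example (added for illustration; it does not appear in the original): for $X_{1} , \cdots , X_{n} \sim N \left( \theta , 1 \right)$ and $Y_{1} = \overline{X} \sim N \left( \theta , 1/n \right)$, the identity $\sum_{i} \left( x_{i} - \theta \right)^{2} = \sum_{i} \left( x_{i} - \overline{x} \right)^{2} + n \left( \overline{x} - \theta \right)^{2}$ gives
$$ {{ \left( 2 \pi \right)^{-n/2} \exp \left( - {{1} \over {2}} \sum_{i} \left( x_{i} - \theta \right)^{2} \right) } \over { \left( n / 2 \pi \right)^{1/2} \exp \left( - {{n} \over {2}} \left( \overline{x} - \theta \right)^{2} \right) }} = \left( 2 \pi \right)^{-(n-1)/2} n^{-1/2} \exp \left( - {{1} \over {2}} \sum_{i} \left( x_{i} - \overline{x} \right)^{2} \right) $$
which is free of $\theta$, so $\overline{X}$ is sufficient for $\theta$.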

We prove the theorem only for continuous probability distributions; for the proof in the discrete case, refer to Casella.


$(\Rightarrow)$

By the definition of a sufficient statistic, this direction is obvious: $f_{Y_{1}}$ plays the role of $k_{1}$, and $H$ plays the role of $k_{2}$.


$(\Leftarrow)$

Suppose the factorization holds. Choose auxiliary functions $u_{2} , \cdots , u_{n}$ so that the following transformation is one-to-one.
$$ \begin{align*} y_{1} &:= u_{1} \left( x_{1} , \cdots , x_{n} \right) \\ y_{2} &:= u_{2} \left( x_{1} , \cdots , x_{n} \right) \\ &\vdots \\ y_{n} &:= u_{n} \left( x_{1} , \cdots , x_{n} \right) \end{align*} $$

Let’s denote the inverse functions of this transformation as below for convenience, and write its Jacobian as $J = \det \left( \partial x_{i} / \partial y_{j} \right)$.

$$ \begin{align*} x_{1} &:= w_{1} \left( y_{1} , \cdots , y_{n} \right) \\ x_{2} &:= w_{2} \left( y_{1} , \cdots , y_{n} \right) \\ &\vdots \\ x_{n} &:= w_{n} \left( y_{1} , \cdots , y_{n} \right) \end{align*} $$

Then, the joint probability density function $g$ of $Y_{1} , \cdots , Y_{n}$, with $w_{i} = w_{i} \left( y_{1} , \cdots , y_{n} \right)$, is
$$ g \left( y_{1} , \cdots , y_{n} ; \theta \right) = k_{1} \left( y_{1} ; \theta \right) k_{2} \left( w_{1} , \cdots , w_{n} \right) \left| J \right| $$
and the marginal probability density function $f_{Y_{1}}$ of $Y_{1}$ is
$$ \begin{align*} f_{Y_{1}} \left( y_{1} ; \theta \right) =& \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g \left( y_{1} , \cdots , y_{n} ; \theta \right) d y_{2} \cdots d y_{n} \\ =& k_{1} \left( y_{1} ; \theta \right) \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \left| J \right| k_{2} \left( w_{1} , \cdots , w_{n} \right) d y_{2} \cdots d y_{n} \end{align*} $$
Since neither $k_{2}$ nor $J$ depends on $\theta$, the integral on the right is a function of $y_{1}$ alone; denote it temporarily by $m \left( y_{1} \right)$, so that
$$ f_{Y_{1}} \left( y_{1} ; \theta \right) = k_{1} \left( y_{1} ; \theta \right) m \left( y_{1} \right) $$
Here, if $m \left( y_{1} \right) = 0$, then trivially $f_{Y_{1}} \left( y_{1} ; \theta \right) = 0$. Now, assuming $m \left( y_{1} \right) > 0$, we can write
$$ k_{1} \left[ u_{1} \left( x_{1} , \cdots , x_{n} \right) ; \theta \right] = {{ f_{Y_{1}} \left[ u_{1} \left( x_{1} , \cdots , x_{n} \right) ; \theta \right] } \over { m \left[ u_{1} \left( x_{1} , \cdots , x_{n} \right) \right] }} $$
Substituting this into the given factorization yields
$$ \begin{align*} f \left( x_{1} ; \theta \right) \cdots f \left( x_{n} ; \theta \right) =& k_{1} \left[ u_{1} \left( x_{1} , \cdots , x_{n} \right) ; \theta \right] k_{2} \left( x_{1} , \cdots , x_{n} \right) \\ =& {{ f_{Y_{1}} \left[ u_{1} \left( x_{1} , \cdots , x_{n} \right) ; \theta \right] } \over { m \left[ u_{1} \left( x_{1} , \cdots , x_{n} \right) \right] }} k_{2} \left( x_{1} , \cdots , x_{n} \right) \\ =& f_{Y_{1}} \left[ u_{1} \left( x_{1} , \cdots , x_{n} \right) ; \theta \right] {{ k_{2} \left( x_{1} , \cdots , x_{n} \right) } \over { m \left[ u_{1} \left( x_{1} , \cdots , x_{n} \right) \right] }} \end{align*} $$
Since neither $k_{2}$ nor $m$ depends on $\theta$, the last factor plays the role of $H$, and by the definition, $Y_{1}$ is a sufficient statistic for $\theta$.
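As a concrete check of this computation, consider the following worked instance (added for illustration; it is not in the original): let $n = 2$ with $f \left( x ; \theta \right) = \theta^{-1} e^{-x/\theta}$ on $x > 0$, and take $y_{1} = x_{1} + x_{2}$, $y_{2} = x_{2}$, so that $x_{1} = y_{1} - y_{2}$, $x_{2} = y_{2}$, and $\left| J \right| = 1$. The factorization gives $k_{1} \left( y_{1} ; \theta \right) = \theta^{-2} e^{-y_{1}/\theta}$ with $k_{2} \left( x_{1} , x_{2} \right) = \mathbb{1} \left( x_{1} > 0 \right) \mathbb{1} \left( x_{2} > 0 \right)$, which becomes $\mathbb{1} \left( 0 < y_{2} < y_{1} \right)$ in the new coordinates, so
$$ f_{Y_{1}} \left( y_{1} ; \theta \right) = \theta^{-2} e^{-y_{1}/\theta} \int_{0}^{y_{1}} d y_{2} = \theta^{-2} y_{1} e^{-y_{1}/\theta} $$
That is, $m \left( y_{1} \right) = y_{1}$, and $f_{Y_{1}} = k_{1} \cdot m$ is exactly the $\operatorname{Gamma} \left( 2 , \theta \right)$ density, as expected.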