
Location-Scale Family Ancillary Statistics

Theorem 1

Let $X_{1}, \cdots, X_{n}$ be a random sample from a location-scale family. If the two statistics $T_{1} \left( X_{1}, \cdots, X_{n} \right)$ and $T_{2} \left( X_{1}, \cdots, X_{n} \right)$ satisfy
$$ T_{i} \left( a x_{1} + b, \cdots, a x_{n} + b \right) = a T_{i} \left( x_{1}, \cdots, x_{n} \right) $$
for all $x_{1}, \cdots, x_{n}$ and all constants $b \in \mathbb{R}$, $a > 0$, then their ratio $T_{1}/T_{2}$ is an ancillary statistic.

Proof

Since $X_{k}$ comes from a location-scale family, for some location parameter $\theta \in \mathbb{R}$ and scale parameter $\sigma > 0$ it can be expressed as
$$ X_{k} = \theta + \sigma Z_{k} $$

Here, $Z_{k}$ represents a sample drawn from $f \left( z ; \theta = 0, \sigma = 1 \right)$. By assumption, the ratio of $T_{1}$ to $T_{2}$ is

$$ \frac{T_{1} \left( X_{1}, \cdots, X_{n} \right)}{T_{2} \left( X_{1}, \cdots, X_{n} \right)} = \frac{\sigma T_{1} \left( Z_{1}, \cdots, Z_{n} \right)}{\sigma T_{2} \left( Z_{1}, \cdots, Z_{n} \right)} = \frac{T_{1} \left( Z_{1}, \cdots, Z_{n} \right)}{T_{2} \left( Z_{1}, \cdots, Z_{n} \right)} $$

The right-hand side is a function of $Z_{1}, \cdots, Z_{n}$ alone, whose distribution involves neither parameter. Therefore $T_{1}/T_{2}$ does not depend on $\theta$ or $\sigma$, and it is an ancillary statistic.
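The identity in the proof can be checked numerically. The following is a minimal sketch (my own illustration, not from the text): it takes two scale-equivariant statistics — the sample range and the mean absolute deviation, both chosen here only as examples satisfying $T \left( a x + b \right) = a T(x)$ — and confirms that their ratio computed from $X = \theta + \sigma Z$ equals the ratio computed from $Z$.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(20)              # Z_1, ..., Z_n drawn from f(z; 0, 1)

def T1(x):
    # sample range: T(a x + b) = a T(x) for a > 0
    return x.max() - x.min()

def T2(x):
    # mean absolute deviation about the mean: also scale-equivariant,
    # since the shift b cancels against the shifted mean
    return np.mean(np.abs(x - x.mean()))

theta, sigma = 5.0, 3.0
x = theta + sigma * z                    # a sample from the location-scale family

# the two ratios agree up to floating-point error, independent of (theta, sigma)
print(T1(x) / T2(x))
print(T1(z) / T2(z))
```

Changing `theta` and `sigma` to any other values (with `sigma > 0`) leaves both printed ratios unchanged, which is exactly the invariance the proof establishes.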

Explanation

Example

As an example, the ratio $R/S$ of the sample range $R$ to the sample standard deviation $S$ is an ancillary statistic. First, the range is

$$ \begin{align*} & R \left( \sigma Z_{1} + \theta, \cdots, \sigma Z_{n} + \theta \right) \\ =& R \left( X_{1}, \cdots, X_{n} \right) \\ =& X_{(n)} - X_{(1)} \\ =& \sigma Z_{(n)} + \theta - \sigma Z_{(1)} - \theta \\ =& \sigma \left( Z_{(n)} - Z_{(1)} \right) \\ =& \sigma R \left( Z_{1}, \cdots, Z_{n} \right) \end{align*} $$

and the sample standard deviation $S$ is

$$ \begin{align*} & S \left( \sigma Z_{1} + \theta, \cdots, \sigma Z_{n} + \theta \right) \\ =& S \left( X_{1}, \cdots, X_{n} \right) \\ =& \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} \left( X_{i} - \bar{X} \right)^{2} } \\ =& \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} \left( \sigma Z_{i} + \theta - \sigma \bar{Z} - \theta \right)^{2} } \\ =& \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} \sigma^{2} \left( Z_{i} - \bar{Z} \right)^{2} } \\ =& \sigma \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} \left( Z_{i} - \bar{Z} \right)^{2} } \\ =& \sigma S \left( Z_{1}, \cdots, Z_{n} \right) \end{align*} $$

In the ratio $R/S$, the location $\theta$ has already been eliminated from both $R$ and $S$, so the ratio is ancillary with respect to $\theta$; and since $\sigma$ cancels between the numerator and the denominator, it is also ancillary with respect to $\sigma$. Intuitively, this makes sense, as both statistics measure the dispersion of the data.


  1. Casella, G., Berger, R. L. (2001). *Statistical Inference* (2nd Edition): p. 306. ↩︎