Location-Scale Family Ancillary Statistics
Theorem
Let $X_{1} , \cdots , X_{n}$ be a random sample from a location-scale family. If the two statistics $T_{1} \left( X_{1} , \cdots , X_{n} \right)$ and $T_{2} \left( X_{1} , \cdots , X_{n} \right)$ satisfy
$$
T_{i} \left( a x_{1} + b , \cdots , a x_{n} + b \right) = a T_{i} \left( x_{1} , \cdots , x_{n} \right) , \qquad i = 1, 2
$$
for all $x_{1} , \cdots , x_{n}$ and all constants $b \in \mathbb{R}$, $a > 0$, then their ratio $T_{1}/T_{2}$ is an ancillary statistic.
Proof
Since each $X_{k}$ comes from a location-scale family, for some location parameter $\theta \in \mathbb{R}$ and scale parameter $\sigma > 0$ it can be expressed as
$$
X_{k} = \theta + \sigma Z_{k}
$$
Here, $Z_{k}$ is a sample drawn from $f (z ; \theta = 0, \sigma = 1)$. Applying the assumption with $a = \sigma$ and $b = \theta$, the ratio of $T_{1}$ to $T_{2}$ is
$$
{{T_{1} \left( X_{1} , \cdots , X_{n} \right) } \over {T_{2} \left( X_{1} , \cdots , X_{n} \right) }} = { {\sigma T_{1} \left( Z_{1} , \cdots , Z_{n} \right) } \over {\sigma T_{2} \left( Z_{1} , \cdots , Z_{n} \right) }} = { {T_{1} \left( Z_{1} , \cdots , Z_{n} \right) } \over { T_{2} \left( Z_{1} , \cdots , Z_{n} \right)}}
$$
The right-hand side is a function of $Z_{1} , \cdots , Z_{n}$ alone, whose distribution involves neither $\theta$ nor $\sigma$. Therefore, $T_{1}/T_{2}$ is an ancillary statistic.
■
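The hypothesis $T(ax + b) = aT(x)$ can be checked numerically for a candidate statistic. The following is a minimal sketch, not part of the original proof: the helper names `sample_range` and `satisfies_condition`, the test values of $a$ and $b$, and the choice of standard normal data are all assumptions made for illustration. The sample range and sample standard deviation pass the check, while the sample mean fails it because it shifts by $b$.

```python
import random
import statistics


def sample_range(xs):
    """Sample range: largest observation minus smallest."""
    return max(xs) - min(xs)


def satisfies_condition(stat, xs, a, b, tol=1e-9):
    """Check stat(a*x + b) == a * stat(x) up to a relative tolerance."""
    transformed = [a * x + b for x in xs]
    expected = a * stat(xs)
    return abs(stat(transformed) - expected) <= tol * max(1.0, abs(expected))


# Illustrative data only; any fixed sample would do.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(20)]

# Hypothetical grid of scale (a > 0) and shift (b) constants.
grid = [(a, b) for a in (0.5, 2.0, 10.0) for b in (-3.0, 0.0, 7.0)]

range_ok = all(satisfies_condition(sample_range, xs, a, b) for a, b in grid)
stdev_ok = all(satisfies_condition(statistics.stdev, xs, a, b) for a, b in grid)
mean_ok = all(satisfies_condition(statistics.mean, xs, a, b) for a, b in grid)

print("range satisfies the condition:", range_ok)
print("stdev satisfies the condition:", stdev_ok)
print("mean satisfies the condition: ", mean_ok)
```

A numerical check on one sample is only a sanity test; it does not replace the algebraic verification of the condition for all $x_{1} , \cdots , x_{n}$.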
Explanation
Example
As an example, the ratio of the sample range $R$ to the sample standard deviation $S$ is an ancillary statistic. First, the range is
$$
\begin{align*}
& R \left( \sigma Z_{1} + \theta , \cdots , \sigma Z_{n} + \theta \right)
\\ =& R \left( X_{1} , \cdots , X_{n} \right)
\\ =& X_{(n)} - X_{(1)}
\\ =& \sigma Z_{(n)} + \theta - \sigma Z_{(1)} - \theta
\\ =& \sigma \left( Z_{(n)} - Z_{(1)} \right)
\\ =& \sigma R \left( Z_{1} , \cdots , Z_{n} \right)
\end{align*}
$$
and the sample standard deviation $S$ is
$$
\begin{align*}
& S \left( \sigma Z_{1} + \theta , \cdots , \sigma Z_{n} + \theta \right)
\\ =& S \left( X_{1} , \cdots , X_{n} \right)
\\ =& \sqrt{ {{1} \over {n-1}} \sum_{i=1}^{n} \left( X_{i} - \bar{X} \right)^{2} }
\\ =& \sqrt{ {{1} \over {n-1}} \sum_{i=1}^{n} \left( \sigma Z_{i} + \theta - \sigma \bar{Z} - \theta \right)^{2} }
\\ =& \sqrt{ {{1} \over {n-1}} \sum_{i=1}^{n} \sigma^{2} \left( Z_{i} - \bar{Z} \right)^{2} }
\\ =& \sigma \sqrt{ {{1} \over {n-1}} \sum_{i=1}^{n} \left( Z_{i} - \bar{Z} \right)^{2} }
\\ =& \sigma S \left( Z_{1} , \cdots , Z_{n} \right)
\end{align*}
$$
In both $R$ and $S$ the location parameter $\theta$ cancels inside the statistic itself, so neither depends on $\theta$, and each is multiplied by the same factor $\sigma$. In the ratio $R/S$ that factor cancels between numerator and denominator, so $R/S$ is an ancillary statistic with respect to both $\theta$ and $\sigma$. Intuitively, this makes sense as both measure the dispersion of the data.
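The cancellation can also be seen numerically. The sketch below fixes one standard sample $Z_{1} , \cdots , Z_{n}$, builds $X_{k} = \theta + \sigma Z_{k}$ for several parameter pairs, and compares $R/S$ across them. The standard normal distribution for $Z_{k}$, the sample size, the parameter grid, and the helper name `range_over_sd` are assumptions chosen for illustration; the argument itself holds for any location-scale family.

```python
import math
import random
import statistics


def range_over_sd(xs):
    """Ratio R/S: sample range divided by sample standard deviation."""
    return (max(xs) - min(xs)) / statistics.stdev(xs)


# Standard sample Z_k (theta = 0, sigma = 1); normality is an assumption.
rng = random.Random(42)
z = [rng.gauss(0.0, 1.0) for _ in range(30)]

baseline = range_over_sd(z)

# Hypothetical grid of location and scale parameters.
ratios = {
    (theta, sigma): range_over_sd([theta + sigma * zk for zk in z])
    for theta in (-5.0, 0.0, 100.0)
    for sigma in (0.1, 1.0, 25.0)
}

# Every transformed sample should give the same ratio as Z itself,
# up to floating-point rounding.
all_equal = all(math.isclose(r, baseline, rel_tol=1e-9) for r in ratios.values())

for (theta, sigma), r in ratios.items():
    print(f"theta={theta:7.1f}  sigma={sigma:5.1f}  R/S={r:.12f}")
print("all ratios agree with the baseline:", all_equal)
```

Because the ratio computed from $X_{1} , \cdots , X_{n}$ equals the ratio computed from $Z_{1} , \cdots , Z_{n}$ for every choice of $(\theta, \sigma)$, its distribution cannot depend on either parameter.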