Ancillary Statistics in a Location-Scale Family

Theorem ¹

Let $X_{1} , \cdots , X_{n}$ be a random sample from a location-scale family. If the two statistics $T_{1} \left( X_{1} , \cdots, X_{n} \right)$ and $T_{2} \left( X_{1} , \cdots , X_{n} \right)$ satisfy, for $i = 1, 2$, $$ T_{i} \left( a x_{1} + b , \cdots , a x_{n} + b \right) = a T_{i} \left( x_{1} , \cdots , x_{n} \right) $$ for all $x_{1} , \cdots , x_{n}$ and all constants $b \in \mathbb{R}$, $a > 0$, then their ratio $T_{1}/T_{2}$ is an ancillary statistic.

Proof

Since $X_{k}$ comes from a location-scale family, for some location parameter $\theta \in \mathbb{R}$ and scale parameter $\sigma > 0$, it can be expressed as $$ X_{k} = \theta + \sigma Z_{k} $$

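Here, $Z_{1} , \cdots , Z_{n}$ is a random sample from the standard density $f (z ; \theta = 0, \sigma = 1)$, whose distribution involves neither $\theta$ nor $\sigma$. Applying the assumption with $a = \sigma$ and $b = \theta$, the ratio of $T_{1}$ to $T_{2}$ is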

$$ {{T_{1} \left( X_{1} , \cdots , X_{n} \right) } \over {T_{2} \left( X_{1} , \cdots , X_{n} \right) }} = { \sigma {T_{1} \left( Z_{1} , \cdots , Z_{n} \right) } \over {\sigma T_{2} \left( Z_{1} , \cdots , Z_{n} \right) }} = { {T_{1} \left( Z_{1} , \cdots , Z_{n} \right) } \over { T_{2} \left( Z_{1} , \cdots , Z_{n} \right)}} $$

The right-hand side is a function of $Z_{1} , \cdots , Z_{n}$ alone, so its distribution does not depend on $\theta$ or $\sigma$. Therefore, $T_{1}/T_{2}$ is an ancillary statistic.

■
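The key step of the proof, that the ratio computed from $X_{k} = \theta + \sigma Z_{k}$ equals the ratio computed from $Z_{k}$, can be checked numerically. The following is a minimal sketch rather than part of the theorem: the pair of statistics (the sample range and the mean absolute deviation about the median), the function names, and the data values are all illustrative assumptions chosen because both statistics satisfy $T(ax + b) = aT(x)$ for $a > 0$.

```python
import math
import statistics


def sample_range(xs):
    """T1 (illustrative choice): largest minus smallest observation."""
    return max(xs) - min(xs)


def mean_abs_dev(xs):
    """T2 (illustrative choice): mean absolute deviation about the median."""
    m = statistics.median(xs)
    return sum(abs(x - m) for x in xs) / len(xs)


def ratio(xs):
    """The ratio T1 / T2 from the theorem."""
    return sample_range(xs) / mean_abs_dev(xs)


# Hypothetical standardized values z and parameters sigma > 0, theta.
z = [-1.3, 0.2, 0.7, 1.9, -0.4]
sigma, theta = 2.5, -7.0
x = [theta + sigma * zi for zi in z]

# Each statistic scales by sigma and ignores theta ...
scales_t1 = math.isclose(sample_range(x), sigma * sample_range(z))
scales_t2 = math.isclose(mean_abs_dev(x), sigma * mean_abs_dev(z))
# ... so the ratio is the same for x and for z.
same_ratio = math.isclose(ratio(x), ratio(z))

print(scales_t1, scales_t2, same_ratio)
```

Changing `sigma` and `theta` (with `sigma > 0`) should leave `ratio(x)` unchanged up to floating-point rounding, which is exactly the cancellation used in the proof.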

Explanation

Example

As an example, the ratio of the sample range $R$ to the sample standard deviation $S$ is an ancillary statistic. First, the range satisfies

$$ \begin{align*} & R \left( \sigma Z_{1} + \theta , \cdots , \sigma Z_{n} + \theta \right) \\ =& R \left( X_{1} , \cdots , X_{n} \right) \\ =& X_{(n)} - X_{(1)} \\ =& \sigma Z_{(n)} + \theta - \sigma Z_{(1)} - \theta \\ =& \sigma \left( Z_{(n)} - Z_{(1)} \right) \\ =& \sigma R \left( Z_{1} , \cdots , Z_{n} \right) \end{align*} $$

and the sample standard deviation $S$ is

$$ \begin{align*} & S \left( \sigma Z_{1} + \theta , \cdots , \sigma Z_{n} + \theta \right) \\ =& S \left( X_{1} , \cdots , X_{n} \right) \\ =& \sqrt{ {{1} \over {n-1}} \sum_{i=1}^{n} \left( X_{i} - \bar{X} \right)^{2} } \\ =& \sqrt{ {{1} \over {n-1}} \sum_{i=1}^{n} \left( \sigma Z_{i} + \theta - \sigma \bar{Z} - \theta \right)^{2} } \\ =& \sqrt{ {{1} \over {n-1}} \sum_{i=1}^{n} \sigma^{2} \left( Z_{i} - \bar{Z} \right)^{2} } \\ =& \sigma \sqrt{ {{1} \over {n-1}} \sum_{i=1}^{n} \left( Z_{i} - \bar{Z} \right)^{2} } \\ =& \sigma S \left( Z_{1} , \cdots , Z_{n} \right) \end{align*} $$

In both $R$ and $S$ the location parameter $\theta$ cancels inside the statistic itself, and in the ratio $R/S$ the common factor $\sigma$ cancels between numerator and denominator. Hence $R/S$ is ancillary with respect to both $\theta$ and $\sigma$. Intuitively, this makes sense: both statistics measure the dispersion of the data, so a shift affects neither of them and a rescaling affects both by the same factor.
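Ancillarity is a statement about the distribution of $R/S$, so it can also be illustrated by simulation. The sketch below assumes a normal location-scale family purely for convenience (the argument above does not require normality). The sample size, replication count, seeds, and parameter values are arbitrary illustrative choices. It estimates the mean of $R/S$ under two different $(\theta, \sigma)$ settings using independent random draws.

```python
import math
import random


def range_over_sd(xs):
    """Compute R / S: sample range over sample standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return (max(xs) - min(xs)) / s


def simulate_mean_ratio(theta, sigma, n=10, reps=5000, seed=0):
    """Monte Carlo estimate of E[R/S] when X_k = theta + sigma * Z_k.

    Z_k is standard normal here; that choice is an assumption made
    for illustration only.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [theta + sigma * rng.gauss(0.0, 1.0) for _ in range(n)]
        total += range_over_sd(sample)
    return total / reps


# Two different parameter settings, with different seeds so that the
# draws are independent rather than the same Z values rescaled.
mean_a = simulate_mean_ratio(theta=0.0, sigma=1.0, seed=1)
mean_b = simulate_mean_ratio(theta=10.0, sigma=5.0, seed=2)

print(mean_a, mean_b)
```

The two estimates should agree up to Monte Carlo error, consistent with the distribution of $R/S$ being free of $\theta$ and $\sigma$. Agreement of means alone does not prove that the distributions are equal, but the same comparison can be repeated for quantiles or histograms.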

Casella. (2001). Statistical Inference (2nd Edition): p. 306.