
Satterthwaite Approximation

Buildup

Let’s assume that we have $n$ independent random variables $Y_{k} \sim \chi_{r_{k}}^{2}$, each following a chi-squared distribution with $r_{k}$ degrees of freedom. As is well known, their sum $\sum_{k=1}^{n} Y_{k}$ follows a chi-squared distribution with $\sum_{k=1}^{n} r_{k}$ degrees of freedom. This fact is particularly useful for the denominator of the statistic $\displaystyle {{W} \over {\sqrt{V / r}}}$, which follows a t-distribution. Unfortunately, a problem arises when this is applied to a pooled sample, that is, a mixture of heterogeneous populations. Given ratios or, more generally, weights $a_{1} , \cdots , a_{n} \in \mathbb{R}$, the distribution of $$ \sum_{k=1}^{n} a_{k} Y_{k} $$ is hard to pin down: it looks like it should be close to a chi-squared distribution, but its exact degrees of freedom cannot be determined. To address this, Satterthwaite proposed a reasonably good estimator under the assumption that $\sum a_{k} Y_{k}$ follows a scaled chi-squared distribution. A key application of the Satterthwaite approximation is hypothesis testing for the difference between two population means with small samples.
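To see the difficulty concretely, here is a minimal Monte Carlo sketch (assuming NumPy; the weights and degrees of freedom below are arbitrary illustrative choices, not from the text). The unweighted sum behaves exactly like a chi-squared variable with the summed degrees of freedom, while the weighted sum has a mean and variance that no single $\chi_{m}^{2}$ can match.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unweighted sum: chi^2_3 + chi^2_5 + chi^2_7 is exactly chi^2_15,
# so the sample mean and variance should be close to 15 and 30.
r = [3, 5, 7]
unweighted = sum(rng.chisquare(df, size=100_000) for df in r)
print(unweighted.mean(), unweighted.var())

# Weighted sum: with unequal weights a_k the distribution is no longer
# chi-squared; here the mean is about 30.1 but the variance is about 182,
# while a chi-squared variable with mean 30.1 would have variance 60.2.
a = [0.2, 1.0, 3.5]
weighted = sum(w * rng.chisquare(df, size=100_000) for w, df in zip(a, r))
print(weighted.mean(), weighted.var())
```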

Formula

Suppose $Y_{k} \sim \chi_{r_{k}}^{2}$ for $k = 1, \cdots , n$ and $a_{k} \in \mathbb{R}$. If we assume that, for some $\nu > 0$, $$ \sum_{k=1}^{n} a_{k} Y_{k} \sim {{ \chi_{\nu}^{2} } \over { \nu }} $$ then the following $\hat{\nu}$ can be used as an estimator of $\nu$: $$ \hat{\nu} = {{ \left( \sum_{k} a_{k} Y_{k} \right)^{2} } \over { \sum_{k} {{ a_{k}^{2} } \over { r_{k} }} Y_{k}^{2} }} $$
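As a sketch of how the formula might be used (assuming NumPy; the function name `satterthwaite_df`, the sample sizes, and the sample variances are all hypothetical): with weights $a_{k} = 1/n_{k}$, $Y_{k}$ played by the sample variances $S_{k}^{2}$, and $r_{k} = n_{k} - 1$, the same expression reduces to the Welch–Satterthwaite degrees of freedom used in the small-sample two-mean test mentioned above.

```python
import numpy as np

def satterthwaite_df(a, y, r):
    """Satterthwaite estimate hat(nu) = (sum a_k y_k)^2 / sum((a_k^2 / r_k) y_k^2)
    for a weighted sum of independent chi-squared variables y_k with df r_k."""
    a, y, r = map(np.asarray, (a, y, r))
    return np.sum(a * y) ** 2 / np.sum(a ** 2 / r * y ** 2)

# Hypothetical two-sample setting: a_k = 1/n_k, y_k = S_k^2, r_k = n_k - 1
# recovers the Welch-Satterthwaite degrees of freedom.
n1, n2 = 8, 12
s1_sq, s2_sq = 4.1, 9.3           # made-up sample variances
nu_hat = satterthwaite_df(a=[1 / n1, 1 / n2], y=[s1_sq, s2_sq], r=[n1 - 1, n2 - 1])
print(nu_hat)                     # lies between min(n_k) - 1 and n1 + n2 - 2
```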

Derivation¹

Method of Moments

We begin with the method of moments.

$$ \sum_{k=1}^{n} a_{k} Y_{k} \sim {{ \chi_{\nu}^{2} } \over { \nu }} $$ Since the mean of the chi-squared distribution $\chi_{\nu}^{2}$ is $\nu$, we have $E \left( \chi_{\nu}^{2} / \nu \right) = 1$, so $$ \begin{equation} E \sum_{k=1}^{n} a_{k} Y_{k} = 1 \label{1} \end{equation} $$ Since each $Y_{k}$ satisfies $E Y_{k} = r_{k}$, the first moment gives $$ \begin{align*} 1 =& E \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) \\ =& \sum_{k=1}^{n} a_{k} E Y_{k} \\ =& \sum_{k=1}^{n} a_{k} r_{k} \end{align*} $$ Given that the mean of $\chi_{\nu}^{2}$ is $\nu$ and its variance is $2\nu$, so that $E \left[ \left( \chi_{\nu}^{2} \right)^{2} \right] = 2\nu + \nu^{2}$, the second moment gives $$ \begin{align*} E \left[ \left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} \right] =& E \left[ \left( {{ \chi_{\nu}^{2} } \over { \nu }} \right)^{2} \right] \\ =& {{ 1 } \over { \nu^{2} }} E \left[ \left( \chi_{\nu}^{2} \right)^{2} \right] \\ =& {{ 1 } \over { \nu^{2} }} \left[ 2\nu + \nu^{2} \right] \\ =& {{ 2 } \over { \nu }} + 1 \end{align*} $$ Solving this for $\nu$ and replacing the expectation with the observed value yields the estimator $$ \hat{\nu} = {{ 2 } \over { \left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} - 1 }} $$ This is a reasonable estimator, but the denominator is problematic: the estimate blows up as $\left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2}$ approaches $1$, and it even becomes negative when $\left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} < 1$. To remove this risk, let's look more closely at $\left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2}$.
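A small simulation illustrates the problem (NumPy again; the weights are chosen so that $\sum_{k} a_{k} r_{k} = 1$, as the first-moment equation requires, and all numbers are illustrative). A sizable fraction of samples give $\left( \sum_{k} a_{k} Y_{k} \right)^{2} < 1$ and hence a negative "degrees of freedom", and values near $1$ make the estimate explode.

```python
import numpy as np

rng = np.random.default_rng(1)

# Y_1 ~ chi^2_4, Y_2 ~ chi^2_6 with weights a = (0.1, 0.1), so that
# E[sum a_k Y_k] = 0.1*4 + 0.1*6 = 1.  In this special case the assumption
# even holds exactly: 0.1*(Y_1 + Y_2) has the chi^2_10 / 10 distribution.
r = np.array([4, 6])
a = np.array([0.1, 0.1])

Y = np.column_stack([rng.chisquare(df, size=10_000) for df in r])
s = Y @ a                          # realizations of sum_k a_k Y_k

naive = 2 / (s ** 2 - 1)           # the first, uncorrected estimator
print((naive < 0).mean())          # fraction of negative estimates (roughly half here)
print(np.abs(naive).max())         # occasional huge values when s^2 is near 1
```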

Correction

Since $E \sum_{k=1}^{n} a_{k} Y_{k} = 1$ by $\eqref{1}$, the identity $E Z^{2} = \text{Var} Z + (EZ)^{2}$ gives $$ \begin{align*} {{ 2 } \over { \nu }} + 1 =& E \left[ \left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} \right] \\ =& \text{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) + \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} \\ =& \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} \left[ {{ \text{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) } \over { \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} }} + 1 \right] \\ =& 1^{2} \cdot \left[ {{ \text{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) } \over { \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} }} + 1 \right] \end{align*} $$ Solving $$ {{ 2 } \over { \nu }} + 1 = {{ \text{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) } \over { \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} }} + 1 $$ for $\nu$ gives $$ \nu = {{ 2 \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} } \over { \text{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) }} $$ For the variance in the denominator, note that since $Y_{k} \sim \chi_{r_{k}}^{2}$, we have $\text{Var} Y_{k} = 2 r_{k} = 2 \left( E Y_{k} \right)^{2} / r_{k}$, so $$ \begin{align*} \text{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) =& \sum_{k=1}^{n} a_{k}^{2} \text{Var} Y_{k} \\ =& \sum_{k=1}^{n} a_{k}^{2} {{ 2 \left( E Y_{k} \right)^{2} } \over { r_{k} }} \\ =& 2 \sum_{k=1}^{n} a_{k}^{2} {{ \left( E Y_{k} \right)^{2} } \over { r_{k} }} \end{align*} $$ Substituting this into the expression for $\nu$, the factor of $2$ cancels, and replacing the unknown expectations $E Y_{k}$ and $E \sum_{k} a_{k} Y_{k}$ with the observed values $Y_{k}$ and $\sum_{k} a_{k} Y_{k}$ yields the following estimator: $$ \hat{\nu} = {{ \left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} } \over { \sum_{k=1}^{n} {{ a_{k}^{2} } \over { r_{k} }} Y_{k}^{2} }} $$
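As a rough numerical sanity check (NumPy; the weights and degrees of freedom are arbitrary), one can compare the moment-matched value $\nu = 2 \left( E \sum_{k} a_{k} Y_{k} \right)^{2} / \text{Var} \left( \sum_{k} a_{k} Y_{k} \right)$, computable exactly from $E Y_{k} = r_{k}$ and $\text{Var} Y_{k} = 2 r_{k}$, with the plug-in estimator $\hat{\nu}$ over many simulated samples. The estimator is not unbiased, but it stays on the same order as the moment-matched $\nu$.

```python
import numpy as np

rng = np.random.default_rng(2)

r = np.array([4, 6, 10])
a = np.array([0.05, 0.03, 0.04])    # arbitrary positive weights

# Moment-matched nu = 2 (E sum a_k Y_k)^2 / Var(sum a_k Y_k),
# using E Y_k = r_k and Var Y_k = 2 r_k.
mean_s = np.sum(a * r)
var_s = np.sum(a ** 2 * 2 * r)
nu_moment = 2 * mean_s ** 2 / var_s

# Plug-in Satterthwaite estimator computed on each simulated sample.
Y = np.column_stack([rng.chisquare(df, size=100_000) for df in r])
nu_hat = (Y @ a) ** 2 / (Y ** 2 @ (a ** 2 / r))
print(nu_moment, nu_hat.mean(), np.median(nu_hat))
```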


  1. Casella. (2001). Statistical Inference (2nd Edition): p314. ↩︎