
Satterthwaite Approximation

Buildup

Suppose we have $n$ independent random variables $Y_{k} \sim \chi_{r_{k}}^{2}$, each following a chi-squared distribution with $r_{k}$ degrees of freedom. As is well known, their sum $\sum_{k=1}^{n} Y_{k}$ follows a chi-squared distribution with $\sum_{k=1}^{n} r_{k}$ degrees of freedom. This fact is especially useful when handling the denominator of $\displaystyle {{W} \over {\sqrt{V / r}}}$, the quantity that follows a t-distribution, since there $V$ is a chi-squared variable with $r$ degrees of freedom. Unfortunately, it cannot be applied directly to a pooled sample, that is, a mixture of heterogeneous populations. For instance, given proportions or, more generally, weights $a_{1} , \cdots , a_{n} \in \mathbb{R}$, the distribution of $$ \sum_{k=1}^{n} a_{k} Y_{k} $$ is hard to pin down: it looks like it should be chi-squared, but its exact degrees of freedom are difficult to determine. To address this, Satterthwaite proposed a reasonably good estimator under the assumption that $\sum a_{k} Y_{k}$ approximately follows a scaled chi-squared distribution $\chi_{\nu}^{2} / \nu$ for some degrees of freedom $\nu$. A key application of the Satterthwaite approximation is hypothesis testing for the difference between two population means with small samples.
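To see how that application looks in practice, here is a minimal Python sketch (not taken from the reference) of the Welch–Satterthwaite degrees of freedom used in the two-sample t-test with unequal variances; the data and variable names are hypothetical, and each $s_{k}^{2}/n_{k}$ term plays the role of a scaled chi-squared variable with $r_{k} = n_{k} - 1$ degrees of freedom.

```python
import numpy as np
from scipy import stats

# Hypothetical small samples from two populations with unequal variances
x = np.array([4.2, 5.1, 3.8, 4.9, 5.4])
y = np.array([6.0, 7.3, 5.8, 6.9])

n1, n2 = len(x), len(y)
v1, v2 = x.var(ddof=1), y.var(ddof=1)          # unbiased sample variances

# Welch–Satterthwaite degrees of freedom for the difference of means:
# nu_hat = (v1/n1 + v2/n2)^2 / [ (v1/n1)^2/(n1-1) + (v2/n2)^2/(n2-1) ]
nu_hat = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)

t_stat = (x.mean() - y.mean()) / np.sqrt(v1 / n1 + v2 / n2)
p_value = 2 * stats.t.sf(abs(t_stat), df=nu_hat)   # two-sided p-value with nu_hat df
print(nu_hat, t_stat, p_value)
```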

Formula

Let $Y_{k} \sim \chi_{r_{k}}^{2}$ for $k = 1, \cdots , n$ and $a_{k} \in \mathbb{R}$. If we assume that for some $\nu > 0$ $$ \sum_{k=1}^{n} a_{k} Y_{k} \sim {{ \chi_{\nu}^{2} } \over { \nu }} $$ then the following $\hat{\nu}$ can be used as an estimator of $\nu$: $$ \hat{\nu} = {{ \left( \sum_{k} a_{k} Y_{k} \right)^{2} } \over { \sum_{k} {{ a_{k}^{2} } \over { r_{k} }} Y_{k}^{2} }} $$
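As a quick illustration of the formula itself, the following Python sketch computes $\hat{\nu}$ for hypothetical weights and chi-squared draws; the function name and the numbers are made up for the example.

```python
import numpy as np

def satterthwaite_nu(a, y, r):
    """nu_hat = (sum_k a_k Y_k)^2 / sum_k (a_k^2 Y_k^2 / r_k)."""
    a, y, r = map(np.asarray, (a, y, r))
    return (a @ y) ** 2 / np.sum(a ** 2 * y ** 2 / r)

rng = np.random.default_rng(0)
r = np.array([3, 5, 10])              # degrees of freedom r_k
a = np.array([0.5, 0.3, 0.2])         # weights a_k
y = rng.chisquare(r)                  # one draw of Y_k ~ chi^2_{r_k}
print(satterthwaite_nu(a, y, r))
```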

Derivation ¹

Method of Moments

We begin with the method of moments.

By assumption, $$ \sum_{k=1}^{n} a_{k} Y_{k} \sim {{ \chi_{\nu}^{2} } \over { \nu }} $$ Since the mean of the chi-squared distribution $\chi_{\nu}^{2}$ is $\nu$, we have $E \left( \chi_{\nu}^{2} / \nu \right) = 1$, and therefore $$ \begin{equation} E \sum_{k=1}^{n} a_{k} Y_{k} = 1 \label{1} \end{equation} $$ Moreover, since $E Y_{k} = r_{k}$ for each $Y_{k}$, the first moment gives $$ \begin{align*} 1 =& E \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) \\ =& \sum_{k=1}^{n} a_{k} E Y_{k} \\ =& \sum_{k=1}^{n} a_{k} r_{k} \end{align*} $$ Since the mean of $\chi_{\nu}^{2}$ is $\nu$ and its variance is $2\nu$, so that $E \left[ \left( \chi_{\nu}^{2} \right)^{2} \right] = 2\nu + \nu^{2}$, the second moment gives $$ \begin{align*} E \left[ \left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} \right] =& E \left[ \left( {{ \chi_{\nu}^{2} } \over { \nu }} \right)^{2} \right] \\ =& {{ 1 } \over { \nu^{2} }} E \left[ \left( \chi_{\nu}^{2} \right)^{2} \right] \\ =& {{ 1 } \over { \nu^{2} }} \left[ 2\nu + \nu^{2} \right] \\ =& {{ 2 } \over { \nu }} + 1 \end{align*} $$ Replacing the expectation on the left with the observed value $\left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2}$ and solving for $\nu$ yields the estimator $$ \hat{\nu} = {{ 2 } \over { \left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} - 1 }} $$ This is a reasonable estimator, but the denominator is problematic: it diverges as $\left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2}$ approaches $1$, and the estimate even becomes negative whenever $\left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} < 1$. To avoid this risk, let us take a closer look at $E \left[ \left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} \right]$.
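A small simulation makes the problem with this first estimator concrete. The sketch below uses hypothetical weights chosen so that $\sum_{k} a_{k} r_{k} = 1$, consistent with $\eqref{1}$, and counts how often $\hat{\nu} = 2 / \left[ \left( \sum_{k} a_{k} Y_{k} \right)^{2} - 1 \right]$ comes out negative or extremely large.

```python
import numpy as np

rng = np.random.default_rng(1)
r = np.array([3, 5])
a = np.array([0.1, 0.14])        # hypothetical weights: 0.1*3 + 0.14*5 = 1

samples = rng.chisquare(r, size=(10_000, 2)) @ a    # draws of sum_k a_k Y_k
naive = 2 / (samples ** 2 - 1)                      # nu_hat = 2 / ((sum a_k Y_k)^2 - 1)

print("fraction negative      :", np.mean(naive < 0))
print("fraction |nu_hat| > 100:", np.mean(np.abs(naive) > 100))
```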

Correction

Since $E \sum_{k=1}^{n} a_{k} Y_{k} = 1$ by $\eqref{1}$, the property of variances $E Z^{2} = \operatorname{Var} Z + (EZ)^{2}$ gives $$ \begin{align*} {{ 2 } \over { \nu }} + 1 =& E \left[ \left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} \right] \\ =& \operatorname{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) + \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} \\ =& \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} \left[ {{ \operatorname{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) } \over { \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} }} + 1 \right] \\ =& 1^{2} \cdot \left[ {{ \operatorname{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) } \over { \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} }} + 1 \right] \end{align*} $$ Solving $$ {{ 2 } \over { \nu }} + 1 = {{ \operatorname{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) } \over { \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} }} + 1 $$ for $\nu$ gives $$ \nu = {{ 2 \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} } \over { \operatorname{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) }} $$ For the variance in the denominator, note that $\operatorname{Var} Y_{k} = 2 r_{k}$ can be rewritten as $\operatorname{Var} Y_{k} = 2 \left( E Y_{k} \right)^{2} / r_{k}$ because $E Y_{k} = r_{k}$, so $$ \begin{align*} \operatorname{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) =& \sum_{k=1}^{n} a_{k}^{2} \operatorname{Var} Y_{k} \\ =& \sum_{k=1}^{n} a_{k}^{2} {{ 2 \left( E Y_{k} \right)^{2} } \over { r_{k} }} \\ =& 2 \sum_{k=1}^{n} a_{k}^{2} {{ \left( E Y_{k} \right)^{2} } \over { r_{k} }} \end{align*} $$ Substituting this into the expression for $\nu$, the factor of $2$ cancels, and estimating the remaining expectations $E \sum_{k} a_{k} Y_{k}$ and $E Y_{k}$ by the observed values $\sum_{k} a_{k} Y_{k}$ and $Y_{k}$ yields the following estimator: $$ \hat{\nu} = {{ \left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} } \over { \sum_{k=1}^{n} a_{k}^{2} {{ Y_{k}^{2} } \over { r_{k} }}}} $$
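To close the loop, here is a sketch (again with hypothetical weights satisfying $\sum_{k} a_{k} r_{k} = 1$) that computes $\nu = 2 \left( E \sum_{k} a_{k} Y_{k} \right)^{2} / \operatorname{Var} \left( \sum_{k} a_{k} Y_{k} \right)$ from the population quantities and checks by Monte Carlo that the first two moments of $\sum_{k} a_{k} Y_{k}$ match those of $\chi_{\nu}^{2} / \nu$, which is exactly what the method of moments enforces.

```python
import numpy as np

rng = np.random.default_rng(2)
r = np.array([3, 5, 10])
a = np.array([0.1, 0.06, 0.04])           # hypothetical weights: sum a_k r_k = 1

# Population-level nu from the derivation: nu = 2 (E sum)^2 / Var(sum)
mean_sum = a @ r                          # E sum a_k Y_k
var_sum = np.sum(a ** 2 * 2 * r)          # Var sum a_k Y_k, using Var Y_k = 2 r_k
nu = 2 * mean_sum ** 2 / var_sum

# chi^2_nu / nu has mean 1 and variance 2/nu; compare with simulated moments
samples = rng.chisquare(r, size=(100_000, 3)) @ a
print("target mean, variance   :", 1.0, 2 / nu)
print("simulated mean, variance:", samples.mean(), samples.var(ddof=1))
```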


  1. Casella. (2001). Statistical Inference (2nd Edition): p. 314. ↩︎