
Satterthwaite Approximation

Buildup

Let’s assume that we have $n$ independent random variables $Y_{k} \sim \chi_{r_{k}}^{2}$, each following a chi-squared distribution with $r_{k}$ degrees of freedom. As is well known, their sum $\sum_{k=1}^{n} Y_{k}$ follows a chi-squared distribution with $\sum_{k=1}^{n} r_{k}$ degrees of freedom. This fact is particularly useful when dealing with the denominator of $\displaystyle {{W} \over {\sqrt{V / r}}}$, a quantity that follows a t-distribution. Unfortunately, a problem arises when this is applied directly to a pooled sample, that is, a mix of heterogeneous populations. For instance, given ratios or, more generally, weights $a_{1} , \cdots , a_{n} \in \mathbb{R}$, the distribution of
$$ \sum_{k=1}^{n} a_{k} Y_{k} $$
is quite hard to pin down: it looks like it should follow a chi-squared distribution, but determining its exact degrees of freedom is difficult. To address this, Satterthwaite proposed a remarkably good statistic under the assumption that $\sum a_{k} Y_{k}$ follows a scaled chi-squared distribution $\chi_{\nu}^{2} / \nu$. A key application of the Satterthwaite approximation is hypothesis testing for the difference between two population means with small samples.
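To make that application concrete, the sketch below computes the approximate degrees of freedom used in Welch’s two-sample t-test, where weighted sample variances play the role of $\sum a_{k} Y_{k}$. This is a minimal example assuming Python with numpy; the helper name `welch_satterthwaite_df` is my own, not a library function.

```python
import numpy as np

def welch_satterthwaite_df(x, y):
    """Approximate degrees of freedom for the unequal-variance
    two-sample t statistic (Welch-Satterthwaite equation)."""
    n1, n2 = len(x), len(y)
    v1 = np.var(x, ddof=1) / n1   # s1^2 / n1
    v2 = np.var(y, ddof=1) / n2   # s2^2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

rng = np.random.default_rng(42)
x = rng.normal(0.0, 1.0, size=8)     # small sample, variance 1
y = rng.normal(0.5, 3.0, size=12)    # small sample, variance 9
print(welch_satterthwaite_df(x, y))  # typically a non-integer value
```

The resulting degrees of freedom are generally not an integer, landing between $\min(n_{1}-1, n_{2}-1)$ and $n_{1}+n_{2}-2$.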

Formula

Let’s say $Y_{k} \sim \chi_{r_{k}}^{2}$ for $k = 1, \cdots , n$ and $a_{k} \in \mathbb{R}$. If we assume that for some $\nu > 0$
$$ \sum_{k=1}^{n} a_{k} Y_{k} \sim {{ \chi_{\nu}^{2} } \over { \nu }} , $$
then the following $\hat{\nu}$ can be used as an estimator of $\nu$:
$$ \hat{\nu} = {{ \left( \sum_{k} a_{k} Y_{k} \right)^{2} } \over { \sum_{k} {{ a_{k}^{2} } \over { r_{k} }} Y_{k}^{2} }} $$
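The formula translates directly into code. Here is a minimal sketch assuming numpy; the function name `satterthwaite_nu` is chosen here for illustration.

```python
import numpy as np

def satterthwaite_nu(a, y, r):
    """nu_hat = (sum_k a_k Y_k)^2 / sum_k (a_k^2 / r_k) Y_k^2."""
    a, y, r = map(np.asarray, (a, y, r))
    return np.sum(a * y) ** 2 / np.sum(a ** 2 / r * y ** 2)

rng = np.random.default_rng(0)
r = np.array([5.0, 10.0])
y = rng.chisquare(r)          # one draw of each Y_k ~ chi^2_{r_k}
a = np.array([0.04, 0.08])    # weights chosen so that sum_k a_k r_k = 1
print(satterthwaite_nu(a, y, r))
```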

Derivation¹

Method of Moments

We begin with the method of moments.

Since
$$ \sum_{k=1}^{n} a_{k} Y_{k} \sim {{ \chi_{\nu}^{2} } \over { \nu }} $$
and the mean of the chi-squared distribution $\chi_{\nu}^{2}$ is $\nu$, we have $E \left( \chi_{\nu}^{2} / \nu \right) = 1$, so the 1st moment gives
$$ E \sum_{k=1}^{n} a_{k} Y_{k} = 1 . \tag{1} $$
Since each $Y_{k}$ satisfies $E Y_{k} = r_{k}$, this means
$$ \begin{align*} 1 =& E \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) \\ =& \sum_{k=1}^{n} a_{k} E Y_{k} \\ =& \sum_{k=1}^{n} a_{k} r_{k} . \end{align*} $$
Given that the mean of $\chi_{\nu}^{2}$ is $\nu$ and its variance is $2\nu$, so that $E \left[ \left( \chi_{\nu}^{2} \right)^{2} \right] = 2\nu + \nu^{2}$, the 2nd moment gives
$$ \begin{align*} E \left[ \left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} \right] =& E \left[ \left( {{ \chi_{\nu}^{2} } \over { \nu }} \right)^{2} \right] \\ =& {{ 1 } \over { \nu^{2} }} E \left[ \left( \chi_{\nu}^{2} \right)^{2} \right] \\ =& {{ 1 } \over { \nu^{2} }} \left[ 2\nu + \nu^{2} \right] \\ =& {{ 2 } \over { \nu }} + 1 . \end{align*} $$
Replacing the expectation on the left with the observed value and solving for $\nu$ yields the estimator
$$ \hat{\nu} = {{ 2 } \over { \left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} - 1 }} $$
This is a decent estimator, but its denominator is problematic: the estimate diverges as $\left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2}$ approaches $1$ and becomes negative whenever it falls below $1$. To remove this risk, let’s take a closer look at $\left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2}$.
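The risk is easy to observe numerically. The following simulation sketch (assuming numpy, with weights chosen so that $\sum_{k} a_{k} r_{k} = 1$, as the 1st moment requires) shows that the naive estimator frequently goes negative and occasionally blows up.

```python
import numpy as np

rng = np.random.default_rng(1)
r = np.array([5.0, 10.0])
a = np.array([0.04, 0.08])                # sum_k a_k r_k = 1, matching (1)

ys = rng.chisquare(r, size=(100_000, 2))  # rows are draws of (Y_1, Y_2)
s = ys @ a                                # draws of sum_k a_k Y_k
nu_naive = 2.0 / (s ** 2 - 1.0)

print(np.mean(nu_naive < 0))   # a sizable fraction of estimates is negative
print(np.abs(nu_naive).max())  # draws with s^2 near 1 explode
```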

Correction

Since $E \sum_{k=1}^{n} a_{k} Y_{k} = 1$ by $(1)$, the variance identity $E Z^{2} = \operatorname{Var} Z + (EZ)^{2}$ gives
$$ \begin{align*} E \left[ \left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} \right] =& \operatorname{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) + \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} \\ =& \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} \left[ {{ \operatorname{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) } \over { \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} }} + 1 \right] \\ =& 1^{2} \cdot \left[ {{ \operatorname{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) } \over { \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} }} + 1 \right] . \end{align*} $$
Equating this with the $2/\nu + 1$ obtained above,
$$ {{ 2 } \over { \nu }} + 1 = {{ \operatorname{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) } \over { \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} }} + 1 , $$
and solving for $\nu$ gives
$$ \nu = {{ 2 \left( E \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} } \over { \operatorname{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) }} $$
To compute the variance in the denominator, note that $\operatorname{Var} Y_{k} = 2 r_{k} = 2 \left( E Y_{k} \right)^{2} / r_{k}$ since $E Y_{k} = r_{k}$, so by independence
$$ \begin{align*} \operatorname{Var} \left( \sum_{k=1}^{n} a_{k} Y_{k} \right) =& \sum_{k=1}^{n} a_{k}^{2} \operatorname{Var} Y_{k} \\ =& \sum_{k=1}^{n} a_{k}^{2} {{ 2 \left( E Y_{k} \right)^{2} } \over { r_{k} }} \\ =& 2 \sum_{k=1}^{n} a_{k}^{2} {{ \left( E Y_{k} \right)^{2} } \over { r_{k} }} . \end{align*} $$
Substituting this in, the factor $2$ cancels, and replacing each $E Y_{k}$ with the observed $Y_{k}$ yields the estimator
$$ \hat{\nu} = {{ \left( \sum_{k=1}^{n} a_{k} Y_{k} \right)^{2} } \over { \sum_{k=1}^{n} {{ a_{k}^{2} } \over { r_{k} }} Y_{k}^{2} }} $$
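As a quick sanity check, one can compare $\hat{\nu}$ by simulation against the moment-matched target $\nu = 2 \left( E \sum_{k} a_{k} Y_{k} \right)^{2} / \operatorname{Var} \left( \sum_{k} a_{k} Y_{k} \right)$; a minimal sketch assuming numpy. Since the numerator of $\hat{\nu}$ is a square and its denominator is a sum of positive terms, the estimate can no longer be negative or blow up near $\left( \sum_{k} a_{k} Y_{k} \right)^{2} = 1$; it is not unbiased, though, so only rough agreement with the target should be expected.

```python
import numpy as np

rng = np.random.default_rng(2)
r = np.array([5.0, 10.0])
a = np.array([0.04, 0.08])                 # again sum_k a_k r_k = 1

# moment-matched target: nu = 2 (E sum)^2 / Var(sum), with Var Y_k = 2 r_k
nu_target = 2 * np.sum(a * r) ** 2 / (2 * np.sum(a ** 2 * r))

ys = rng.chisquare(r, size=(100_000, 2))
nu_hat = (ys @ a) ** 2 / np.sum(a ** 2 / r * ys ** 2, axis=1)

print(nu_target, nu_hat.mean())  # rough agreement (the estimator is biased)
print(nu_hat.min())              # always strictly positive, never divergent
```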


  1. Casella. (2001). Statistical Inference (2nd Edition): p. 314. ↩︎