Student's t-test Proof
Theorem 1
If random variables $X_{1} , \cdots , X_{n}$ are iid and follow a normal distribution $N\left( \mu,\sigma^{2} \right)$, then
- (a): $$ \overline{X} \sim N\left( \mu , { {\sigma^2} \over {n} } \right) $$
- (b): $$ \overline{X} \perp S^2 $$
- (c): $$ (n-1) { {S^2} \over {\sigma^2} } \sim \chi^2 (n-1) $$
- (d): $$ T = { {\overline{X} - \mu } \over {S / \sqrt{n}} } \sim t(n-1) $$
- The sample mean $\overline{X}$ and the sample variance $S^{2}$, regarded as random variables, are defined as follows.
$$ \overline{X} := {{ 1 } \over { n }} \sum_{k=1}^{n} X_{k} \\ S^{2} := {{ 1 } \over { n-1 }} \sum_{k=1}^{n} \left( X_{k} - \overline{X} \right)^{2} $$
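For readers who want to compute these alongside the proof, here is a minimal NumPy sketch of the two definitions (the parameters $\mu = 3$, $\sigma = 2$, $n = 10$ are arbitrary illustrations); note that the $n-1$ denominator of $S^{2}$ corresponds to `ddof=1`.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=10)   # X_1, ..., X_n iid N(3, 4); arbitrary example

n = len(x)
x_bar = x.sum() / n                           # sample mean: (1/n) * sum of X_k
s2 = ((x - x_bar) ** 2).sum() / (n - 1)       # sample variance with the n-1 denominator

# NumPy's built-ins agree (ddof=1 selects the n-1 denominator):
assert np.isclose(x_bar, x.mean())
assert np.isclose(s2, x.var(ddof=1))
```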
Description
Although this result is used routinely among statisticians, it in fact has a name: Student's theorem. Since it is divided into four parts, it is difficult to cite any single part specifically.
(b) is a fact that can seem either obvious or strange: even though the sample mean and the sample variance are computed from the same data, they are independent.
Inference about the population mean for small samples
The proof of Student’s theorem is, in itself, the derivation of the hypothesis test for the population mean with a small sample.
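As an illustration of that use, the following is a hedged sketch of a small-sample test for the population mean; the data and the hypothesized mean $\mu_{0} = 5$ are invented for the example, and `scipy.stats.ttest_1samp` computes exactly the statistic $T$ of (d) and its $p$-value from $t(n-1)$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=5.5, scale=1.0, size=8)   # a small sample, n = 8

# H_0: mu = 5.0 vs H_1: mu != 5.0, using T = (X_bar - mu_0) / (S / sqrt(n))
t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)
print(f"T = {t_stat:.3f}, p = {p_value:.3f}")     # p-value comes from t(7)
```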
Proof
(a)
Since $\displaystyle \overline{X} = { { (X_1 + X_2 + \cdots + X_n )} \over {n}}$ and a sum of independent normal random variables is again normal, with the means and variances adding accordingly, scaling by $1/n$ gives $$ \overline{X} \sim N \left( \mu, {{1} \over {n}} \sigma^2 \right) $$
■
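A quick Monte Carlo sanity check of (a), with arbitrarily chosen $\mu = 3$, $\sigma = 2$, $n = 10$: the empirical mean and variance of $\overline{X}$ should land near $\mu$ and $\sigma^2 / n$.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 3.0, 2.0, 10, 200_000

# Each row is one sample X_1, ..., X_n; each row mean is one draw of X_bar
x_bars = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

print(x_bars.mean())   # ~ mu = 3.0
print(x_bars.var())    # ~ sigma^2 / n = 0.4
```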
(b)
- $\mathbf{0}$ represents the zero vector.
- $\mathbf{1} = (1, \cdots , 1) = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}$ represents a vector whose components are all $1$.
- $I$ represents an identity matrix.
- $A^{T}$ represents the transpose matrix of matrix $A$.
Let’s say $\displaystyle \mathbf{v} := {{ 1 } \over { n }} \mathbf{1}$.
Since the random vector $\mathbf{X} := \left( X_{1} , \cdots , X_{n} \right)$ follows a multivariate normal distribution $N_{n} \left( \mu \mathbf{1} , \sigma^{2} I \right)$, the sample mean can be written as $$ \begin{align*} \overline{X} =& {{ 1 } \over { n }} \left( X_{1} + \cdots + X_{n} \right) \\ =& {{ 1 } \over { n }} \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix} \begin{bmatrix} X_{1} \\ \vdots \\ X_{n} \end{bmatrix} \\ =& {{ 1 } \over { n }} \mathbf{1}^{T} \mathbf{X} \\ =& \mathbf{v}^{T} \mathbf{X} \end{align*} $$
Now, defining the random vector $\mathbf{Y} := \left( X_{1} - \overline{X} , \cdots , X_{n} - \overline{X} \right)$, the random vector $\mathbf{W}$ stacking $\overline{X}$ on top of $\mathbf{Y}$ can be represented as follows. $$ \mathbf{W} = \begin{bmatrix} \overline{X} \\ \mathbf{Y} \end{bmatrix} = \begin{bmatrix} \mathbf{v}^{T} \\ I - \mathbf{1} \mathbf{v}^{T} \end{bmatrix} \mathbf{X} $$
Since $\mathbf{W}$ is a linear transformation of a multivariate normal random vector, it too follows a multivariate normal distribution. Taking the expected value of the above equation gives the mean vector $$ E \mathbf{W} = \begin{bmatrix} \mathbf{v}^{T} \\ I - \mathbf{1} \mathbf{v}^{T} \end{bmatrix} \mu \mathbf{1} = \begin{bmatrix} \mu \\ \mathbf{0}_{n} \end{bmatrix} $$ and the covariance matrix $\Sigma$ is $$ \begin{align*} \Sigma =& \begin{bmatrix} \mathbf{v}^{T} \\ I - \mathbf{1} \mathbf{v}^{T} \end{bmatrix} \sigma^{2} I \begin{bmatrix} \mathbf{v}^{T} \\ I - \mathbf{1} \mathbf{v}^{T} \end{bmatrix}^{T} \\ =& \sigma^{2} \begin{bmatrix} 1/n & \mathbf{0}_{n}^{T} \\ \mathbf{0}_{n} & I - \mathbf{1} \mathbf{v}^{T} \end{bmatrix} \end{align*} $$ Because this covariance matrix is block diagonal and $\mathbf{W}$ is multivariate normal, $\overline{X}$ is independent of $\mathbf{Y}$. Since $$ S^{2} = {{ 1 } \over { n-1 }} \sum_{k=1}^{n} \left( X_{k} - \overline{X} \right)^{2} = {{ 1 } \over { n-1 }} \mathbf{Y}^{T} \mathbf{Y} $$ is a function of $\mathbf{Y}$ alone, $\overline{X} \perp S^{2}$ follows.
■
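Independence itself is not something a simulation can prove, but the zero off-diagonal block of $\Sigma$ predicts zero correlation between $\overline{X}$ and $S^2$, which is easy to observe numerically; a sketch with the same arbitrary parameters as before:

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, n, reps = 3.0, 2.0, 10, 200_000

samples = rng.normal(mu, sigma, size=(reps, n))
x_bar = samples.mean(axis=1)          # one X_bar per simulated sample
s2 = samples.var(axis=1, ddof=1)      # one S^2 per simulated sample

# The sample correlation between X_bar and S^2 should be near 0
print(np.corrcoef(x_bar, s2)[0, 1])
```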
(c)
Let $\displaystyle V := \sum_{i=1}^{n} \left( { {X_{i} - \mu } \over {\sigma} } \right) ^2$. Since $\displaystyle { {X_{i} - \mu } \over {\sigma} } \sim N(0,1)$ and the $X_{i}$ are independent, $V \sim \chi^2 (n)$. Meanwhile,
$$ \begin{align*} V =& \sum_{i=1}^{n} \left( { {X_{i} - \mu } \over {\sigma} } \right) ^2 \\ =& \sum_{i=1}^{n} \left( { { ( X_{i} -\overline{X} ) + ( \overline{X} - \mu ) } \over {\sigma} } \right) ^2 \\ =& \sum_{i=1}^{n} \left( { { X_{i} -\overline{X} } \over {\sigma} } \right) ^2 + \left( { { \overline{X} - \mu } \over {\sigma / \sqrt{n} } } \right) ^2 \end{align*} $$
where the cross term vanishes because $\displaystyle \sum_{i=1}^{n} \left( X_{i} - \overline{X} \right) = 0$. For the first term, $$ \sum_{i=1}^{n} \left( { { X_{i} -\overline{X} } \over {\sigma} } \right) ^2 = { {n-1} \over {\sigma^2} } \sum_{i=1}^{n} { { ( X_{i} -\overline{X} ) ^ 2 } \over {n-1} } = (n-1) { {S^2} \over {\sigma^2} } $$
To summarize, $$ V = (n-1) { {S^2} \over {\sigma^2} } + \left( { { \overline{X} - \mu } \over {\sigma / \sqrt{n} } } \right) ^2 $$
Since $V \sim \chi^2 (n)$, and by (a) of Student’s theorem $$ { { \overline{X} - \mu } \over {\sigma / \sqrt{n} } } \sim N(0,1), $$ the square of this standard normal variable follows a chi-squared distribution with one degree of freedom: $$ \left( { { \overline{X} - \mu } \over {\sigma / \sqrt{n} } } \right)^2 \sim \chi^2 (1) $$
By (b) of Student’s theorem, $\overline{X}$ and $S^2$ are independent, so taking the moment-generating function of both sides factors it as $$ (1-2t)^{-n/2} = E \left\{ \exp \left( (n-1) { {S^2} \over {\sigma^2} } t \right) \right\} (1-2t)^{-1/2} , \qquad t < {{1} \over {2}} $$
Therefore, the moment-generating function of $\displaystyle (n-1) { {S^2} \over {\sigma^2} }$ is $(1-2t)^{-(n-1)/2}$, which is exactly the moment-generating function of $\chi^2 (n-1)$, so $\displaystyle (n-1) { {S^2} \over {\sigma^2} } \sim \chi^2 (n-1)$.
■
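A numerical check of (c), again with arbitrary parameters: the simulated values of $(n-1)S^2/\sigma^2$ are compared against the $\chi^2 (n-1)$ distribution with a Kolmogorov–Smirnov test (a large $p$-value means no evidence against the claimed distribution).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
mu, sigma, n, reps = 3.0, 2.0, 10, 50_000

samples = rng.normal(mu, sigma, size=(reps, n))
v = (n - 1) * samples.var(axis=1, ddof=1) / sigma**2   # (n-1) S^2 / sigma^2

print(stats.kstest(v, "chi2", args=(n - 1,)))          # compare against chi^2(n-1)
```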
(d)
Derivation of Student’s t-distribution from the normal and chi-squared distributions: if $W \sim N(0,1)$ and $V \sim \chi^2 (r)$ are independent, then $$ T = { {W} \over {\sqrt{V/r} } } \sim t(r) $$
$$ T = { {\overline{X} - \mu } \over {S / \sqrt{n}} } = { {( \overline{X} - \mu ) / (\sigma / \sqrt{n}) } \over { \sqrt{ (n-1) S^2 / ( \sigma^2 ( n-1 ) ) } } } $$ By (a) of Student’s theorem, the numerator satisfies $\displaystyle { { \overline{X} - \mu } \over { \sigma / \sqrt{n} } } \sim N(0,1)$; by (c), $\displaystyle (n-1) { {S^2} \over {\sigma^2} } \sim \chi^2 (n-1)$; and by (b), the two are independent. Therefore $$ T \sim t(n-1) $$
■
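Finally, (d) can be checked the same way: simulate $T$ directly from its definition and compare against $t(n-1)$ (parameters arbitrary, as above).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
mu, sigma, n, reps = 3.0, 2.0, 10, 50_000

samples = rng.normal(mu, sigma, size=(reps, n))
t_vals = (samples.mean(axis=1) - mu) / np.sqrt(samples.var(axis=1, ddof=1) / n)

print(stats.kstest(t_vals, "t", args=(n - 1,)))   # compare against t(n-1)
```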
Hogg et al. (2013). Introduction to Mathematical Statistics (7th Edition): p195. ↩︎