Proof of Cochran's Theorem

Theorem

Let a sample $\mathbf{X} = \left( X_{1} , \cdots , X_{n} \right)$ be iid and follow a normal distribution, $X_{1} , \cdots , X_{n} \overset{\text{iid}}{\sim} N \left( 0, \sigma^{2} \right)$. For symmetric matrices $A_{1} , \cdots , A_{k} \in \mathbb{R}^{n \times n}$ with ranks $r_{j}$, suppose the random variables $Q_{1} , \cdots , Q_{k}$ are expressed as the random vector quadratic forms $Q_{j} := \mathbf{X}^{T} A_{j} \mathbf{X}$, and that the sum of squares of the sample satisfies $\sum_{i=1}^{n} X_{i}^{2} = \sum_{j=1}^{k} Q_{j}$. Then the following holds.
$$ \forall j , {\frac{ Q_{j} }{ \sigma^{2} }} \sim \chi^{2} \left( r_{j} \right) \land \forall j_{1} \ne j_{2} , Q_{j_{1}} \perp Q_{j_{2}} \iff \sum_{j=1}^{k} r_{j} = n $$
In other words, the condition that the $Q_{j}$ are mutually independent and each $Q_{j} / \sigma^{2}$ follows the chi-square distribution $\chi^{2} \left( r_{j} \right)$ is equivalent to the sum of the ranks $r_{j}$ equaling the sample size $n$.

Explanation

This theorem provides the theoretical foundation for the analysis of variance, where the F-test is used.
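As a concrete illustration (a sketch of my own, not part of the original post): the classical decomposition $\sum_{i=1}^{n} X_{i}^{2} = n \bar{X}^{2} + \sum_{i=1}^{n} \left( X_{i} - \bar{X} \right)^{2}$ corresponds to $A_{1} = J_{n} / n$ (rank $1$) and $A_{2} = I_{n} - J_{n} / n$ (rank $n - 1$), where $J_{n}$ is the all-ones matrix. Since $1 + (n - 1) = n$, Cochran's theorem says $Q_{1} / \sigma^{2} \sim \chi^{2} (1)$, $Q_{2} / \sigma^{2} \sim \chi^{2} (n - 1)$, and $Q_{1} \perp Q_{2}$. The simulation below checks this numerically; all variable names are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, sigma, reps = 10, 2.0, 20_000

# Classical decomposition: A1 = J_n / n (rank 1), A2 = I_n - J_n / n (rank n - 1)
A1 = np.full((n, n), 1.0 / n)
A2 = np.eye(n) - A1

X = rng.normal(0.0, sigma, size=(reps, n))            # each row is one sample
Q1 = np.einsum("ri,ij,rj->r", X, A1, X) / sigma**2    # X^T A1 X / sigma^2
Q2 = np.einsum("ri,ij,rj->r", X, A2, X) / sigma**2    # X^T A2 X / sigma^2

# Marginal distributions: should be consistent with chi^2(1) and chi^2(n-1)
print(stats.kstest(Q1, "chi2", args=(1,)).pvalue)
print(stats.kstest(Q2, "chi2", args=(n - 1,)).pvalue)

# Independence (a necessary consequence): sample correlation should be near 0
print(np.corrcoef(Q1, Q2)[0, 1])
```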

Proof

$(\implies)$ Assume that the $Q_{j}$ are mutually independent and that $Q_{j} / \sigma^{2} \sim \chi^{2} \left( r_{j} \right)$ holds for every $j$.

Addition of random variables: If $X_{i} \sim \chi^{2} \left( r_{i} \right)$ are mutually independent, then
$$ \sum_{i=1}^{n} X_{i} \sim \chi^{2} \left( \sum_{i=1}^{n} r_{i} \right) $$

Since the $Q_{j} / \sigma^{2}$ are independent and each follows a chi-square distribution with $r_{j}$ degrees of freedom, their sum also follows a chi-square distribution, as below.
$$ \sum_{j=1}^{k} {\frac{ Q_{j} }{ \sigma^{2} }} \sim \chi^{2} \left( \sum_{j=1}^{k} r_{j} \right) $$
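As a quick numerical sanity check (my own sketch, not part of the proof), the sum of independent chi-square draws can be compared against $\chi^{2} \left( \sum_{j} r_{j} \right)$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
ranks = [2, 3, 5]                                   # example degrees of freedom
total = sum(ranks)

# Sum of independent chi-square variables with the given degrees of freedom
samples = sum(rng.chisquare(r, size=100_000) for r in ranks)

# Kolmogorov-Smirnov test against chi^2(sum of ranks); a large p-value is consistent
print(stats.kstest(samples, "chi2", args=(total,)).pvalue)
```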

Derivation of the chi-square distribution from the standard normal distribution: If $X \sim N \left( \mu , \sigma^{2} \right)$, then
$$ V = \left( \frac{X - \mu}{\sigma} \right)^{2} \sim \chi^{2} (1) $$

Since $X_{1} , \cdots , X_{n}$ follow a normal distribution, $X_{i}^{2} / \sigma^{2} \sim \chi^{2} (1)$ holds for each $i$, and since the $X_{i}$ are independent, their sum follows a chi-square distribution as shown below.
$$ \sum_{i=1}^{n} {\frac{ X_{i}^{2} }{ \sigma^{2} }} \sim \chi^{2} \left( n \right) $$
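A minimal numerical check of this step (again my own sketch, under the same $N(0, \sigma^{2})$ assumption as above):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, sigma = 10, 2.0

# Sum of squares of n iid N(0, sigma^2) draws, scaled by sigma^2
X = rng.normal(0.0, sigma, size=(100_000, n))
S = (X**2).sum(axis=1) / sigma**2

# Should be consistent with chi^2(n)
print(stats.kstest(S, "chi2", args=(n,)).pvalue)
```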

Since $\sum_{i=1}^{n} X_{i}^{2} = \sum_{j=1}^{k} Q_{j}$ was given in the premise, the same random variable follows both $\chi^{2} (n)$ and $\chi^{2} \left( \sum_{j=1}^{k} r_{j} \right)$, so $n = \sum_{j=1}^{k} r_{j}$ must hold.


$(\impliedby)$ Assume that $\sum_{j=1}^{k} r_{j} = n$ holds.

$$ \begin{align*} \sum_{j=1}^{k} Q_{j} =& \mathbf{X}^{T} \left( A_{1} + \cdots + A_{k} \right) \mathbf{X} \\ =& \mathbf{X}^{T} \mathbf{X} \\ =& \sum_{i=1}^{n} X_{i}^{2} \end{align*} $$
Since $\sum_{i=1}^{n} X_{i}^{2} = \sum_{j=1}^{k} Q_{j}$ holds by the premise, $\mathbf{X}^{T} \left( A_{1} + \cdots + A_{k} \right) \mathbf{X} = \mathbf{X}^{T} I_{n} \mathbf{X}$ for every $\mathbf{X}$, and since the $A_{j}$ are symmetric it follows that $I_{n} = \sum_{j=1}^{k} A_{j}$. Defining the matrix $B_{j} = I_{n} - A_{j}$, $B_{j}$ equals the sum of all the matrices $A_{1} , \cdots , A_{k}$ except $A_{j}$, that is, $B_{j} = \sum_{i \ne j} A_{i}$.
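Continuing the classical two-term example from the Explanation (an illustration of my own, with $A_{1} = J_{n} / n$ and $A_{2} = I_{n} - J_{n} / n$), the identities $I_{n} = A_{1} + A_{2}$ and $B_{1} = A_{2}$ can be checked directly:

```python
import numpy as np

n = 10
A1 = np.full((n, n), 1.0 / n)   # J_n / n, rank 1
A2 = np.eye(n) - A1             # I_n - J_n / n, rank n - 1

B1 = np.eye(n) - A1             # B_j = I_n - A_j

print(np.allclose(A1 + A2, np.eye(n)))   # I_n = A_1 + A_2
print(np.allclose(B1, A2))               # B_1 equals the sum of the remaining A's
```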

Subadditivity of matrix rank: The rank of a matrix is subadditive. That is, for two matrices $A, B$, the following holds.
$$ \rank \left( A + B \right) \le \rank A + \rank B $$

Defining $R_{j_{0}}$ as the rank of $B_{j_{0}}$, the rank of the matrix sum $B_{j_{0}} = \sum_{j \ne j_{0}} A_{j}$ is less than or equal to the sum of the individual ranks, yielding the following inequality.
$$ R_{j_{0}} = \rank B_{j_{0}} = \rank \left( \sum_{j \ne j_{0}} A_{j} \right) \le \sum_{j \ne j_{0}} r_{j} = \sum_{j=1}^{k} r_{j} - r_{j_{0}} = n - r_{j_{0}} $$
On the other hand, since $I_{n} = A_{j_{0}} + B_{j_{0}}$, subadditivity also gives $n \le r_{j_{0}} + R_{j_{0}} \implies n - r_{j_{0}} \le R_{j_{0}}$, so exactly $R_{j_{0}} = n - r_{j_{0}}$ holds.

Since $B_{j_{0}}$ is symmetric with rank $n - r_{j_{0}}$, it has exactly $r_{j_{0}}$ eigenvalues equal to $0$. The eigenvalues $\lambda$ of $B_{j_{0}}$ must satisfy $\det \left( B_{j_{0}} - \lambda I \right) = 0$, which can be rewritten as follows given $B_{j_{0}} = I_{n} - A_{j_{0}}$.
$$ \det \left( I_{n} - A_{j_{0}} - \lambda I_{n} \right) = 0 \iff \det \left( A_{j_{0}} - \left( 1 - \lambda \right) I_{n} \right) = 0 $$
Thus $1 - \lambda$ is an eigenvalue of $A_{j_{0}}$ whenever $\lambda$ is an eigenvalue of $B_{j_{0}}$, so the $r_{j_{0}}$ zero eigenvalues of $B_{j_{0}}$ give $A_{j_{0}}$ exactly $r_{j_{0}}$ eigenvalues equal to $1$. Meanwhile, since $A_{j_{0}}$ is symmetric with rank $r_{j_{0}}$, it has exactly $n - r_{j_{0}}$ zero eigenvalues; these two groups account for all $n$ eigenvalues, so $A_{j_{0}}$ has exactly $r_{j_{0}}$ eigenvalues equal to $1$ and the rest are all $0$.
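For the same illustrative matrices (again a sketch of my own), numpy confirms the rank and eigenvalue claims: $B_{1}$ has rank $n - r_{1}$ with exactly $r_{1}$ zero eigenvalues, and $A_{1}$ has eigenvalues only $0$ and $1$:

```python
import numpy as np

n = 10
A1 = np.full((n, n), 1.0 / n)   # rank r_1 = 1
B1 = np.eye(n) - A1             # rank should be n - r_1 = 9

print(np.linalg.matrix_rank(B1))                      # 9
print(np.sum(np.isclose(np.linalg.eigvalsh(B1), 0)))  # exactly r_1 = 1 zero eigenvalue

# Eigenvalues of A1 are 1 minus those of B1: only 0's and 1's
eig_A1 = np.linalg.eigvalsh(A1)
print(np.allclose(np.sort(eig_A1), [0.0] * (n - 1) + [1.0]))
```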

Symmetric real matrix whose only eigenvalues are $0$ and $1$: If a symmetric matrix $A \in \mathbb{R}^{n \times n}$ has all eigenvalues equal to $0$ or $1$, then $A$ is an idempotent matrix.

Condition for the chi-square distribution of a quadratic form of a normal random vector: Let a sample $\mathbf{X} = \left( X_{1} , \cdots , X_{n} \right)$ be iid following a normal distribution, $X_{1} , \cdots , X_{n} \overset{\text{iid}}{\sim} N \left( 0, \sigma^{2} \right)$. For a symmetric matrix $A \in \mathbb{R}^{n \times n}$ with rank $r \le n$, if the random vector quadratic form is defined as $Q = \sigma^{-2} \mathbf{X}^{T} A \mathbf{X}$, then the following holds.
$$ Q \sim \chi^{2} (r) \iff A^{2} = A $$

Every one of the symmetric real matrices $A_{1} , \cdots , A_{k}$ is an idempotent matrix since its eigenvalues are only $0$ and $1$, and since $A_{j}$ has rank $r_{j}$, $Q_{j} / \sigma^{2}$ follows the chi-square distribution $\chi^{2} \left( r_{j} \right)$.
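To illustrate the idempotency criterion numerically (my own sketch), one can contrast an idempotent matrix, whose quadratic form fits $\chi^{2} \left( \rank A \right)$, with a non-idempotent symmetric matrix, whose quadratic form does not:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, sigma, reps = 10, 2.0, 50_000

A_idem = np.eye(n) - np.full((n, n), 1.0 / n)   # idempotent, rank n - 1
A_not = 2.0 * A_idem                            # symmetric but not idempotent

X = rng.normal(0.0, sigma, size=(reps, n))
Q_idem = np.einsum("ri,ij,rj->r", X, A_idem, X) / sigma**2
Q_not = np.einsum("ri,ij,rj->r", X, A_not, X) / sigma**2

print(np.allclose(A_idem @ A_idem, A_idem))                 # True
print(stats.kstest(Q_idem, "chi2", args=(n - 1,)).pvalue)   # large: fits chi^2(n-1)
print(stats.kstest(Q_not, "chi2", args=(n - 1,)).pvalue)    # tiny: does not fit
```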

Hogg-Craig theorem: Let a sample $\mathbf{X} = \left( X_{1} , \cdots , X_{n} \right)$ be iid following a normal distribution, $X_{1} , \cdots , X_{n} \overset{\text{iid}}{\sim} N \left( 0, \sigma^{2} \right)$. For symmetric matrices $A_{1} , \cdots , A_{k} \in \mathbb{R}^{n \times n}$, suppose the random variables $Q_{1} , \cdots , Q_{k}$ are expressed as the random vector quadratic forms $Q_{i} := \mathbf{X}^{T} A_{i} \mathbf{X}$, and define the symmetric matrix $A$ and the random variable $Q$ as follows.
$$ \begin{align*} A =& A_{1} + \cdots + A_{k} \\ Q =& Q_{1} + \cdots + Q_{k} \end{align*} $$
If $Q / \sigma^{2}$ follows the chi-square distribution $\chi^{2} (r)$, if $Q_{i} / \sigma^{2} \sim \chi^{2} \left( r_{i} \right)$ for $i = 1 , \cdots , k-1$, and if $Q_{k} \ge 0$, then $Q_{1} , \cdots , Q_{k}$ are mutually independent and $Q_{k} / \sigma^{2}$ follows the chi-square distribution $\chi^{2} \left( r_{k} \right)$ with $r_{k} = r - r_{1} - \cdots - r_{k-1}$ degrees of freedom.

The conditions of the Hogg-Craig theorem are met: $Q / \sigma^{2} = \sum_{i=1}^{n} X_{i}^{2} / \sigma^{2} \sim \chi^{2} (n)$, each $Q_{j} / \sigma^{2} \sim \chi^{2} \left( r_{j} \right)$ as shown above, and $Q_{k} = \mathbf{X}^{T} A_{k} \mathbf{X} \ge 0$ since the symmetric idempotent matrix $A_{k}$ is positive semidefinite. Therefore, by the Hogg-Craig theorem, $Q_{1} , \cdots , Q_{k}$ are mutually independent.
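To close with a $k = 3$ illustration of the full statement (again a sketch of my own, with illustrative block-projection matrices), take three coordinate projections whose ranks sum to $n$; the resulting quadratic forms should each fit the stated chi-square distribution and be pairwise uncorrelated, a necessary consequence of independence:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, sigma, reps = 10, 2.0, 50_000

# Three orthogonal coordinate projections with ranks 3 + 3 + 4 = n
blocks = [range(0, 3), range(3, 6), range(6, 10)]
As = []
for b in blocks:
    A = np.zeros((n, n))
    A[list(b), list(b)] = 1.0   # 1's on the diagonal entries of the block
    As.append(A)

X = rng.normal(0.0, sigma, size=(reps, n))
Qs = [np.einsum("ri,ij,rj->r", X, A, X) / sigma**2 for A in As]

for Q, b in zip(Qs, blocks):
    print(stats.kstest(Q, "chi2", args=(len(b),)).pvalue)   # each fits chi^2(rank)

# Pairwise sample correlations should all be near 0
C = np.corrcoef(Qs)
print(C[0, 1], C[0, 2], C[1, 2])
```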