In the case of simple regression analysis the proof is not difficult, but in multiple regression analysis the supporting theorems involve matrix calculus and partial derivatives, making it more challenging than it looks. In fact, even when searching for proofs online, the step $\sum_{k=1}^{n} y_k = \sum_{k=1}^{n} \hat{y}_k$ is often skipped, with only the intuition behind the equation shown. If you are just starting to learn regression analysis, it is fine to simply acknowledge that this identity exists and come back later for a deeper understanding.
Part 1. $SST = SSR + SSE + \cdots$
$$
\begin{aligned}
SST &= \sum_{k=1}^{n} (y_k - \overline{y})^2 \\
&= \sum_{k=1}^{n} (y_k - \hat{y}_k + \hat{y}_k - \overline{y})^2 \\
&= \sum_{k=1}^{n} (\hat{y}_k - \overline{y})^2 + \sum_{k=1}^{n} (y_k - \hat{y}_k)^2 + 2 \sum_{k=1}^{n} (y_k - \hat{y}_k)(\hat{y}_k - \overline{y}) \\
&= SSR + SSE + 2 \sum_{k=1}^{n} (y_k - \hat{y}_k)(\hat{y}_k - \overline{y})
\end{aligned}
$$
Thus, it remains to show that the last term vanishes. Expanding it,

$$
\sum_{k=1}^{n} (y_k - \hat{y}_k)(\hat{y}_k - \overline{y}) = \sum_{k=1}^{n} (y_k - \hat{y}_k)\hat{y}_k - \overline{y} \sum_{k=1}^{n} (y_k - \hat{y}_k)
$$

so proving that each of the two terms on the right-hand side is $0$, which is done in Parts 2 and 3 below, completes the proof.
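Before moving on, here is a minimal numerical sketch of this decomposition in NumPy, assuming synthetic data; the simulated model and all variable names are illustrative and not part of the proof:

```python
import numpy as np

# Illustrative check of Part 1: SST = SSR + SSE + 2 * (cross term),
# and the cross term itself vanishes for an OLS fit with an intercept.
rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])  # design matrix with intercept
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # OLS estimate (X^T X)^{-1} X^T y
y_hat = X @ beta_hat
y_bar = y.mean()

sst = np.sum((y - y_bar) ** 2)
ssr = np.sum((y_hat - y_bar) ** 2)
sse = np.sum((y - y_hat) ** 2)
cross = np.sum((y - y_hat) * (y_hat - y_bar))

print(np.isclose(sst, ssr + sse + 2 * cross))  # True by pure algebra
print(np.isclose(cross, 0.0))                  # True, which Parts 2 and 3 prove
```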
Part 2. $\overline{y} \sum_{k=1}^{n} (y_k - \hat{y}_k) = 0$
Sum of fitted values in multiple regression analysis: if $\hat{\beta}$ is the best linear unbiased estimator, then the sum of the $y_k$ and the sum of the fitted values $\hat{y}_k = \hat{\beta}_0 + \sum_{j=1}^{p} \hat{\beta}_j x_{kj}$ are equal:

$$
\sum_{k=1}^{n} y_k = \sum_{k=1}^{n} \hat{y}_k
$$
By this auxiliary theorem, $\sum_{k=1}^{n} y_k = \sum_{k=1}^{n} \hat{y}_k$, and therefore $\sum_{k=1}^{n} (y_k - \hat{y}_k) = 0$, so that $\overline{y} \sum_{k=1}^{n} (y_k - \hat{y}_k) = 0$ as well. Although this post appears to gloss over the step by simply invoking the auxiliary theorem, it is actually a rather critical part, so make sure to thoroughly understand the theorem's proof.
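As a quick sanity check of the auxiliary theorem, the sketch below (with made-up data) confirms that the residuals sum to zero when the design matrix contains an intercept column, and generally do not when it is dropped, which is exactly why the theorem's proof deserves attention:

```python
import numpy as np

# Residuals sum to zero only because the model includes an intercept column.
rng = np.random.default_rng(1)
n = 50
x = rng.normal(size=(n, 2))
y = 3.0 + x @ np.array([1.5, -2.0]) + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])              # with intercept
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.isclose((y - X @ beta_hat).sum(), 0.0))  # True: sum of y_k equals sum of y^_k

beta_no, *_ = np.linalg.lstsq(x, y, rcond=None)   # without intercept
print(np.isclose((y - x @ beta_no).sum(), 0.0))   # generally False
```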
Part 3. $\sum_{k=1}^{n} (y_k - \hat{y}_k)\hat{y}_k = 0$
$$
\begin{bmatrix} \hat{y}_1 \\ \vdots \\ \hat{y}_n \end{bmatrix} = X \hat{\beta}
$$

The fitted value vector $\hat{y}_1, \cdots, \hat{y}_n$ can be expressed as the product of the design matrix $X$ and $\hat{\beta}$, as shown above. Let us expand the expression using the identity matrix $E$ and the zero matrix $O$ as follows.
$$
\begin{aligned}
\sum_{k=1}^{n} (y_k - \hat{y}_k)\hat{y}_k
&= \begin{bmatrix} y_1 - \hat{y}_1 & \cdots & y_n - \hat{y}_n \end{bmatrix} \begin{bmatrix} \hat{y}_1 \\ \vdots \\ \hat{y}_n \end{bmatrix} \\
&= \left( Y^T - (X\hat{\beta})^T \right) \begin{bmatrix} \hat{y}_1 \\ \vdots \\ \hat{y}_n \end{bmatrix} \\
&= (Y - X\hat{\beta})^T X\hat{\beta} \\
&= \left( Y - X(X^TX)^{-1}X^TY \right)^T X\hat{\beta} \\
&= \left( \left[ E - X(X^TX)^{-1}X^T \right] Y \right)^T X\hat{\beta} \\
&= Y^T \left( E - X(X^TX)^{-1}X^T \right)^T X\hat{\beta} \\
&= Y^T \left( X^T \left[ E - X(X^TX)^{-1}X^T \right] \right)^T \hat{\beta} \\
&= Y^T \left( X^T - X^TX(X^TX)^{-1}X^T \right)^T \hat{\beta} \\
&= Y^T \left( X^T - X^T \right)^T \hat{\beta} \\
&= Y^T O^T \hat{\beta} \\
&= 0
\end{aligned}
$$
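The chain above hinges on the identity $X^T \left[ E - X(X^TX)^{-1}X^T \right] = O$. The following sketch (again with illustrative data) verifies it numerically, along with the resulting orthogonality of the residual vector and the fitted vector:

```python
import numpy as np

# Checking X^T (E - X (X^T X)^{-1} X^T) = O numerically.
rng = np.random.default_rng(2)
n, p = 30, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
E = np.eye(n)                                # identity matrix E from the text
H = X @ np.linalg.inv(X.T @ X) @ X.T         # projection ("hat") matrix
print(np.allclose(X.T @ (E - H), 0.0))       # True: we recover the zero matrix O

# Hence the residual vector Y - X beta^ is orthogonal to the fitted vector X beta^.
Y = rng.normal(size=n)
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(np.isclose((Y - X @ beta_hat) @ (X @ beta_hat), 0.0))  # True
```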
Consequently, we obtain the following equation.
$$
SST = SSR + SSE
$$