$$
\begin{pmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{pmatrix} = \begin{pmatrix} 1 & x_{11} & \cdots & x_{p1} \\ 1 & x_{12} & \cdots & x_{p2} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1n} & \cdots & x_{pn} \end{pmatrix} \begin{pmatrix} \beta_{0} \\ \beta_{1} \\ \vdots \\ \beta_{p} \end{pmatrix} + \begin{pmatrix} \varepsilon_{1} \\ \varepsilon_{2} \\ \vdots \\ \varepsilon_{n} \end{pmatrix}
$$
Given $p$ independent variables and $n$ data points, the linear multiple regression model can be represented by a design matrix as shown above, which we denote simply as $Y = X \beta + \varepsilon$. The residuals are assumed to exhibit homoscedasticity, independence, and normality, meaning
$$
\varepsilon_{1} , \cdots , \varepsilon_{n} \overset{\text{iid}}{\sim} N \left( 0 , \sigma^{2} \right) \iff \varepsilon \sim N_{n} \left( \mathbf{0} , \sigma^{2} I_{n} \right)
$$
Under this assumption, the estimated regression coefficients
$$
\hat{\beta} = \left( \hat{\beta}_{0} , \hat{\beta}_{1} , \cdots , \hat{\beta}_{p} \right)^{T} = \left( X^{T} X \right)^{-1} X^{T} Y
$$
follow a multivariate normal distribution:
$$
\hat{\beta} \sim N_{1+p} \left( \beta , \sigma^{2} \left( X^{T} X \right)^{-1} \right)
$$
Moreover, $\hat{\beta}$ has the smallest variance among all linear unbiased estimators of $\beta$, and is therefore called the Best Linear Unbiased Estimator (BLUE).
The fact that the vector of regression coefficients follows a multivariate normal distribution is especially important for hypothesis tests on the regression coefficients, which is why the homoscedasticity, independence, and normality of the residuals must be diagnosed.
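To make the theorem concrete, here is a minimal numerical sketch under a hypothetical setup (the seed, dimensions, and coefficient values below are chosen purely for illustration): it builds a design matrix, simulates $Y = X \beta + \varepsilon$, and computes $\hat{\beta} = (X^{T} X)^{-1} X^{T} Y$ directly, cross-checking the result against numpy's least-squares solver.

```python
import numpy as np

rng = np.random.default_rng(42)

n, p = 100, 2                      # n data points, p independent variables
sigma = 0.5                        # residual standard deviation
beta = np.array([1.0, 2.0, -3.0])  # true (beta_0, beta_1, beta_2), hypothetical

# Design matrix: a column of ones followed by the p predictor columns
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
eps = rng.normal(0.0, sigma, size=n)   # eps ~ N_n(0, sigma^2 I_n)
Y = X @ beta + eps                     # Y = X beta + eps

# Closed-form estimate: beta_hat = (X^T X)^{-1} X^T Y
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ Y

# Cross-check against numpy's least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta_hat, np.allclose(beta_hat, beta_lstsq))
```

In practice one would rely on `np.linalg.lstsq` rather than forming $\left( X^{T} X \right)^{-1}$ explicitly, since explicit inversion is numerically less stable; the closed form is used here only because it is the object the theorem is about.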
Strategy: There is not really a strategy per se, as everything is easily derived assuming normality of the residuals. Since the design matrix $X$ is not a random variable but data fixed in matrix form, i.e., a constant, it's all just matrix calculations.
$$
\begin{align*} \hat{\beta} &= \left( X^{T} X \right)^{-1} X^{T} Y \\ &= \left( X^{T} X \right)^{-1} X^{T} \left( X \beta + \varepsilon \right) \\ &= I_{1+p} \beta + \left( X^{T} X \right)^{-1} X^{T} \varepsilon \end{align*}
$$
Thus $\hat{\beta}$ is a linear transformation of $\varepsilon$ plus a constant, and since $\varepsilon$ follows a multivariate normal distribution by assumption, $\hat{\beta}$ also follows a multivariate normal distribution.
■
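This normality can also be observed empirically. The following sketch (same hypothetical setup as before, with $X$ held fixed across replications) standardizes the coordinate $\hat{\beta}_{1}$ by its theoretical mean and variance and compares its empirical quantiles with those of a standard normal.

```python
import numpy as np

rng = np.random.default_rng(7)
n, p, sigma = 100, 2, 0.5
beta = np.array([1.0, 2.0, -3.0])                           # hypothetical
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # fixed design
XtX_inv = np.linalg.inv(X.T @ X)

# Marginally, beta_hat[1] ~ N(beta[1], sigma**2 * XtX_inv[1, 1])
n_sim = 50_000
b1 = np.array([(XtX_inv @ X.T @ (X @ beta + rng.normal(0.0, sigma, size=n)))[1]
               for _ in range(n_sim)])
z = (b1 - beta[1]) / (sigma * np.sqrt(XtX_inv[1, 1]))

# Empirical quantiles should be close to the standard normal's -1.645, 0, 1.645
print(np.quantile(z, [0.05, 0.5, 0.95]))
```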
Mean
$$
\begin{align*} E \hat{\beta} &= E \left[ I_{1+p} \beta + \left( X^{T} X \right)^{-1} X^{T} \varepsilon \right] \\ &= E \left[ I_{1+p} \beta \right] + E \left[ \left( X^{T} X \right)^{-1} X^{T} \varepsilon \right] \\ &= E \left[ I_{1+p} \beta \right] + \left( X^{T} X \right)^{-1} X^{T} E \varepsilon \\ &= \begin{pmatrix} \beta_{0} \\ \beta_{1} \\ \vdots \\ \beta_{p} \end{pmatrix} + \mathbf{0} \\ &= \begin{pmatrix} \beta_{0} \\ \beta_{1} \\ \vdots \\ \beta_{p} \end{pmatrix} \end{align*}
$$
Consequently, $\hat{\beta}$ is also an unbiased estimator of $\beta$.
Variance
$$
\begin{align*} \operatorname{Var} \hat{\beta} &= \operatorname{Var} \left[ I_{1+p} \beta + \left( X^{T} X \right)^{-1} X^{T} \varepsilon \right] \\ &= \operatorname{Var} \left[ \left( X^{T} X \right)^{-1} X^{T} \varepsilon \right] \\ &= \left( X^{T} X \right)^{-1} X^{T} \left( \operatorname{Var} \varepsilon \right) \left( \left( X^{T} X \right)^{-1} X^{T} \right)^{T} \\ &= \left( X^{T} X \right)^{-1} X^{T} \sigma^{2} I_{n} X \left( X^{T} X \right)^{-1} \\ &= \sigma^{2} \left( X^{T} X \right)^{-1} X^{T} X \left( X^{T} X \right)^{-1} \\ &= \sigma^{2} \left( X^{T} X \right)^{-1} \end{align*}
$$
Meanwhile, since $\hat{\beta}$ is derived through the least squares method, the Gauss–Markov theorem guarantees that no linear unbiased estimator of $\beta$ has a smaller variance, so $\hat{\beta}$ is indeed the best linear unbiased estimator.
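Both moments are easy to verify by Monte Carlo. In the sketch below (again a hypothetical setup; $X$ is held fixed and only $\varepsilon$ is redrawn, matching the fixed-design assumption), the sample mean of the simulated $\hat{\beta}$'s should approach $\beta$ and their sample covariance should approach $\sigma^{2} \left( X^{T} X \right)^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, sigma = 100, 2, 0.5
beta = np.array([1.0, 2.0, -3.0])                           # hypothetical
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # fixed design
XtX_inv = np.linalg.inv(X.T @ X)

# Redraw only the residuals; X stays fixed, as in the derivation
n_sim = 50_000
estimates = np.array([XtX_inv @ X.T @ (X @ beta + rng.normal(0.0, sigma, size=n))
                      for _ in range(n_sim)])

print(estimates.mean(axis=0))            # close to beta: unbiasedness
emp_cov = np.cov(estimates, rowvar=False)
theo_cov = sigma**2 * XtX_inv            # sigma^2 (X^T X)^{-1}
print(np.abs(emp_cov - theo_cov).max())  # small, shrinking as n_sim grows
```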