In multiple regression analysis, suppose the linear model

$$
Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + \varepsilon
$$

is set up for $p$ given independent variables $X_1, \cdots, X_p$. The parameters $\beta_0, \beta_1, \cdots, \beta_p$ are called the regression coefficients, $Y$ is the dependent variable, and $\varepsilon$ is a randomly distributed error.
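Given $n$ data points with $p < n$, the linear multiple regression model can be written with a design matrix as

$$
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix}
1 & x_{11} & \cdots & x_{1p} \\
1 & x_{21} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
1 & x_{n1} & \cdots & x_{np}
\end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}
$$

where $x_{kj}$ is the value of the $j$-th variable in the $k$-th observation. We abbreviate this as $Y = X\beta + \varepsilon$. Provided $X^T X$ is invertible, the least squares estimator vector $\hat{\beta}$ for $\beta$ is

$$
\hat{\beta} = (X^T X)^{-1} X^T Y
$$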
Moreover, $\hat{\beta}$ has the smallest variance among linear unbiased estimators of $\beta$, so it is also called the Best Linear Unbiased Estimator (BLUE).
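Our goal is to minimize the size of the error, measured by the squared norm

$$
\| \varepsilon \|_2^2 = \varepsilon^T \varepsilon
$$

Since $\varepsilon = Y - X\beta$, this amounts to finding the $\beta$ that minimizes $\varepsilon^T \varepsilon = (Y - X\beta)^T (Y - X\beta)$.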
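Differentiating $\varepsilon^T \varepsilon$ with respect to $\beta$ gives

$$
\begin{aligned}
\frac{\partial}{\partial \beta} \varepsilon^T \varepsilon
&= \frac{\partial}{\partial \beta} (Y - X\beta)^T (Y - X\beta) \\
&= -2 X^T (Y - X\beta) \\
&= -2 X^T Y + 2 X^T X \beta
\end{aligned}
$$

The $\hat{\beta}$ that makes this the zero vector $\mathbf{0}$ satisfies $X^T X \hat{\beta} = X^T Y$, so

$$
\hat{\beta} = \operatorname*{argmin}_{\beta} (Y - X\beta)^T (Y - X\beta) = (X^T X)^{-1} X^T Y
$$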
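The derivation can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the simulated data, the seed, and the coefficient values in `beta_true` are hypothetical and chosen only for illustration. It solves $X^T X \hat{\beta} = X^T Y$, confirms that the gradient $-2X^T(Y - X\hat{\beta})$ vanishes at the solution, and compares the result with NumPy's own least squares routine.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, illustrative only
n, p = 50, 3

# Design matrix: a column of ones (intercept) followed by p predictor columns.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
beta_true = np.array([1.0, 2.0, -0.5, 0.3])  # hypothetical coefficients
Y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Solve X^T X beta = X^T Y directly instead of forming an explicit inverse.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Gradient of eps^T eps evaluated at beta_hat; it should be numerically zero.
gradient = -2 * X.T @ (Y - X @ beta_hat)

# Cross-check against NumPy's least squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

print("beta_hat:", beta_hat)
print("gradient is zero:", np.allclose(gradient, 0, atol=1e-8))
print("matches lstsq:", np.allclose(beta_hat, beta_lstsq))
```

Using `np.linalg.solve` rather than `np.linalg.inv` is a design choice for numerical stability; both express the same formula when $X^T X$ is invertible.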
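Meanwhile, it is easy to show that $\hat{\beta}$ is an unbiased estimator of $\beta$. Because it is obtained by the least squares method, no linear unbiased estimator of $\beta$ has smaller variance (the Gauss-Markov theorem, which assumes zero-mean, uncorrelated errors with equal variance), which makes it the best linear unbiased estimator.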
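The unbiasedness claim can be written out in a few lines. This is a sketch under two assumptions that are standard but should be read as assumptions here: $X$ is treated as fixed (non-random), and $E(\varepsilon) = \mathbf{0}$.

$$
\begin{aligned}
E ( \hat{\beta} )
&= E \left[ (X^T X)^{-1} X^T Y \right] \\
&= (X^T X)^{-1} X^T E ( X\beta + \varepsilon ) \\
&= (X^T X)^{-1} X^T X \beta + (X^T X)^{-1} X^T E ( \varepsilon ) \\
&= \beta
\end{aligned}
$$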
If differentiating with respect to $\beta$ in the derivation is unappealing, there is an alternative approach using matrix algebra. For the least squares method in matrix algebra, a least squares solution $\hat{\beta}$ satisfies the normal equation

$$
X^{\ast} X \hat{\beta} = X^{\ast} Y
$$

Since $X \in \mathbb{R}^{n \times (p+1)}$ is a real matrix, $X^{\ast} = X^T$, and consequently $\hat{\beta} = (X^T X)^{-1} X^T Y$.
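The matrix algebra route can also be illustrated numerically. The sketch below assumes NumPy and uses arbitrary simulated data (hypothetical, for illustration only). It checks that the conjugate transpose of a real matrix is its transpose, solves the normal equation, and compares the result with the Moore-Penrose pseudoinverse solution, which is brought in here only as an independent cross-check.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed, illustrative only
n, p = 40, 2

# Real design matrix with an intercept column, and arbitrary responses.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
Y = rng.normal(size=n)

# For a real matrix, the conjugate transpose X* equals the transpose X^T.
X_star = X.conj().T
same_as_transpose = np.array_equal(X_star, X.T)

# Solve the normal equation X* X beta = X* Y.
beta_normal = np.linalg.solve(X_star @ X, X_star @ Y)

# Independent cross-check: the pseudoinverse gives the least squares solution.
beta_pinv = np.linalg.pinv(X) @ Y
agree = np.allclose(beta_normal, beta_pinv)

print("X* equals X^T:", same_as_transpose)
print("normal equation matches pseudoinverse:", agree)
```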
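If $\hat{\beta}$ is the best linear unbiased estimator, then the sum of the observations $y_k$ equals the sum of the fitted values $\hat{y}_k = \hat{\beta}_0 + \sum_{j=1}^{p} \hat{\beta}_j x_{kj}$, where $x_{kj}$ is the value of the $j$-th variable in the $k$-th observation:

$$
\sum_{k=1}^{n} y_k = \sum_{k=1}^{n} \hat{y}_k
$$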
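To prove this, note that $\hat{\beta}$ being the best linear unbiased estimator means that the normal equation

$$
X^T ( Y - X \hat{\beta} ) = \mathbf{0}
$$

holds. Since $X$ is the design matrix, its first column consists entirely of ones, so the first row of $X^T$ is the row vector $\begin{bmatrix} 1 & \cdots & 1 \end{bmatrix}$. Taking only the product of this first row with $Y - X\hat{\beta}$ gives

$$
\begin{bmatrix} 1 & \cdots & 1 \end{bmatrix} ( Y - X \hat{\beta} ) = \sum_{k=1}^{n} ( y_k - \hat{y}_k ) = 0
$$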
Consequently, we obtain

$$
\sum_{k=1}^{n} y_k = \sum_{k=1}^{n} \hat{y}_k
$$
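This identity is easy to confirm numerically. The sketch below assumes NumPy and uses arbitrary simulated responses (hypothetical, for illustration only); the identity depends only on the design matrix containing a column of ones, not on the linear model actually being true for the data.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, illustrative only
n, p = 30, 2

# Design matrix whose first column is all ones (the intercept column).
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
Y = rng.normal(size=n)  # arbitrary responses

# Least squares estimate from the normal equation X^T X beta = X^T Y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
Y_fit = X @ beta_hat

# First row of X^T times the residual vector is the sum of residuals.
residual_sum = X.T[0] @ (Y - Y_fit)

print("sum of y:   ", Y.sum())
print("sum of fits:", Y_fit.sum())
print("sum of residuals is zero:", abs(residual_sum) < 1e-8)
```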