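Suppose a random vector $X = (X_1, \cdots, X_p)$ is given. Consider the linear combinations of the random variables $X_1, \cdots, X_p$
$$a_k^T X = a_{k1} X_1 + \cdots + a_{kp} X_p = \sum_{l=1}^{p} a_{kl} X_l$$
where $a_k = (a_{k1}, \cdots, a_{kp}) \in \mathbb{R}^p$ denotes a unit vector, that is, a vector of length $1$. The goal is to maximize the variance of the first combination $a_1^T X$,
$$\operatorname{Var}(a_1^T X)$$
then, subject to $\operatorname{Cov}(a_1^T X, a_2^T X) = 0$, to maximize the variance of the second combination $a_2^T X$,
$$\operatorname{Var}(a_2^T X)$$
and in general, subject to $\operatorname{Cov}(a_l^T X, a_k^T X) = 0$ for all $l < k$, to maximize the variance of the $k$th combination $a_k^T X$.
$$\operatorname{Var}(a_k^T X)$$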
Finding the vectors $a_1, \cdots, a_p$ that accomplish this, and analyzing the data through them, is known as Principal Component Analysis (PCA).
Principal Components
Assume that the eigenpairs $\{(\lambda_k, e_k)\}_{k=1}^{p}$ of the covariance matrix $\Sigma \in \mathbb{R}^{p \times p}$ of the random vector $X$ are ordered so that $\lambda_1 \ge \cdots \ge \lambda_p \ge 0$, and that the eigenvectors are normalized, $\|e_k\| = 1$. The random variable $Y_k$ defined as the inner product of the random vector $X$ with the $k$th eigenvector $e_k$ is called the $k$th Principal Component.
$$Y_k := e_k^T X$$
The realization $y_k$ of $Y_k$ is called the $k$th PC Score.
The unit vectors that maximize the variances in PCA are given by $a_k = e_k$. For $k = 1, \cdots, p$ and $i \neq j$, the variances and covariances of the principal components are as follows.
$$\begin{aligned} \operatorname{Var}(Y_k) &= \lambda_k \\ \operatorname{Cov}(Y_i, Y_j) &= 0 \end{aligned}$$
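The statement above can be checked numerically. The following is a minimal sketch, not a definitive implementation: the covariance matrix `sigma`, the normal sampling distribution, and the sample size are all assumptions chosen only for illustration, and the sample covariance stands in for $\Sigma$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical covariance matrix, chosen only for illustration.
sigma = np.array([[4.0, 2.0, 0.5],
                  [2.0, 3.0, 1.0],
                  [0.5, 1.0, 2.0]])

# Draw observations of X (one per row); the normal distribution is an assumption.
X = rng.multivariate_normal(mean=np.zeros(3), cov=sigma, size=5000)

# Sample covariance and its eigenpairs. eigh returns eigenvalues in ascending
# order, so reorder them to get lambda_1 >= ... >= lambda_p.
S = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
lam = eigvals[order]
P = eigvecs[:, order]  # columns are e_1, ..., e_p

# PC scores y_k = e_k^T x for every observation.
Y = X @ P

# The covariance of the scores should be diag(lambda_1, ..., lambda_p).
cov_Y = np.cov(Y, rowvar=False)
print(np.round(lam, 3))
print(np.round(cov_Y, 3))
```

In this sketch the diagonal of `cov_Y` matches `lam` and the off-diagonal entries vanish up to floating-point error, because the scores are computed from the eigenvectors of the same sample covariance.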
Proof
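The covariance matrix $\Sigma$ can be written out as follows.
$$\begin{aligned} \Sigma &= \operatorname{Cov}(X) \\ &= \begin{bmatrix} \operatorname{Var}(X_1) & \operatorname{Cov}(X_1, X_2) & \cdots & \operatorname{Cov}(X_1, X_p) \\ \operatorname{Cov}(X_2, X_1) & \operatorname{Var}(X_2) & \cdots & \operatorname{Cov}(X_2, X_p) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{Cov}(X_p, X_1) & \operatorname{Cov}(X_p, X_2) & \cdots & \operatorname{Var}(X_p) \end{bmatrix} \end{aligned}$$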
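Define the orthogonal matrix $P := \begin{bmatrix} e_1 & \cdots & e_p \end{bmatrix} \in \mathbb{R}^{p \times p}$ whose columns are the eigenvectors of $\Sigma$. By the property $\operatorname{Cov}(AX) = A \operatorname{Cov}(X) A^T$ of the covariance matrix, the covariance of the random vector $Y := P^T X$ can be written as follows.
$$\begin{aligned} \operatorname{Cov}(Y) &= \operatorname{Cov}(P^T X) \\ &= P^T \operatorname{Cov}(X) P \end{aligned}$$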
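Expanding the first diagonal component of this $\operatorname{Cov}(Y)$ shows that it equals the first eigenvalue.
$$\begin{aligned} \operatorname{Var}(Y_1) &= e_1^T \operatorname{Var}(X) e_1 \\ &= e_1^T \Sigma e_1 \\ &= \lambda_1 \end{aligned}$$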
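The same computation can be sketched for all components at once. This assumes the spectral decomposition $\Sigma = P \Lambda P^T$, where $\Lambda := \operatorname{diag}(\lambda_1, \cdots, \lambda_p)$ is notation introduced here only for illustration, together with $P^T P = I$:

$$\begin{aligned} \operatorname{Cov}(Y) &= P^T \Sigma P \\ &= P^T \left( P \Lambda P^T \right) P \\ &= \left( P^T P \right) \Lambda \left( P^T P \right) \\ &= \Lambda \end{aligned}$$

Read entrywise, the $k$th diagonal entry is $e_k^T \Sigma e_k = \lambda_k$ and every off-diagonal entry is $0$.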
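Quadratic Forms and Eigenvalues of Positive-Definite Matrices: Assume the eigenpairs $\{(\lambda_k, e_k)\}_{k=1}^{p}$ of a positive-definite matrix $A \in \mathbb{R}^{p \times p}$ are ordered so that $\lambda_1 \ge \cdots \ge \lambda_p \ge 0$. Then the maximum and minimum values of the quadratic form $x^T A x$ on the unit sphere are as follows.
$$\begin{aligned} \max_{\|x\| = 1} x^T A x &= \lambda_1, & \text{attained when } x = e_1 \\ \min_{\|x\| = 1} x^T A x &= \lambda_p, & \text{attained when } x = e_p \end{aligned}$$
Meanwhile, for $k = 2, \cdots, p-1$, the following holds.
$$\max_{\substack{\|x\| = 1 \\ x \perp e_1, \cdots, e_{k-1}}} x^T A x = \lambda_k, \quad \text{attained when } x = e_k$$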
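The bounds in this theorem can be probed numerically. The sketch below is only an illustration under assumptions: the matrix `A` is a hypothetical positive-definite matrix built as $BB^T + I$, and random unit vectors are used to sample the unit sphere, so it tests the inequalities rather than proving them.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical symmetric positive-definite matrix A = B B^T + I.
B = rng.standard_normal((4, 4))
A = B @ B.T + np.eye(4)

eigvals, eigvecs = np.linalg.eigh(A)  # ascending order
lam = eigvals[::-1]                   # lambda_1 >= ... >= lambda_p
E = eigvecs[:, ::-1]                  # columns are e_1, ..., e_p

# Random unit vectors x (one per row) and their quadratic forms x^T A x.
x = rng.standard_normal((10000, 4))
x /= np.linalg.norm(x, axis=1, keepdims=True)
q = np.einsum("ij,jk,ik->i", x, A, x)

# Unit vectors orthogonal to e_1: project e_1 out, then renormalize.
x_perp = x - np.outer(x @ E[:, 0], E[:, 0])
x_perp /= np.linalg.norm(x_perp, axis=1, keepdims=True)
q_perp = np.einsum("ij,jk,ik->i", x_perp, A, x_perp)

tol = 1e-9  # slack for floating-point rounding
print(lam[-1] - tol <= q.min(), q.max() <= lam[0] + tol)
print(q_perp.max() <= lam[1] + tol)
print(np.isclose(E[:, 0] @ A @ E[:, 0], lam[0]),
      np.isclose(E[:, 1] @ A @ E[:, 1], lam[1]))
```

Every sampled value of $x^T A x$ should fall between $\lambda_p$ and $\lambda_1$, the values restricted to $x \perp e_1$ should not exceed $\lambda_2$, and the eigenvectors themselves attain the bounds.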
Consequently, this $\lambda_1$ is the maximum value of the quadratic form $x^T \Sigma x$ under the constraint $\|x\| = 1$. To summarize,
$$\operatorname{Var}(Y_1) = \lambda_1 = \max_{\|x\| = 1} x^T \Sigma x$$
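By the same theorem, the variance $\operatorname{Var}(Y_k)$ of the $k$th principal component equals the $k$th eigenvalue $\lambda_k$, which is the maximum of $x^T \Sigma x$ under the constraints $\|x\| = 1$ and $x \perp e_1, \cdots, e_{k-1}$. In other words,
$$\operatorname{Var}(Y_k) = \lambda_k$$
Under the constraint $x \perp e_1, \cdots, e_{k-1}$ the eigenvectors are orthogonal to each other, so for $l \neq k$ the covariance of $Y_l$ and $Y_k$ is as follows.
$$\begin{aligned} \operatorname{Cov}(Y_l, Y_k) &= e_l^T \Sigma e_k \\ &= e_l^T \lambda_k e_k \\ &= \lambda_k e_l^T e_k \\ &= \lambda_k \cdot 0 \\ &= 0 \end{aligned}$$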
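As a small worked example of the result, take a hypothetical $2 \times 2$ covariance matrix (chosen here only for illustration, not taken from the text above):

$$\Sigma = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad \lambda_1 = 3, \quad e_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad \lambda_2 = 1, \quad e_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}$$

The principal components are $Y_1 = (X_1 + X_2)/\sqrt{2}$ and $Y_2 = (X_1 - X_2)/\sqrt{2}$, and a direct computation gives

$$\begin{aligned} \operatorname{Var}(Y_1) &= \tfrac{1}{2} \left( \operatorname{Var}(X_1) + \operatorname{Var}(X_2) + 2 \operatorname{Cov}(X_1, X_2) \right) = \tfrac{1}{2} (2 + 2 + 2) = 3 = \lambda_1 \\ \operatorname{Var}(Y_2) &= \tfrac{1}{2} \left( \operatorname{Var}(X_1) + \operatorname{Var}(X_2) - 2 \operatorname{Cov}(X_1, X_2) \right) = \tfrac{1}{2} (2 + 2 - 2) = 1 = \lambda_2 \\ \operatorname{Cov}(Y_1, Y_2) &= \tfrac{1}{2} \left( \operatorname{Var}(X_1) - \operatorname{Var}(X_2) \right) = \tfrac{1}{2} (2 - 2) = 0 \end{aligned}$$

which agrees with $\operatorname{Var}(Y_k) = \lambda_k$ and $\operatorname{Cov}(Y_1, Y_2) = 0$.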