Pearson Correlation Coefficient
📂 Mathematical Statistics

Definition

For two random variables $X, Y$, the Pearson Correlation Coefficient $\rho = \rho(X, Y)$ is defined as:
$$
\rho = { \operatorname{Cov} (X,Y) \over \sigma_X \sigma_Y }
$$
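To make the definition concrete, here is a minimal sketch in Python (NumPy, the small height/weight sample, and the helper `pearson_rho` are assumptions for illustration) that computes $\rho$ straight from the formula and compares it with NumPy's built-in `np.corrcoef`:

```python
import numpy as np

def pearson_rho(x: np.ndarray, y: np.ndarray) -> float:
    # Pearson's rho from the definition: Cov(X, Y) / (sigma_X * sigma_Y),
    # using population (1/n) moments throughout.
    cov = ((x - x.mean()) * (y - y.mean())).mean()
    return cov / (x.std() * y.std())  # np.std defaults to the 1/n convention

heights = np.array([160.0, 165.0, 170.0, 175.0, 180.0])
weights = np.array([55.0, 58.0, 66.0, 70.0, 75.0])

print(pearson_rho(heights, weights))        # strong positive correlation, near 1
print(np.corrcoef(heights, weights)[0, 1])  # NumPy's built-in agrees
```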
Explanation

The Pearson Correlation Coefficient measures whether two variables have a (linear) correlation. If $\rho$ is close to $1$ or $-1$, the variables are considered correlated; if it is close to $0$, they are considered uncorrelated.
It is important to note that correlation and independence are not the same concept. Correlation only detects whether the two variables have a linear relationship. Lack of correlation does not necessarily imply independence, but independence does imply zero correlation. The converse holds only when the two variables are jointly normally distributed. The sketch below illustrates the caveat.
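A minimal sketch of this caveat, assuming NumPy and a synthetic sample: $Y = X^2$ is fully determined by $X$, yet the Pearson coefficient is (near) zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# X is symmetric about zero and Y = X^2 is fully determined by X,
# yet Cov(X, Y) = E[X^3] - E[X] E[X^2] = 0, so rho is (near) zero.
x = rng.normal(size=100_000)
y = x ** 2

print(np.corrcoef(x, y)[0, 1])  # close to 0 despite total dependence
```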
Properties

The Pearson correlation coefficient lies within the interval $[-1, 1]$. That is,
$$
-1 \le \rho \le 1
$$
Proof

Two methods of proof are presented.
Proof using the Cauchy-Schwarz inequality

$$
\rho = { \operatorname{Cov} (X,Y) \over \sigma_X \sigma_Y } = {1 \over n} \sum_{k=1}^{n} \left( { x_k - \mu_X \over \sigma_X } \right) \left( { y_k - \mu_Y \over \sigma_Y } \right)
$$
Squaring both sides gives
$$
\rho^2 = {1 \over n^2} \left\{ \sum_{k=1}^{n} \left( { x_k - \mu_X \over \sigma_X } \right) \left( { y_k - \mu_Y \over \sigma_Y } \right) \right\}^2
$$
The Cauchy-Schwarz inequality states:

$$
(a^2 + b^2)(x^2 + y^2) \ge (ax + by)^2
$$

or, in the general form used here, $\left( \sum_{k=1}^{n} a_k b_k \right)^2 \le \sum_{k=1}^{n} {a_k}^2 \sum_{k=1}^{n} {b_k}^2$.
By the Cauchy-Schwarz inequality,

$$
{1 \over n^2} \left\{ \sum_{k=1}^{n} \left( { x_k - \mu_X \over \sigma_X } \right) \left( { y_k - \mu_Y \over \sigma_Y } \right) \right\}^2 \le {1 \over n^2} \sum_{k=1}^{n} \left( { x_k - \mu_X \over \sigma_X } \right)^2 \sum_{k=1}^{n} \left( { y_k - \mu_Y \over \sigma_Y } \right)^2
$$
Rearranging the right side gives
$$
\begin{align*}
& {1 \over n^2} \sum_{k=1}^{n} \left( { x_k - \mu_X \over \sigma_X } \right)^2 \sum_{k=1}^{n} \left( { y_k - \mu_Y \over \sigma_Y } \right)^2
\\ =& {1 \over {\sigma_X}^2 {\sigma_Y}^2} \sum_{k=1}^{n} \left( { x_k - \mu_X \over \sqrt{n} } \right)^2 \sum_{k=1}^{n} \left( { y_k - \mu_Y \over \sqrt{n} } \right)^2
\\ =& {1 \over {\sigma_X}^2 {\sigma_Y}^2} {\sigma_X}^2 {\sigma_Y}^2
\\ =& 1
\end{align*}
$$
Since $\rho^2 \le 1$,
$$
-1 \le \rho \le 1
$$
■
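As a sanity check, here is a minimal sketch, assuming NumPy and a synthetic linearly related sample, that evaluates both sides of the Cauchy-Schwarz step above numerically:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=1_000)
y = 2 * x + rng.normal(size=1_000)  # linearly related plus noise

n = len(x)
u = (x - x.mean()) / x.std()  # standardized terms (x_k - mu_X) / sigma_X
v = (y - y.mean()) / y.std()  # standardized terms (y_k - mu_Y) / sigma_Y

lhs = (u @ v) ** 2 / n**2       # rho^2, the squared sum from the proof
rhs = (u @ u) * (v @ v) / n**2  # product of sums; equals 1 after rearranging

print(f"rho^2 = {lhs:.4f} <= {rhs:.4f} = 1")  # the Cauchy-Schwarz bound in action
```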
Proof using the definition of covariance

Let $\operatorname{Var}(Y) = {\sigma_Y}^2$ and $\operatorname{Var}(X) = {\sigma_X}^2$, and define $\displaystyle Z = \frac{Y}{\sigma_Y} - \rho \frac{X}{\sigma_X}$. Expanding $\operatorname{Var}(Z)$ using the definition of covariance gives
$$
\begin{align*}
\operatorname{Var}(Z) &= \frac{1}{ {\sigma_Y}^2 } \operatorname{Var}(Y) + \frac{\rho^2}{ {\sigma_X}^2 } \operatorname{Var}(X) - 2 \frac{\rho}{\sigma_X \sigma_Y} \operatorname{Cov}(X, Y)
\\ &= \frac{1}{ {\sigma_Y}^2 } {\sigma_Y}^2 + \frac{\rho^2}{ {\sigma_X}^2 } {\sigma_X}^2 - 2 \rho \cdot \rho
\\ &= 1 + \rho^2 - 2\rho^2
\\ &= 1 - \rho^2
\end{align*}
$$
Since $\operatorname{Var}(Z) \ge 0$,
$$
\begin{align*}
1 - \rho^2 \ge 0 \implies& \; \rho^2 - 1 \le 0
\\ \implies& \; (\rho + 1)(\rho - 1) \le 0
\\ \implies& \; -1 \le \rho \le 1
\end{align*}
$$
■
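The identity $\operatorname{Var}(Z) = 1 - \rho^2$ can also be checked numerically. A minimal sketch, again assuming NumPy and a synthetic negatively correlated sample:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(size=1_000_000)
y = -0.5 * x + rng.normal(size=1_000_000)  # negatively correlated pair

rho = np.corrcoef(x, y)[0, 1]
z = y / y.std() - rho * (x / x.std())      # Z = Y/sigma_Y - rho * X/sigma_X

# Var(Z) matches 1 - rho^2 and is nonnegative, exactly as derived above.
print(f"Var(Z)    = {z.var():.6f}")
print(f"1 - rho^2 = {1 - rho**2:.6f}")
```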