
Pearson Correlation Coefficient

Definition 1

For two random variables $X, Y$, the Pearson Correlation Coefficient $\rho = \rho(X, Y)$ is defined as follows:
$$ \rho = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} $$
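As a quick numerical sketch of the definition (assuming NumPy; the simulated data and coefficients below are arbitrary illustrations), $\rho$ can be computed directly from $\operatorname{Cov}(X, Y) / (\sigma_X \sigma_Y)$ and cross-checked against NumPy's built-in `np.corrcoef`:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
x = rng.normal(size=1000)
y = 0.8 * x + rng.normal(scale=0.5, size=1000)  # linearly related to x

# Population (ddof=0) moments throughout, matching the definition.
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))  # Cov(X, Y)
rho = cov_xy / (x.std() * y.std())                 # Cov(X, Y) / (sigma_X * sigma_Y)

# Cross-check against NumPy's built-in correlation matrix.
print(rho, np.corrcoef(x, y)[0, 1])  # the two values agree
```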


Explanation

The Pearson Correlation Coefficient is a measure of whether two variables have a (linear) correlation. If $\rho$ is close to $1$ or $-1$, the variables are considered correlated; if it is close to $0$, they are considered uncorrelated.
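To illustrate the interpretation of these values (a sketch assuming NumPy; the data below is arbitrary), positively and negatively linear data give $\rho$ near $1$ and $-1$, while unrelated data gives $\rho$ near $0$:

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
x = rng.normal(size=10_000)
noise = rng.normal(size=10_000)

print(np.corrcoef(x,  2 * x + 0.1 * noise)[0, 1])  # close to +1: positive linear relation
print(np.corrcoef(x, -2 * x + 0.1 * noise)[0, 1])  # close to -1: negative linear relation
print(np.corrcoef(x, noise)[0, 1])                 # close to  0: no linear relation
```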

It is important to note that correlation and independence are not the same concept. Correlation only measures whether the two variables tend to lie on a straight line. A lack of correlation does not necessarily imply independence, but independence does guarantee a lack of correlation. The converse holds only when the two variables jointly follow a normal distribution.
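A classic example: if $X \sim N(0, 1)$ and $Y = X^2$, then $Y$ is completely determined by $X$, yet $\operatorname{Cov}(X, Y) = E[X^3] = 0$, so $\rho = 0$. A small simulation (assuming NumPy) makes this concrete:

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed
x = rng.normal(size=100_000)
y = x ** 2  # y is fully dependent on x, but not linearly

# Cov(X, Y) = E[X^3] = 0 for symmetric X, so the correlation vanishes
# even though the two variables are completely dependent.
print(np.corrcoef(x, y)[0, 1])  # close to 0
```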

Properties

The Pearson correlation coefficient never leaves the interval $[-1, 1]$. That is,
$$ -1 \le \rho \le 1 $$

Proof

Two methods of proof will be introduced.

Proof using the Cauchy-Schwarz inequality

$$ \rho = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{1}{n} \sum_{k=1}^{n} \left( \frac{x_k - \mu_X}{\sigma_X} \right) \left( \frac{y_k - \mu_Y}{\sigma_Y} \right) $$

Squaring both sides gives

$$ \rho^2 = \frac{1}{n^2} \left\{ \sum_{k=1}^{n} \left( \frac{x_k - \mu_X}{\sigma_X} \right) \left( \frac{y_k - \mu_Y}{\sigma_Y} \right) \right\}^2 $$

Cauchy-Schwarz inequality:
$$ \left( \sum_{k=1}^{n} a_k b_k \right)^2 \le \left( \sum_{k=1}^{n} {a_k}^2 \right) \left( \sum_{k=1}^{n} {b_k}^2 \right) $$

By the Cauchy-Schwarz inequality,

$$ \frac{1}{n^2} \left\{ \sum_{k=1}^{n} \left( \frac{x_k - \mu_X}{\sigma_X} \right) \left( \frac{y_k - \mu_Y}{\sigma_Y} \right) \right\}^2 \le \frac{1}{n^2} \sum_{k=1}^{n} \left( \frac{x_k - \mu_X}{\sigma_X} \right)^2 \sum_{k=1}^{n} \left( \frac{y_k - \mu_Y}{\sigma_Y} \right)^2 $$

Rearranging the right side gives

$$ \begin{align*} & \frac{1}{n^2} \sum_{k=1}^{n} \left( \frac{x_k - \mu_X}{\sigma_X} \right)^2 \sum_{k=1}^{n} \left( \frac{y_k - \mu_Y}{\sigma_Y} \right)^2 \\ =& \frac{1}{ {\sigma_X}^2 {\sigma_Y}^2 } \sum_{k=1}^{n} \left( \frac{x_k - \mu_X}{\sqrt{n}} \right)^2 \sum_{k=1}^{n} \left( \frac{y_k - \mu_Y}{\sqrt{n}} \right)^2 \\ =& \frac{1}{ {\sigma_X}^2 {\sigma_Y}^2 } {\sigma_X}^2 {\sigma_Y}^2 \\ =& 1 \end{align*} $$

Since $\rho^2 \le 1$,

$$ -1 \le \rho \le 1 $$
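The two facts used in this proof can be checked numerically (a sketch assuming NumPy; the data is arbitrary): the squared sum of products of standardized deviations is $\rho^2$, and the Cauchy-Schwarz bound on the right evaluates to exactly $1$.

```python
import numpy as np

rng = np.random.default_rng(3)  # arbitrary seed
x = rng.normal(size=500)
y = 0.5 * x + rng.normal(size=500)
n = len(x)

# Standardized deviations, as in the proof (population std, ddof=0).
u = (x - x.mean()) / x.std()
v = (y - y.mean()) / y.std()

lhs = (u @ v) ** 2 / n ** 2       # rho^2, the squared left-hand side
rhs = (u @ u) * (v @ v) / n ** 2  # the Cauchy-Schwarz upper bound
print(lhs <= rhs, rhs)            # True, and rhs is 1 up to floating-point error
```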

Proof using the definition of covariance

Setting $\operatorname{Var}(Y) = {\sigma_Y}^2$, $\operatorname{Var}(X) = {\sigma_X}^2$ and defining $\displaystyle Z = \frac{Y}{\sigma_Y} - \rho \frac{X}{\sigma_X}$, expanding $\operatorname{Var}(Z)$ by the properties of variance and covariance gives

$$ \begin{align*} \operatorname{Var}(Z) &= \frac{1}{ {\sigma_Y}^2 } \operatorname{Var}(Y) + \frac{\rho^2}{ {\sigma_X}^2 } \operatorname{Var}(X) - 2 \frac{\rho}{ \sigma_X \sigma_Y } \operatorname{Cov}(X, Y) \\ &= \frac{1}{ {\sigma_Y}^2 } {\sigma_Y}^2 + \frac{\rho^2}{ {\sigma_X}^2 } {\sigma_X}^2 - 2 \rho \cdot \rho \\ &= 1 + \rho^2 - 2 \rho^2 \\ &= 1 - \rho^2 \end{align*} $$

where the cross term uses $\operatorname{Cov}(X, Y) = \rho \sigma_X \sigma_Y$, which follows from the definition of $\rho$. Because $\operatorname{Var}(Z) \ge 0$,

$$ \begin{align*} 1 - \rho^2 \ge 0 \implies& \rho^2 - 1 \le 0 \\ \implies& (\rho + 1)(\rho - 1) \le 0 \\ \implies& -1 \le \rho \le 1 \end{align*} $$
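The identity $\operatorname{Var}(Z) = 1 - \rho^2$ can also be verified numerically (a sketch assuming NumPy; the data is arbitrary), which makes the nonnegativity argument tangible:

```python
import numpy as np

rng = np.random.default_rng(4)  # arbitrary seed
x = rng.normal(size=100_000)
y = 0.6 * x + rng.normal(size=100_000)

rho = np.corrcoef(x, y)[0, 1]
z = y / y.std() - rho * x / x.std()  # the Z from the proof

# Var(Z) matches 1 - rho^2 exactly (up to floating point), and since a
# variance can never be negative, this forces -1 <= rho <= 1.
print(z.var(), 1 - rho ** 2)
```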


  1. Hogg et al. (2013). Introduction to Mathematical Statistics (7th Edition): p104. ↩︎