Kurtosis in Mathematical Statistics

Kurtosis

  1. Given a random variable $X$ with mean $\mu$ and variance $\sigma^2$, the kurtosis of $X$ is defined as follows: $$ \gamma_{2} := {{ E \left( X - \mu \right)^4 } \over { \sigma^4 }} $$
  2. For data $\left\{ X_{i} \right\}_{i=1}^{n}$ with sample mean $\overline{X}$ and sample variance $\widehat{\sigma}^2$, the sample kurtosis $g_{2}$ is computed as follows: $$ g_{2} := \sum_{i=1}^{n} {{ \left( X_{i} - \overline{X} \right)^4 } \over { n \widehat{\sigma}^4 }} $$
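
As a minimal sketch of the second definition, the sample kurtosis can be computed directly in base R. The helper name `sample_kurtosis` is hypothetical, and the sketch assumes that $\widehat{\sigma}^2$ is the variance with divisor $n$; with divisor $n-1$ the result is multiplied by $\left( (n-1)/n \right)^2$.

```r
# Hypothetical helper: sample kurtosis g_2 computed from the definition.
# Assumption: the sample variance uses divisor n (not n - 1).
sample_kurtosis <- function(x) {
  n <- length(x)
  d <- x - mean(x)             # deviations from the sample mean
  sigma2_hat <- sum(d^2) / n   # sample variance with divisor n
  sum(d^4) / (n * sigma2_hat^2)
}

x <- c(2, 4, 4, 4, 5, 5, 7, 9)  # toy data, chosen only for illustration
g2 <- sample_kurtosis(x)
g2      # 2.78125
g2 - 3  # -0.21875, slightly below the normal reference value
```

For this toy data the mean is $5$, the divisor-$n$ variance is $4$, and the fourth powers of the deviations sum to $356$, so $g_{2} = 356 / (8 \cdot 16) = 2.78125$.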

The Normal Distribution as a Standard

The normal distribution has a kurtosis of $3$ regardless of its parameters, so it is often used as a reference point. It's common to use the excess kurtosis $\left( \gamma_{2} - 3 \right)$ and its sample counterpart $\left( g_{2} - 3 \right)$, despite the bias of the latter as an estimator, to judge intuitively whether a probability distribution or data set has fatter or thinner tails than a normal distribution.
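
A short derivation shows why the value is $3$ for every choice of parameters. Writing $Z := (X - \mu)/\sigma \sim N(0,1)$, the parameters cancel, and integration by parts (equivalently, $E \left( Z g(Z) \right) = E \left( g'(Z) \right)$ with $g(z) = z^3$) gives: $$ \gamma_{2} = {{ E \left( X - \mu \right)^4 } \over { \sigma^4 }} = E \left( Z^4 \right) = 3 E \left( Z^2 \right) = 3 $$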

Explanation

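Kurtosis is computed from the 4th central moment and is traditionally described as measuring how peaked the distribution of a random variable is. Measured relative to the normal distribution, a positive excess kurtosis $\left( \gamma_{2} - 3 > 0 \right)$ indicates heavier tails and typically a sharper peak, whereas a negative excess kurtosis indicates lighter tails and a flatter shape.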

N.png

The figure above shows the probability density function of the normal distribution $N(0,1)$, with the kurtosis computed from a sample of $1000$ draws shown in the title. The excess kurtosis $\gamma_{2} - 3$ of a normal distribution is $0$, and the value computed from the sample was indeed close to $0$.

cauchy.png

The figure above shows the probability density function of the Cauchy distribution $C(0,1)$, with the kurtosis computed from a sample of $1000$ draws shown in the title. The Cauchy distribution has neither a defined mean nor a defined kurtosis, but the sample kurtosis can still be computed and came out close to $992$. Compared to the density of the normal distribution, the Cauchy density has fatter tails on both ends, which is consistent with such a large value.
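
Because the population kurtosis of the Cauchy distribution does not exist, the sample value has nothing to converge to. The following base R sketch recomputes the statistic on several independent samples. The helper `excess_kurtosis` is hypothetical and uses the divisor-$n$ variance, and the seeds are arbitrary, so the numbers will not match the figure.

```r
# Hypothetical helper: excess sample kurtosis with divisor-n variance.
excess_kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

# Recompute the statistic on independent Cauchy samples of size 1000.
# The seeds 1..5 are arbitrary; they only make the run reproducible.
k <- sapply(1:5, function(s) {
  set.seed(s)
  excess_kurtosis(rcauchy(1000))
})
k  # one value per sample; these typically differ a lot between samples
```

Each value is driven by the few most extreme draws in its sample, which is why a single number such as the one in the figure should not be read as an estimate of anything.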

Kurtosis is a Measure of Tails

The term kurtosis originates from the Greek word κυρτός, meaning curved or arching [1]. The name might seem to refer to the pointedness of a peak, but subsequent research and intuition show that its actual value is more closely related to how fat the tails of the distribution are.
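
One way to see the role of the tails is to check how much of the fourth-moment sum comes from the most extreme observations. The sketch below does this in base R for a standard normal sample. The seed, the sample size, and the 5% cutoff are arbitrary choices made for illustration.

```r
set.seed(1)               # arbitrary seed, only for reproducibility
x <- rnorm(10000)
a <- abs(x - mean(x))     # distance of each point from the sample mean
d4 <- a^4                 # each point's contribution to the 4th-moment sum

cutoff <- quantile(a, 0.95)                      # boundary of the 5% most extreme points
tail_share <- sum(d4[a > cutoff]) / sum(d4)
tail_share  # fraction of the 4th-moment sum contributed by those points
```

Even for normal data, the 5% of points farthest from the mean account for a disproportionately large share of the sum, so the value of the kurtosis is determined mostly by what happens in the tails rather than near the peak.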

Code

Here is the R code used for generating the illustrated figures.

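# Assumption: kurtosis() comes from the e1071 package, whose default returns
# excess kurtosis (about 0 for normal data), matching the values in the text.
library(e1071)

set.seed(150421)

# Normal sample: plot the N(0,1) density with the sample kurtosis in the title
dev.new(width = 6, height = 4)  # portable replacement for the Windows-only win.graph()
x <- rnorm(1000)
plot(dnorm, xlim = c(-3, 3), ylim = c(0, 0.4),
     main = paste0("Kurtosis of N(0,1): ", round(kurtosis(x), 4)))
abline(h = 0)

# Cauchy sample: same plot for the C(0,1) density
dev.new(width = 6, height = 4)
y <- rcauchy(1000)
plot(dcauchy, xlim = c(-3, 3), ylim = c(0, 0.4),
     main = paste0("Kurtosis of C(0,1): ", round(kurtosis(y), 4)))
abline(h = 0)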