Kurtosis in Mathematical Statistics

Kurtosis

  1. Given a random variable $X$ with mean $\mu$ and variance $\sigma^2$, the kurtosis of $X$ is defined as follows: $$ \gamma_{2} := {{ E \left( X - \mu \right)^4 } \over { \sigma^4 }} $$
  2. For data $\left\{ X_{i} \right\}_{i=1}^{n}$, with sample mean $\overline{X}$ and sample variance $\widehat{\sigma}^2$, the sample kurtosis $g_{2}$ is obtained as follows: $$ g_{2} := \sum_{i=1}^{n} {{ \left( X_{i} - \overline{X} \right)^4 } \over { n \widehat{\sigma}^4 }} $$
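The sample formula in item 2 can be sketched directly in R. This is a minimal illustration, not a library implementation: the name `sample_kurtosis` is hypothetical, and it assumes $\widehat{\sigma}^2$ is the $1/n$ sample variance, a convention the definition above does not specify.

```r
# Sample kurtosis g_2, following the definition in item 2.
# Assumption: sigma-hat^2 is the 1/n (not 1/(n-1)) sample variance.
sample_kurtosis <- function(x) {
  n <- length(x)
  d <- x - mean(x)               # deviations from the sample mean
  sigma2_hat <- sum(d^2) / n     # 1/n sample variance (assumed convention)
  sum(d^4) / (n * sigma2_hat^2)  # g_2 = sum (X_i - Xbar)^4 / (n * sigma-hat^4)
}

x <- c(1, 2, 3, 4, 5)
sample_kurtosis(x)      # 1.7
sample_kurtosis(x) - 3  # -1.3
```

For this toy data the deviations are $-2, -1, 0, 1, 2$, so $\widehat{\sigma}^2 = 10/5 = 2$ and $g_{2} = 34 / (5 \cdot 4) = 1.7$.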

The Normal Distribution as a Reference

The normal distribution has a kurtosis of $3$ regardless of its parameters, and this value is often used as a reference. It is common to subtract it and use $\left( \gamma_{2} - 3 \right)$ and $\left( g_{2} - 3 \right)$, known as the excess kurtosis, even though the sample version is a biased estimate. Doing so makes it easy to judge intuitively whether a probability distribution or a data set has fatter or thinner tails than a normal distribution.
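The value $3$ can be checked by hand. The following is a short sketch, where $Z := (X - \mu) / \sigma$ and $\phi$ (the standard normal density) are symbols introduced here only for the derivation. Using $\phi ' (z) = - z \phi (z)$ and integration by parts:

$$
\begin{align*}
E \left( Z^4 \right) =& \int_{-\infty}^{\infty} z^3 \cdot z \phi (z) dz
\\ =& \left[ - z^3 \phi (z) \right]_{-\infty}^{\infty} + 3 \int_{-\infty}^{\infty} z^2 \phi (z) dz
\\ =& 0 + 3 E \left( Z^2 \right)
\\ =& 3
\end{align*}
$$

Since $E \left( X - \mu \right)^4 = \sigma^4 E \left( Z^4 \right)$, it follows that $\gamma_{2} = 3$ for any $\mu$ and $\sigma^2$.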

Explanation

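Kurtosis is calculated from the 4th moment, and it is traditionally said to measure how peaked the distribution of a random variable is. Measured relative to the normal distribution, a positive value of $\gamma_{2} - 3$ is usually described as a more peaked shape with fatter tails, whereas a negative value is described as a flatter shape with thinner tails.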

N.png

The figure above shows the probability density function of the normal distribution $N(0,1)$, together with the kurtosis calculated from $1000$ samples drawn from it. The value reported here is the excess kurtosis, which is $0$ for a normal distribution, and the sample value did come out close to $0$.

cauchy.png

The figure above shows the probability density function of the Cauchy distribution $C(0,1)$, together with the kurtosis calculated from $1000$ samples drawn from it. The Cauchy distribution has neither a defined mean nor a defined kurtosis, but the sample kurtosis was calculated to be close to $992$. Compared to the probability density function of a normal distribution, it has fatter tails on both ends, which agrees with the explanation above.
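Because the population kurtosis of the Cauchy distribution is undefined, its sample kurtosis has no value to settle on as the sample grows. This can be sketched as follows. `excess_kurtosis` is a hypothetical helper built on the $1/n$ moments, not the `kurtosis` call used for the figures, and the seed and sample sizes are arbitrary choices.

```r
# Excess kurtosis from 1/n moments (hypothetical helper for illustration).
excess_kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

set.seed(1)  # arbitrary seed, only for reproducibility
sizes <- c(100, 1000, 10000, 100000)

# One independent sample per size from each distribution.
normal_k <- sapply(sizes, function(n) excess_kurtosis(rnorm(n)))
cauchy_k <- sapply(sizes, function(n) excess_kurtosis(rcauchy(n)))

data.frame(n = sizes, normal = normal_k, cauchy = cauchy_k)
```

The normal column should shrink toward $0$ as $n$ increases, while the Cauchy column typically stays large and erratic, driven by a handful of extreme draws.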

Kurtosis is a Measure of Tails

The term kurtosis originates from the Greek word κυρτός, meaning curved or arching[1]. As translated, the term might seem to refer to the pointedness of a peak, but subsequent research and intuition show that its actual value is more closely related to how fat the tails of the distribution are.
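One way to see the role of the tails is to ask how much of the fourth-moment sum comes from observations far from the mean. The sketch below does this for simulated standard normal data. The cutoff of $2$ standard deviations, the seed, and the variable names are arbitrary choices made for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
x <- rnorm(100000)
z <- (x - mean(x)) / sd(x)  # standardized sample

in_tail <- abs(z) > 2       # observations more than 2 SDs from the mean

share_obs  <- mean(in_tail)                   # fraction of observations in the tails
share_kurt <- sum(z[in_tail]^4) / sum(z^4)    # fraction of sum(z^4) they contribute

c(share_obs = share_obs, share_kurt = share_kurt)
```

Even for the normal distribution, the few percent of observations beyond the cutoff supply roughly half of the fourth-moment sum, so the value of the kurtosis is determined largely by the tails rather than by the shape near the peak.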

Code

Here is the R code used for generating the illustrated figures.

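library(e1071)  # assumed source of kurtosis(); it reports excess kurtosis (normal = 0)
set.seed(150421)

# Standard normal: density curve, with the sample kurtosis of 1000 draws in the title
dev.new(width=6, height=4)  # portable stand-in for the Windows-only win.graph(6,4)
x<-rnorm(1000)
plot(dnorm,xlim=c(-3,3),ylim=c(0,0.4),
     main=paste0("Kurtosis of N(0,1): ",round(kurtosis(x),4)))
abline(h=0)

# Standard Cauchy: same plot, with the sample kurtosis of 1000 Cauchy draws
dev.new(width=6, height=4)
y<-rcauchy(1000)
plot(dcauchy,xlim=c(-3,3),ylim=c(0,0.4),
     main=paste0("Kurtosis of C(0,1): ",round(kurtosis(y),4)))
abline(h=0)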