Skewness in Mathematical Statistics


  1. When a random variable $X$ has mean $\mu$ and variance $\sigma^2$, the quantity $\gamma_{1}$ defined below is called the skewness of $X$. $$ \gamma_{1} := {{ E \left( X - \mu \right)^3 } \over { \sigma^3 }} $$
  2. When the data $\left\{ X_{i} \right\}_{i=1}^{n}$ have sample mean $\overline{X}$ and sample variance $\widehat{\sigma}^2$, the sample skewness $g_{1}$ is computed as follows. $$ g_{1} := \sum_{i=1}^{n} {{ \left( X_{i} - \overline{X} \right)^3 } \over { n \widehat{\sigma}^3 }} $$
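
The sample formula in item 2 can be sketched directly in R. This is a minimal illustration, not a library implementation: the function name `sample_skewness` and the two small data vectors are hypothetical, and it assumes $\widehat{\sigma}^2$ is the variance with divisor $n$, since the definition does not say which divisor is meant. Using $n - 1$ instead changes the value slightly but not its sign.

```r
# Sample skewness g_1: sum of cubed deviations divided by n * sigma_hat^3.
# Assumption: sigma_hat is based on the 1/n variance (not stated in the text).
sample_skewness <- function(x) {
  n <- length(x)
  xbar <- mean(x)
  sigma_hat <- sqrt(sum((x - xbar)^2) / n)
  sum((x - xbar)^3) / (n * sigma_hat^3)
}

right_tailed <- c(1, 2, 2, 3, 3, 3, 10)  # hypothetical data with one large value
left_tailed <- -right_tailed              # mirror image of the same data

sample_skewness(right_tailed)  # positive: long tail on the right
sample_skewness(left_tailed)   # same magnitude, negative sign
sample_skewness(c(-1, 0, 1))   # zero for a perfectly symmetric sample
```

Mirroring the data flips the sign of every cubed deviation while leaving $\widehat{\sigma}$ unchanged, which is why the two results differ only in sign.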


Skewness is the standardized third central moment and measures how asymmetric the distribution of a random variable is. A positive value means the distribution has a longer tail on the right, with extreme values lying far above the mean, and a negative value means it has a longer tail on the left.


The normal distribution has a skewness of $0$, and indeed the sample skewness of $1000$ draws comes out close to $0$. The computed value happened to be slightly negative, and the histogram likewise shows the extreme values sitting on the left.
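
As a short supporting step, here is why the theoretical value is exactly $0$. The argument assumes only that the density $f$ is symmetric about $\mu$, that is $f(\mu + t) = f(\mu - t)$, and that the third moment is finite; both hold for the normal distribution. Substituting $t = x - \mu$ gives
$$ E \left( X - \mu \right)^3 = \int_{-\infty}^{\infty} \left( x - \mu \right)^3 f(x) dx = \int_{-\infty}^{\infty} t^3 f \left( \mu + t \right) dt = 0 $$
because the integrand $t^3 f \left( \mu + t \right)$ is an odd function of $t$. Hence $\gamma_{1} = 0$ for any such symmetric distribution, and a nonzero sample value from normal data is only sampling noise.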


The histogram above is drawn from samples of the Poisson distribution $\text{Pois}(5)$. The positive value reflects that the extreme values indeed lie on the right side.
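
For comparison, the theoretical skewness of a Poisson distribution can be worked out from the definition. This uses the standard facts that $X \sim \text{Pois}(\lambda)$ has mean $\lambda$, variance $\lambda$, and third central moment $E \left( X - \lambda \right)^3 = \lambda$, none of which is derived here.
$$ \gamma_{1} = {{ E \left( X - \lambda \right)^3 } \over { \sigma^3 }} = {{ \lambda } \over { \lambda^{3/2} }} = {{ 1 } \over { \sqrt{\lambda} }} $$
For $\lambda = 5$ this gives $\gamma_{1} = 1 / \sqrt{5} \approx 0.447$, so a sample skewness somewhere near that positive value is what one would expect from $\text{Pois}(5)$ data.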

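library(e1071) # assumption: skewness() comes from e1071; the source does not name the package
x <- rnorm(1000); y <- rpois(1000, 5) # 1000 draws from N(0,1); the Pois(5) sample size is an assumption
hist(x, main = paste0("Skewness of N(0,1): ", round(skewness(x), 4)))
hist(y, main = paste0("Skewness of Pois(5): ", round(skewness(y), 4)))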