Skewness in Mathematical Statistics
Definition
- When a random variable $X$ has mean $\mu$ and variance $\sigma^2$, the quantity $\gamma_{1}$ defined below is called the skewness of $X$. $$ \gamma_{1} := {{ E \left( X - \mu \right)^3 } \over { \sigma^3 }} $$
- When the sample mean of data $\left\{ X_{i} \right\}_{i=1}^{n}$ is $\overline{X}$ and the sample variance is $\widehat{\sigma}^2$, the sample skewness $g_{1}$ is calculated as follows. $$ g_{1} := \sum_{i=1}^{n} {{ \left( X_{i} - \overline{X} \right)^3 } \over { n \widehat{\sigma}^3 }} $$
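The sample skewness formula above can be sketched directly. The following is a minimal Python illustration (the article's own code is R); the function name `sample_skewness` is ours, and $\widehat{\sigma}^2$ is taken as the biased ($1/n$) sample variance, matching the $n$ in the denominator.

```python
import math

def sample_skewness(xs):
    """g1 = sum_i (x_i - xbar)^3 / (n * sigma_hat^3),
    where sigma_hat^2 is the 1/n (biased) sample variance."""
    n = len(xs)
    xbar = sum(xs) / n
    var_hat = sum((x - xbar) ** 2 for x in xs) / n  # biased variance
    sigma_hat = math.sqrt(var_hat)
    return sum((x - xbar) ** 3 for x in xs) / (n * sigma_hat ** 3)

# A perfectly symmetric sample: the cubed deviations cancel out
print(sample_skewness([1, 2, 3, 4, 5]))  # → 0.0
```

Note that some statistical packages rescale $g_{1}$ with a bias-correction factor, so library values can differ slightly from this plain formula on small samples.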
Explanation
Skewness is calculated from the third moment and serves as a measure of how asymmetric the distribution of a random variable is. Positive skewness means the distribution has a long tail of large values stretching to the right; negative skewness means the long tail stretches to the left.
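The sign behavior can be seen on small hand-made samples. The sketch below (Python, though the article's code is R; the helper `g1` and the two example lists are ours) shows that a sample with a long right tail gets a positive value and its mirror image a negative one.

```python
import math

def g1(xs):
    # plain sample skewness: third central moment over sigma_hat^3
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    return sum((x - m) ** 3 for x in xs) / (n * s ** 3)

right_tailed = [1, 1, 2, 2, 3, 10]        # one large value: long right tail
left_tailed = [-x for x in right_tailed]  # mirror image: long left tail

print(g1(right_tailed) > 0)  # → True (positive skewness)
print(g1(left_tailed) < 0)   # → True (negative skewness)
```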
The normal distribution has a skewness of $0$, and indeed, the sample skewness of $1000$ draws comes out close to $0$. Although the computed value happens to be slightly negative, the histogram is consistent with this: the few extreme values sit on the left.
The histogram above is drawn from $1000$ samples of the Poisson distribution $\text{Pois}(5)$. The fact that the computed value is positive reflects that the extreme values lie on the right side.
# skewness() is not in base R; it is provided by, e.g., the moments
# package (e1071 offers an equivalent function).
library(moments)

set.seed(150421)

# Standard normal sample: skewness should be near 0
win.graph(6, 4)  # Windows-only graphics device
x <- rnorm(1000)
hist(x, main = paste0("Skewness of N(0,1): ", round(skewness(x), 4)))

# Poisson(5) sample: right-skewed, so skewness should be positive
win.graph(6, 4)
y <- rpois(1000, lambda = 5)
hist(y, main = paste0("Skewness of Pois(5): ", round(skewness(y), 4)))