Jarque-Bera Test
Hypothesis Testing
Suppose we are given quantitative data $\left\{ x_{i} \right\}_{i = 1}^{n}$.
- $H_{0}$: Data $\left\{ x_{i} \right\}_{i = 1}^{n}$ follows a normal distribution.
- $H_{1}$: Data $\left\{ x_{i} \right\}_{i = 1}^{n}$ does not follow a normal distribution.
Explanation
The Jarque-Bera test is a hypothesis test for normality, typically used when the analyst wants to show that the data are normal. This is one of the rare cases where failing to reject the null hypothesis is the outcome the analyst hopes for, so it is important to keep the hypotheses straight.
Unlike the Shapiro-Wilk test, it is based on skewness and kurtosis. A normal distribution has population skewness $0$ and population excess kurtosis $0$ (that is, kurtosis $3$), and under $H_{0}$ the test statistic $JB$, built from the sample skewness $g_{1}$ and the sample excess kurtosis $g_{2}$, asymptotically follows a chi-squared distribution with $2$ degrees of freedom.
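For reference, one common moment-based definition of these sample quantities is written out here. It is an assumption of this note rather than something stated above, and some texts use bias-corrected variants instead.
$$
m_{k} := {{1} \over {n}} \sum_{i=1}^{n} \left( x_{i} - \overline{x} \right)^{k} , \qquad g_{1} := {{m_{3}} \over {m_{2}^{3/2}}} , \qquad g_{2} := {{m_{4}} \over {m_{2}^{2}}} - 3
$$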
$$
JB := {{n g_{1}^2} \over {6}} + {{n g_{2}^2} \over {24}} \sim \chi^{2} (2)
$$
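The formula can be checked by hand with a few lines of base R. The following is only a minimal sketch: `jb_by_hand` is a hypothetical helper name, and it assumes the moment-based estimators that divide by $n$ for $g_{1}$ and $g_{2}$.

```r
# Minimal sketch: compute the JB statistic by hand using base R only.
# jb_by_hand is a hypothetical helper, not part of any package.
jb_by_hand <- function(x) {
  n  <- length(x)
  d  <- x - mean(x)
  m2 <- mean(d^2)                 # central moments, dividing by n
  m3 <- mean(d^3)
  m4 <- mean(d^4)
  g1 <- m3 / m2^(3 / 2)           # sample skewness
  g2 <- m4 / m2^2 - 3             # sample excess kurtosis
  jb <- n * g1^2 / 6 + n * g2^2 / 24
  # Upper-tail probability of a chi-squared distribution with 2 df
  p  <- pchisq(jb, df = 2, lower.tail = FALSE)
  c(g1 = g1, g2 = g2, JB = jb, p.value = p)
}

set.seed(1)                       # arbitrary seed, for reproducibility only
res <- jb_by_hand(rnorm(100))
print(res)
```

At a significance level of $0.05$, the null hypothesis is rejected when $JB$ exceeds `qchisq(0.95, df = 2)`, which is about $5.99$.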
Since both are normality tests, it does not matter much which one you use. However, the Jarque-Bera test relies on skewness, which is sensitive to outliers, so compared with the Shapiro-Wilk test its conclusion more often changes to normality once outliers are removed. This alone is not a decisive reason, but it is typically used to check normality in time series analysis rather than in regression analysis. In R, the jarque.bera.test() function in the tseries package performs the Jarque-Bera test.
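The sensitivity to outliers can be illustrated with a small experiment. This is only a sketch: `jb_stat` is a hypothetical helper, the outlier value `8` is arbitrary, and a single simulated sample shows the effect on $JB$ without establishing anything about how the two tests compare in general.

```r
# Sketch: how one extreme value changes the JB statistic (base R only).
# jb_stat is a hypothetical helper that divides by n in the moments.
jb_stat <- function(x) {
  n  <- length(x)
  d  <- x - mean(x)
  m2 <- mean(d^2)
  g1 <- mean(d^3) / m2^(3 / 2)    # sample skewness
  g2 <- mean(d^4) / m2^2 - 3      # sample excess kurtosis
  n * g1^2 / 6 + n * g2^2 / 24
}

set.seed(1)                       # arbitrary seed
clean <- rnorm(100)               # sample without outliers
dirty <- c(clean, 8)              # same sample plus one arbitrary outlier

jb_clean <- jb_stat(clean)
jb_dirty <- jb_stat(dirty)
cat("JB without outlier:", jb_clean, "\n")
cat("JB with outlier:   ", jb_dirty, "\n")

# Shapiro-Wilk p-values on the same data, for comparison
cat("Shapiro-Wilk p (clean):", shapiro.test(clean)$p.value, "\n")
cat("Shapiro-Wilk p (dirty):", shapiro.test(dirty)$p.value, "\n")
```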
Code
Exercise
Generate the following two random samples and run the Jarque-Bera test on each. N is data from a normal distribution, and geo is data from a geometric distribution.
The test results come out exactly as expected: normality is not rejected for N, and it is rejected for geo.
Complete Code
Below is the complete example code in R.
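library(tseries)   # provides jarque.bera.test()
set.seed(150421)   # fix the seed for reproducibility
# Sample of size 100 from the standard normal distribution
N <- rnorm(100)
dev.new(width = 4, height = 4); hist(N)   # portable replacement for the Windows-only win.graph(4,4)
jarque.bera.test(N)
# Sample of size 100 from the geometric distribution with success probability 0.5
geo <- rgeom(100, 0.5)
dev.new(width = 4, height = 4); hist(geo)
jarque.bera.test(geo)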
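If the decision needs to be used programmatically rather than read off the printed output, the fields of the returned object can be accessed directly. The sketch below assumes the tseries package is installed and uses a significance level `alpha` of $0.05$, which is an assumption of this example rather than something fixed by the test.

```r
library(tseries)

set.seed(150421)
N   <- rnorm(100)
geo <- rgeom(100, 0.5)

alpha <- 0.05                     # assumed significance level

res_N   <- jarque.bera.test(N)    # returns an object of class "htest"
res_geo <- jarque.bera.test(geo)

# The statistic and p-value are stored as list elements
cat("N:   JB =", res_N$statistic,   " p =", res_N$p.value,   "\n")
cat("geo: JB =", res_geo$statistic, " p =", res_geo$p.value, "\n")

# TRUE means normality is rejected at level alpha
reject_N   <- res_N$p.value   < alpha
reject_geo <- res_geo$p.value < alpha
print(c(N = reject_N, geo = reject_geo))
```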