Rejection Region and Significance Level

Definition ¹

The error of rejecting the null hypothesis when it is actually true is called a Type I error.
The error of failing to reject the null hypothesis when the alternative hypothesis is true is called a Type II error.
The maximum probability of committing a Type I error that a test allows is called the Significance Level, usually denoted $\alpha$.
The statistic used for hypothesis testing is called the test statistic.
The region of the test statistic’s observed values that leads to the rejection of the null hypothesis is called the Rejection Region.
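To make the link between the significance level and the Type I error concrete, here is a minimal simulation sketch in R. It assumes a two-sided one-sample t-test on normal data for which the null hypothesis really is true; the seed, the sample size, the number of replicates, and all variable names are illustrative choices rather than anything fixed by the definitions above.

```r
# Monte Carlo sketch: estimate the Type I error rate of a two-sided t-test.
set.seed(1)            # arbitrary seed, for reproducibility only
alpha <- 0.05          # chosen significance level (assumption)
n     <- 15            # sample size per simulated experiment (assumption)
crit  <- qt(1 - alpha / 2, df = n - 1)   # two-sided critical value

# Each replicate draws data for which H0 (true mean = 0) actually holds.
reject <- replicate(10000, {
  x      <- rnorm(n)
  t_stat <- mean(x) / (sd(x) / sqrt(n))  # one-sample t statistic
  abs(t_stat) > crit                     # TRUE if it lands in the rejection region
})

# Share of false rejections; it should come out close to alpha.
type1_rate <- mean(reject)
type1_rate
```

Every rejection counted here is a Type I error, because the data were generated under the null hypothesis, so the observed rate should hover around the chosen level.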

Description

No matter how much data is accumulated or how sophisticated the mathematical techniques applied to it, none of it matters if it can’t be used. Here, ‘to use’ means to compute statistics from some data and make a ‘claim’ based on those statistics. Naturally, those statistics must be reliable, and hypothesis testing is the answer to the question of who judges them and by what standard.

Example

[Image: 20180404_143936.png]

Let’s take the data above to be the average midterm scores of classes 1 through 15 of Grade 3 in the Science Department of Gangbuk High School. At a glance, the average of class 15 is exceptionally high, and standardization makes this even clearer. However, while it is easy to rank classes or to say whether a score is above or below the overall average, it is hard to judge ‘how much’ better or worse a class is. A small difference in averages isn’t enough to say conclusively who is better; it is like comparing the heights of acorns, which are all about the same. There must be some point beyond which one can say ‘it’s a different level,’ but where that line lies is vague.

Suppose the Z-scores follow a t-distribution with $14$ degrees of freedom.

If we draw the probability density function of $t_{14}$ together with the Z-scores of the class means, it looks like the image above. The mean of the Z-scores is $0$, and a Z-score close to $0$ means that the original value is ‘not far from the mean.’ Conversely, a value whose Z-score is far from $0$, whether high or low, can hardly be called similar to the average.

The area colored in yellow, both sides combined, is $\color{red}{0.05}$, meaning that the probability of a value falling within that range is $\color{red}{0.05}$. A value that lands there is theoretically a rare case with a probability of $\color{red}{5 \%}$, and the disparity is too large to be attributed merely to chance. So if a class average sits that far above the mean, perhaps it is not by chance but because of superior skill.
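The cut-off points that bound the yellow area can be computed from the quantile function of $t_{14}$. The following is a minimal sketch, assuming a two-sided region with total area $0.05$ as described above; the variable names are illustrative.

```r
# Boundaries of the two-sided rejection region for t with 14 degrees of freedom.
alpha <- 0.05   # total area of the two yellow tails
df    <- 14     # degrees of freedom used in the post

lower <- qt(alpha / 2, df)        # left cut-off  (2.5% quantile)
upper <- qt(1 - alpha / 2, df)    # right cut-off (97.5% quantile)

# Sanity check: the two tail probabilities should add up to alpha.
tail_area <- pt(lower, df) + pt(upper, df, lower.tail = FALSE)

c(lower = lower, upper = upper, area = tail_area)
```

The right cut-off comes out at roughly $2.14$, so under this assumption a Z-score beyond about $\pm 2.14$ lies in the yellow area.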

Returning to our example, it would be abnormal for the entire class 15 to have merely ‘happened’ to score well on the test. If the null hypothesis $H_{0}$ is ‘The average of class 15 does not significantly differ from the overall Grade 3 average’, we can reject this null hypothesis. Because landing in the yellow-colored area leads to rejecting the null hypothesis, that area is called the Rejection Region. Furthermore, the size of the area, fixed when the region is chosen, expresses ‘how extreme a result must be to be meaningful,’ and is hence called the Significance Level. In short, hypothesis testing backs the statement ‘It’s hard to attribute this to mere coincidence’ with statistics. The judgment comes down to whether the test statistic falls within the rejection region, and the criterion that fixes this region is the significance level.

Understanding the rejection region and significance level as concepts is more important than memorizing their exact definitions. If you skip over them because they seem obvious or unnecessary, you may find you have forgotten these fundamentals just when you need them most.
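The decision rule itself can be sketched in a few lines of R. The data are regenerated with the same seed and formula as the R code of this post; the names `alpha`, `crit`, and `in_rejection` are illustrative, and which classes get flagged depends on the simulated scores, so no particular outcome is claimed here.

```r
# Regenerate the class averages exactly as in the post's R code.
set.seed(150421)
avg <- signif(6 * rnorm(15) + 60, 3)
names(avg) <- paste0('(', 1:15, ')')

# Standardize: subtract the mean, divide by the sample standard deviation.
Z <- scale(avg)[, 1]

# Two-sided critical value at the 5% significance level, t with 14 df.
alpha <- 0.05
crit  <- qt(1 - alpha / 2, df = length(avg) - 1)

# A class lands in the rejection region when |Z| exceeds the critical value.
in_rejection <- abs(Z) > crit
names(avg)[in_rejection]
```

Choosing a smaller `alpha` pushes `crit` outward and shrinks the rejection region, which is the sense in which the significance level sets the criterion for the judgment.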

R Code

Below is the R code used in this post.

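# Simulate the average midterm scores of 15 classes and standardize them
set.seed(150421)
avg <- signif(6*rnorm(15)+60,3); names(avg) <- paste0('(',(1:15),')'); avg
Z <- scale(avg)[,1]; Z

# Plot 1: density of t(14) with the class Z-scores marked along the axis
dev.new()  # portable replacement for the Windows-only win.graph()
plot(0,0,type='n',xlim=c(-4,4),ylim=c(-0.08,0.4),xlab='Z-score',ylab='t',main='Midterm results')

abline(h=0)
lines(seq(-5,5,0.01),dt(seq(-5,5,0.01),df=14))

points(x=Z,y=rep(0,15),pch=16)
text(x=Z,y=-0.05,labels=paste0('(',(1:15),')'))
arrows(Z,-0.04,Z,-0.005,length=0.1)
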
# Plot 2: density of t(14) with the two-sided 5% rejection region shaded
dev.new()  # portable replacement for the Windows-only win.graph()
plot(0,0,type='n',xlim=c(-4,4),ylim=c(-0.08,0.4),xlab='Z-score',ylab='t',main='Midterm results')

# Upper tail: right of the 97.5% quantile
polygon(c(seq(qt(0.975,14),5,0.01),qt(0.975,14)),
        c(dt(seq(qt(0.975,14),5,0.01),df=14),0),
        col='yellow',lty=0)
# Lower tail: left of the 2.5% quantile
polygon(c(seq(-5,qt(0.025,14),0.01),qt(0.025,14)),
        c(dt(seq(-5,qt(0.025,14),0.01),df=14),0),
        col='yellow',lty=0)
abline(h=0)
lines(seq(-5,5,0.01),dt(seq(-5,5,0.01),df=14))

# Plot 3: rejection region together with the class Z-scores
dev.new()
plot(0,0,type='n',xlim=c(-4,4),ylim=c(-0.08,0.4),xlab='Z-score',ylab='t',main='Midterm results')

polygon(c(seq(qt(0.975,14),5,0.01),qt(0.975,14)),
        c(dt(seq(qt(0.975,14),5,0.01),df=14),0),
        col='yellow',lty=0)
polygon(c(seq(-5,qt(0.025,14),0.01),qt(0.025,14)),
        c(dt(seq(-5,qt(0.025,14),0.01),df=14),0),
        col='yellow',lty=0)
abline(h=0)
lines(seq(-5,5,0.01),dt(seq(-5,5,0.01),df=14))

points(x=Z,y=rep(0,15),pch=16)
text(x=Z,y=-0.05,labels=paste0('(',(1:15),')'))
arrows(Z,-0.04,Z,-0.005,length=0.1)