

Easy Definition of P-Value or Significance Probability

Definition 1

In hypothesis testing, the probability of obtaining a test statistic at least as extreme as the one actually observed, under the assumption that the null hypothesis is true, is called the p-value (significance probability).

Explanation

If the p-value is lower than the significance level, the null hypothesis is rejected. A small p-value can be understood as saying that 'under the null hypothesis, the observed evidence against it is too strong to be attributed to chance.'
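
For instance, the decision rule can be sketched in a few lines of R. This is a minimal sketch, assuming a one-sample t-test with made-up data and the conventional significance level $\alpha = 0.05$:

x <- c(5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 5.4)  # hypothetical sample
alpha <- 0.05
p <- t.test(x, mu = 5)$p.value  # H0: the population mean is 5
p <= alpha                      # TRUE means H0 is rejected at level alpha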

Once terms like power curve or rejection region enter the picture, studying becomes difficult, and even with diligent study it is easy to confuse the null and alternative hypotheses. It is therefore more efficient to grasp only the core ideas; the finer details are not that important on the long and arduous path of statistics.

In fact, it is normal not to develop a good sense of p-values even after thorough explanation and study in the first year or two of an undergraduate program. Once you become an upperclassman and start analyzing data, you will become proficient with p-values whether you like it or not. So don't be too discouraged if you don't understand them or feel confused. p-values are not something you learn so much as something you become familiar with.

The p-value is a probability.

That is, $0 \le p \le 1$. Those who miss this fact usually have not grasped the definition of the p-value and mistake it for some kind of coefficient, as in other scientific fields. Since hypothesis testing is not absolute, the p-value was introduced to state, in probabilistic terms, how reliable a conclusion is.
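
Every p-value is a tail probability computed from the distribution of a test statistic, so it necessarily lies in $[0, 1]$. A minimal sketch in R, assuming a two-sided z-test with an arbitrary observed statistic:

z <- 2.17                # hypothetical observed z statistic
p <- 2 * pnorm(-abs(z))  # probability of a value at least this extreme, in both tails
p                        # about 0.03 -- always between 0 and 1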

A lower p-value does not mean a stronger rejection

This is an important fact. Given a significance level $\alpha$, the null hypothesis is rejected whenever $p \le \alpha$, and that is all.

One of the most common misconceptions about p-values is the belief that a smaller p-value means a stronger rejection of the null hypothesis. A smaller p-value does mean that the null hypothesis can be rejected at a lower significance level, but the 'extent' of this doesn't matter. Whether it's $0.001$, $10^{-8}$, or just some very small positive $\varepsilon > 0$, as long as it's smaller than the significance level, the rejection is the same.
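
This is easy to check directly in R: against the same significance level, the decision is identical however small the p-value is.

alpha <- 0.05
c(0.001 <= alpha, 1e-8 <= alpha)  # TRUE TRUE -- both reject H0, neither "more strongly"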

There’s no good or bad in rejecting the null hypothesis

That is, there is nothing inherently good or bad about a p-value being high or low. This is what students who cram at the last minute instead of studying regularly are most curious about, but there is no answer. Good or bad is judged from the outcome alone, and the person handling the data neither needs to nor should make such judgments.

For example, in a t-test or z-test, the test statistic is less likely the farther it falls into the tails, so a larger absolute value leads to rejection. In the Hosmer-Lemeshow goodness-of-fit test for logistic regression, the test statistic follows a chi-squared distribution; the closer it is to $0$, the better the fit, so a larger value leads to rejection.
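
The direction of 'unlikely' shows up directly in how the p-value is computed from each statistic. A sketch, with both statistics made up for illustration: a t statistic with 14 degrees of freedom uses both tails, while a chi-squared statistic with 8 degrees of freedom uses only the upper tail.

t_stat <- 2.3
2 * pt(-abs(t_stat), df = 14)                   # both tails are unlikely

chisq_stat <- 17.5
pchisq(chisq_stat, df = 8, lower.tail = FALSE)  # only large values are unlikely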

If you got a sense of what was being rejected in the two examples above, without the null hypothesis being stated explicitly, you can consider that you have understood p-values. At least in hypothesis testing, the null and alternative hypotheses are set up in this way according to the distribution of the test statistic. Since the rejection region is placed where the test statistic is least likely (which, for a fixed significance level, makes the region as wide as possible), this explanation in terms of 'unlikely events' makes sense. If you imagine how the rejection region is determined at a given significance level, you will understand.

Figure: three shaded regions of equal area ($5\%$) under a $t$-distribution with 14 degrees of freedom, drawn by the R code below.

Although the three shaded regions above all have the same area under the density function, their lengths along the $x$-axis differ significantly. If asked to choose one of the three as the rejection region for a two-tailed test, anyone with common sense would choose the first.


Code

Below is the R code to draw the above picture.

# Open a 9 x 3 inch plotting window (Windows only; elsewhere use dev.new(width=9, height=3))
win.graph(9, 3)
par(mfrow = c(1, 3))  # three panels side by side

# Panel 1: two-tailed 5% rejection region of t(14) -- 2.5% in each tail
plot(0, 0, type = 'n', xlim = c(-4, 4), ylim = c(-0.08, 0.4), xlab = NA, ylab = NA)
polygon(c(seq(qt(0.975, 14), 5, 0.01), qt(0.975, 14)),
        c(dt(seq(qt(0.975, 14), 5, 0.01), df = 14), 0),
        col = 'yellow', lty = 0)
polygon(c(seq(-5, qt(0.025, 14), 0.01), qt(0.025, 14)),
        c(dt(seq(-5, qt(0.025, 14), 0.01), df = 14), 0),
        col = 'yellow', lty = 0)
abline(h = 0)
lines(seq(-5, 5, 0.01), dt(seq(-5, 5, 0.01), df = 14))  # density curve

# Panel 2: one-tailed 5% rejection region in the upper tail
plot(0, 0, type = 'n', xlim = c(-4, 4), ylim = c(-0.08, 0.4), xlab = NA, ylab = NA)
polygon(c(seq(qt(0.95, 14), 5, 0.01), qt(0.95, 14)),
        c(dt(seq(qt(0.95, 14), 5, 0.01), df = 14), 0),
        col = 'yellow', lty = 0)
abline(h = 0)
lines(seq(-5, 5, 0.01), dt(seq(-5, 5, 0.01), df = 14))

# Panel 3: a region of the same 5% area taken from the middle of the
# distribution (50th to 55th percentile), where the density is highest
plot(0, 0, type = 'n', xlim = c(-4, 4), ylim = c(-0.08, 0.4), xlab = NA, ylab = NA)
x <- seq(qt(0.50, 14), qt(0.55, 14), 0.01)
y <- dt(x, df = 14)
x <- c(x[1], x, x[length(x)])  # close the polygon along the baseline
y <- c(0, y, 0)
polygon(x, y, col = 'yellow', lty = 0)
abline(h = 0)
lines(seq(-5, 5, 0.01), dt(seq(-5, 5, 0.01), df = 14))

  1. Department of Statistics, Kyungpook National University. (2008). 엑셀을 이용한 통계학 [Statistics Using Excel]: p. 203. ↩︎