
Hypothesis Testing Through Bayesian Factors

Buildup

To use classical hypothesis testing, one needs a mathematical understanding of concepts such as the rejection region and the p-value, as well as enough statistical intuition to interpret them. It is no surprise that many students, even at the college level, spend hours being taught and still fail to properly understand hypothesis testing. It is similar to how many students learn statistics in high school and find the problem-solving easy, yet never grasp its true meaning.

Hypothesis Testing¹

On the other hand, Bayesian statistics allows for very easy hypothesis testing through something called the Bayes factor.

Suppose the null hypothesis and alternative hypothesis are given as $H_{0}$ vs $H_{1}$.

  1. $\pi_{0}, \pi_{1}$ are called the prior probabilities of the null hypothesis and the alternative hypothesis, respectively.
  2. $\alpha_{0}, \alpha_{1}$ are called the posterior probabilities of the null hypothesis and the alternative hypothesis, respectively.
  3. $\displaystyle B_{01} := {{ \alpha_{0} / \alpha_{1} } \over { \pi_{0} / \pi_{1} }} = {{ \alpha_{0} / \pi_{0} } \over { \alpha_{1} / \pi_{1} }}$ is called the Bayes factor supporting $H_{0}$.

Looking closely at the Bayes factor $\displaystyle B_{01} = {{ \displaystyle {{ \alpha_{0} } \over { \cdot }} } \over { \displaystyle {{ \cdot } \over { \pi_{1} }} }}$, the symbols $\alpha_{1}$ and $\pi_{0}$ can freely occupy either $\cdot$ position. Therefore, there is no need to memorize the formula in a complicated way; just remember that $\alpha_{0}$ sits at the very top and $\pi_{1}$ at the very bottom.

In Bayesian analysis, hypothesis testing is simple: if $B_{01}$ is greater than $1$, the data support the null hypothesis; if smaller, they support the alternative hypothesis. In particular, writing $\displaystyle B_{01} = {{ \alpha_{0} / \pi_{0} } \over { \alpha_{1} / \pi_{1} }} = {{ \text{null} } \over { \text{alternative} }}$ makes it much simpler to understand. In simple terms, if the probability of the null hypothesis comes out higher when actually computed from the data, the data support the null hypothesis. There is no need to think about rejection regions or p-values.

If $B_{01} = 3$, it means the posterior information supports $H_{0}$ three times as strongly as it supports $H_{1}$.
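The definition above translates directly into code. The following is a minimal sketch; the function name and the example probabilities are illustrative, not from the source:

```python
def bayes_factor(prior_h0, prior_h1, post_h0, post_h1):
    """Bayes factor B01 supporting H0: posterior odds divided by prior odds."""
    return (post_h0 / post_h1) / (prior_h0 / prior_h1)

# With equal priors, B01 reduces to the posterior odds:
# posterior 0.75 vs 0.25 gives B01 = 3, i.e. H0 is supported 3 times as strongly.
print(bayes_factor(0.5, 0.5, 0.75, 0.25))  # → 3.0
```

Note that with equal priors ($\pi_{0} = \pi_{1}$), the prior odds are $1$ and the Bayes factor is simply the posterior odds $\alpha_{0} / \alpha_{1}$.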

Jeffreys’ Interpretation

Regarding the degree to which the null hypothesis is supported, Jeffreys proposed the following interpretation. From the perspective of supporting $H_{0}$, the Bayes factor is read as follows:

  • $1 \le B_{01} \le 3$: Weak evidence
  • $3 < B_{01} \le 12$: Positive evidence
  • $12 < B_{01} \le 150$: Strong evidence
  • $150 < B_{01}$: Very strong evidence
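The scale above is easy to encode. A small sketch (the function name and category strings are mine, following the list above; values below $1$ are taken to support $H_{1}$ as described earlier):

```python
def interpret_bayes_factor(b01):
    """Map a Bayes factor B01 (supporting H0) to Jeffreys-style evidence categories."""
    if b01 < 1:
        return "supports H1"
    if b01 <= 3:
        return "weak evidence"
    if b01 <= 12:
        return "positive evidence"
    if b01 <= 150:
        return "strong evidence"
    return "very strong evidence"

print(interpret_bayes_factor(1.29))  # → weak evidence
```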

The advantage of this interpretation is that it is much more flexible than the stark dichotomy of frequentist hypothesis testing, where everything hinges on whether the p-value exceeds the significance level. If you use regression analysis often, you may have set the significance level to $\alpha = 0.05$ only to obtain a p-value of $p = 0.069925$, forcing you to discard the regression coefficient. Honestly, since analysts are human, such an experience can only be frustrating. One then searches for workarounds in every possible way, but most turn out to be futile.

In contrast, Bayesian hypothesis testing simply accepts the data as they are, whether sufficient or not.

Example

Let $Y \sim B(10, \theta)$, and conduct a Bayesian test of $\displaystyle H_{0} : \theta = {{1} \over {2}}$ vs $\displaystyle H_{1} : \theta \ne {{1} \over {2}}$. The prior probabilities of $H_{0}$ and $H_{1}$ are equal, under $H_{1}$ we have $\theta \sim \text{Beta}(1,1)$, and the observation is $Y = 7$. Calculate the Bayes factor $B_{01}$.

Solution

$$
\begin{align*}
B_{01} =& {{ \alpha_{0} / \pi_{0} } \over { \alpha_{1} / \pi_{1} }} = {{ p ( y \mid \theta_{0} ) } \over { \int_{\Theta_{1}} p ( y \mid \theta ) g ( \theta ) d \theta }} = {{ p \left( Y = 7 \mid \theta = {{1} \over {2}} \right) } \over { \int_{\Theta_{1}} p ( y \mid \theta ) d \theta }}
\\ =& {{ \binom{10}{7} \left( {{1} \over {2}} \right)^{7} \left( 1 - {{1} \over {2}} \right)^{3} } \over { \int_{0}^{1} \binom{10}{7} \theta^{7} \left( 1 - \theta \right)^{3} d \theta }} = {{1} \over {2^{10}}} {{1} \over { \int_{0}^{1} \theta^{8-1} (1 - \theta)^{4-1} d \theta }} = {{1} \over {2^{10}}} {{ \Gamma ( 8 + 4 ) } \over { \Gamma ( 8 ) \Gamma ( 4 ) }}
\\ =& {{1} \over {2^{10}}} {{ 11! } \over { 7! \cdot 3! }} = {{1} \over {2^{10}}} {{ 8 \cdot 9 \cdot 10 \cdot 11 } \over { 2 \cdot 3 }} = {{ 2^{4} \cdot 3^{2} \cdot 5 \cdot 11 } \over { 2^{11} \cdot 3 }} = {{ 165 } \over { 2^{7} }} = 1.2890625
\end{align*}
$$
Therefore, $B_{01} = 1.2890625$ constitutes weak evidence supporting the null hypothesis.
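The arithmetic in this solution can be checked numerically with Python's standard library. A sketch under the same setup (the variable names are mine); the denominator uses the closed form $\int_{0}^{1} \theta^{7}(1-\theta)^{3} d\theta = \mathrm{B}(8,4) = 7!\,3!/11!$ rather than numerical integration:

```python
from math import comb, factorial

# Numerator: p(Y = 7 | theta = 1/2), a Binomial(10, 1/2) pmf value
num = comb(10, 7) * (0.5 ** 7) * (0.5 ** 3)

# Denominator: ∫ C(10,7) θ^7 (1-θ)^3 dθ = C(10,7) · B(8, 4),
# where the Beta function B(8,4) = Γ(8)Γ(4)/Γ(12) = 7!·3!/11!
beta_8_4 = factorial(7) * factorial(3) / factorial(11)
den = comb(10, 7) * beta_8_4

b01 = num / den
print(b01)  # → 1.2890625
```

The binomial coefficient cancels between numerator and denominator, which is why the hand calculation reduces to $\frac{1}{2^{10}} \cdot \frac{11!}{7!\,3!} = \frac{165}{128}$.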


  1. 김달호. (2013). R과 WinBUGS를 이용한 베이지안 통계학 (Bayesian Statistics Using R and WinBUGS): pp. 159–161. ↩︎