Pearson Chi-Square test statistic

Definition 1

Consider a multinomial experiment with $k$ categories, where the $j$-th category occurs with probability $p_{j} > 0$, and suppose we obtain categorical data from $n$ independent trials. The number of observations falling in the $j$-th category, $O_{j}$, is called the observed cell count, and its expected value under the null hypothesis, $E_{j}$, is called the expected cell count. The test statistic
$$
\mathcal{X}^{2} := \sum_{j=1}^{k} {{ \left( O_{j} - E_{j} \right)^{2} } \over { E_{j} }}
$$
is referred to as the Pearson chi-square test statistic.
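As a minimal sketch of this definition, the statistic can be computed directly from the observed cell counts and the null probabilities. The function name `pearson_chi_square` and the die-roll counts are made up for illustration, and the expected cell count is taken to be $E_{j} = n p_{j}$.

```python
def pearson_chi_square(observed, probs):
    """Return the Pearson chi-square statistic for observed cell counts.

    observed: list of observed cell counts O_j
    probs:    list of null-hypothesis probabilities p_j (each > 0)
    """
    if len(observed) != len(probs):
        raise ValueError("observed and probs must have the same length")
    n = sum(observed)
    # Expected cell counts under the null hypothesis: E_j = n * p_j
    expected = [n * p for p in probs]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, tested against a fair die.
observed = [8, 12, 9, 11, 10, 10]
probs = [1 / 6] * 6
statistic = pearson_chi_square(observed, probs)
print(statistic)
```

Here every expected cell count is 10, so only the four cells that differ from 10 contribute to the sum.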

Explanation

Hypothesis Testing

$\mathcal{X}^{2}$ is one of the first test statistics that freshmen encounter, and it often strikes fear and awe in those who are familiar only with the normal or binomial distribution. For newcomers, the chi-square distribution is hard to understand without some intuition about data and statistical analysis, so a simplified explanation based on the formula is given here.

  1. In most cases, a large $\mathcal{X}^{2}$ indicates a discrepancy between the actual data and the theoretical expectation. Looking at the numerator of the formula, $\left( O_{j} - E_{j} \right)^{2} \ge 0$ is minimized exactly when $O_{j} = E_{j}$, i.e., when the observed data precisely match the theoretical probability $p_{j}$. The greater the discrepancy between these values, the larger the numerator becomes.
  2. Consequently, as the data deviate further from the null hypothesis $H_{0}$, the value of $\mathcal{X}^{2}$ increases, so the test is right-tailed: the null hypothesis is rejected when $\mathcal{X}^{2}$ exceeds the critical value $\chi^{2}_{1-\alpha}$.
  3. In simple terms, a large $\mathcal{X}^{2}$ signifies “something is very wrong.” The chi-square distribution is used to judge whether the deviation is large enough to be significant.
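The three points above can be sketched as a right-tailed decision rule. The counts are hypothetical, and the critical value 11.0705 is the 0.95 quantile of the chi-square distribution with $6 - 1 = 5$ degrees of freedom, hard-coded from a standard table rather than computed.

```python
# Hypothetical counts from 120 rolls of a die suspected of favoring six.
observed = [15, 16, 18, 17, 14, 40]
n = sum(observed)

# Null hypothesis H0: the die is fair, so every expected cell count is n / 6.
expected = [n / 6] * 6

# Each term grows with the squared gap between observed and expected counts.
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# 0.95 quantile of chi-square(5), taken from a table (significance level 0.05).
CRITICAL_VALUE = 11.0705

# Right-tailed test: reject H0 only when the statistic is large.
reject_null = statistic > CRITICAL_VALUE
print(statistic, reject_null)
```

Most of the statistic comes from the last cell, where the observed count is far from the expected count of 20; that single large deviation is enough to push the statistic past the critical value.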

The Pearson chi-square test statistic for categorical data is typically used in goodness-of-fit tests and in tests of homogeneity or independence.

Theoretical Basis

If you’re reading further, you’re likely beyond the freshman level.

Student’s theorem tells us that the suitably scaled sum of squared residuals of normally distributed data follows a chi-square distribution. However, even for undergraduates familiar with mathematical statistics, the structure of $\mathcal{X}^{2}$ might seem quite awkward. At first glance it looks plausible, but because nothing assumes that the deviations are normally distributed, it can look like a merely empirical statistic. Of course, statistics does not work in such a haphazard way: Pearson’s theorem, which has a rigorous proof, guarantees the chi-square behavior of $\mathcal{X}^{2}$.

Pearson’s Theorem: Given a sample size $n \in \mathbb{N}$ and $k \in \mathbb{N}$ categories, let the random vector $\left( N_{1} , \cdots , N_{k} \right)$ follow the multinomial distribution $M_{k} \left( n ; \mathbf{p} \right)$. Then, as $n \to \infty$, the following statistic $S$ converges in distribution to the chi-square distribution $\chi^{2} \left( k - 1 \right)$.
$$
S = \sum_{j=1}^{k} {{ \left( N_{j} - n p_{j} \right)^{2} } \over { n p_{j} }} \overset{D}{\to} \chi^{2} \left( k-1 \right)
$$

The multinomial experiment introduced in the definition assumes that our data follow a multinomial distribution. According to Pearson’s theorem, if the sample is sufficiently large, $\mathcal{X}^{2}$ approximately follows a chi-square distribution whose degrees of freedom $(k-1)$ are obtained by subtracting $1$ from the number of categories $k$. Although the proof of Pearson’s theorem is not straightforward, undergraduates can use $\mathcal{X}^{2}$ effectively even without the full theoretical background. Those who decide to pursue graduate studies, however, are encouraged to take the time to understand and prove it on their own.
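Pearson’s theorem can also be checked empirically. The following is a rough simulation sketch, not a proof: the probabilities, sample size, replication count, and seed are arbitrary choices, and the critical value 7.8147 is the tabulated 0.95 quantile of $\chi^{2}(3)$. If the approximation is good, the average of $S$ should be close to $k - 1 = 3$, and roughly 5% of the simulated values should exceed the critical value.

```python
import random


def simulate_statistic(rng, n, probs):
    """Draw one multinomial sample of size n and return Pearson's S."""
    k = len(probs)
    counts = [0] * k
    # n independent trials, each landing in category j with probability p_j
    for category in rng.choices(range(k), weights=probs, k=n):
        counts[category] += 1
    return sum(
        (counts[j] - n * probs[j]) ** 2 / (n * probs[j]) for j in range(k)
    )


rng = random.Random(0)        # fixed seed so the run is reproducible
probs = [0.1, 0.2, 0.3, 0.4]  # k = 4 categories, so the limit is chi-square(3)
n = 500                       # trials per sample
replications = 2000           # number of simulated samples

values = [simulate_statistic(rng, n, probs) for _ in range(replications)]

mean_s = sum(values) / replications
CRITICAL_VALUE = 7.8147       # 0.95 quantile of chi-square(3), from a table
tail_fraction = sum(v > CRITICAL_VALUE for v in values) / replications
print(mean_s, tail_fraction)
```

The mean alone is a weak check, since the expectation of $S$ equals $k - 1$ for every $n$; the tail fraction is the part that reflects how well the chi-square approximation holds.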


  1. Mendenhall. (2012). Introduction to Probability and Statistics (13th Edition): p596.