Kruskal-Wallis H Test
Hypothesis testing [1]
In an experimental design with $k$ treatments, suppose $n_{j}$ observations are obtained from the $j$-th treatment, for a total of $n = n_{1} + \cdots + n_{k}$ observations. Assume that the samples from the $j = 1 , \cdots , k$-th treatments are independent random samples from populations belonging to the same location family, and denote the median of the $j$-th population by $\theta_{j}$. The following hypothesis test about $\theta_{1} , \cdots , \theta_{k}$ is called the Kruskal–Wallis $H$ test.
- $H_{0}$: $\theta_{1} = \cdots = \theta_{k}$
- $H_{1}$: At least one $\theta_{j}$ is different from the others.
Test statistic
The test statistic is based on the rank-sum $R_{j}$ of the sample from the $j$-th population and is given by $$ H = {\frac{ 12 }{ n \left( n + 1 \right) }} \sum_{j=1}^{k} {\frac{ R_{j}^{2} }{ n_{j} }} - 3 \left( n + 1 \right) $$ This test statistic asymptotically follows a chi-square distribution $\chi^{2} \left( k - 1 \right)$ with $k-1$ degrees of freedom when each $n_{j}$ is sufficiently large.
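For concreteness, here is a minimal Python sketch with made-up, tie-free data from three treatments; it computes $H$ directly from the rank sums and, because there are no ties, agrees with `scipy.stats.kruskal`.

```python
import numpy as np
from scipy import stats

# Hypothetical, tie-free measurements from k = 3 treatments (made-up numbers)
samples = [
    np.array([2.9, 3.0, 2.5, 2.6, 3.2]),  # treatment 1
    np.array([3.8, 2.7, 4.0, 2.4]),       # treatment 2
    np.array([2.8, 3.4, 3.7, 2.2, 2.0]),  # treatment 3
]

n_j = np.array([len(s) for s in samples])
n = n_j.sum()

# Rank all n observations together (1, ..., n), then split back into treatments
all_ranks = stats.rankdata(np.concatenate(samples))
groups = np.split(all_ranks, np.cumsum(n_j)[:-1])
R_j = np.array([g.sum() for g in groups])  # rank sums per treatment

# H = 12 / (n(n+1)) * sum_j R_j^2 / n_j  -  3(n+1)
H = 12.0 / (n * (n + 1)) * np.sum(R_j**2 / n_j) - 3 * (n + 1)

H_scipy, _ = stats.kruskal(*samples)
print(H, H_scipy)  # the two values agree because there are no ties
```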
Explanation
The Kruskal–Wallis test is the nonparametric counterpart of one-way analysis of variance (ANOVA) among parametric methods. It generalizes the Wilcoxon rank-sum test in that it compares $k$ populations simultaneously rather than just two. For a significance level $\alpha$, compare the test statistic $H$ with the lower bound of the rejection region $\chi^{2}_{1-\alpha} (k-1)$; if $H > \chi^{2}_{1-\alpha} (k-1)$ holds, reject the null hypothesis and conclude that at least one population differs from the others.
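Continuing in code, the decision rule can be sketched with `scipy.stats.chi2`; the value of `H` below is only a placeholder for a statistic computed as above.

```python
from scipy import stats

k, alpha = 3, 0.05
H = 1.22  # placeholder: plug in the H computed from your own data

critical = stats.chi2.ppf(1 - alpha, df=k - 1)  # lower bound of the rejection region
p_value = stats.chi2.sf(H, df=k - 1)            # asymptotic p-value

# Reject H0 iff H exceeds the critical value (equivalently, p_value < alpha)
print(f"H = {H:.3f}, critical value = {critical:.3f}, p = {p_value:.3f}")
print("reject H0" if H > critical else "fail to reject H0")
```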
At first glance the formula for the test statistic looks needlessly messy, and understanding its derivation requires some background. A rigorous derivation is given below, but the intuition is this: consider the mean rank $\overline{R}_{j}$ of each sample and the statistic $$ \sum_{j=1}^{k} \left( \overline{R}_{j} - {\frac{ n + 1 }{ 2 }} \right)^{2} $$ If all populations come from the same distribution, no $\overline{R}_{j}$ should deviate much from the overall mean rank $(n+1)/2$, so this sum should be small; when some population is shifted, it becomes large. Connecting this idea to the chi-square distribution is the core of the Kruskal–Wallis test.
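This intuition is easy to check numerically: the simulation below (with arbitrarily chosen group sizes) draws every group from one common distribution, so the null hypothesis holds, and the resulting values of $H$ track $\chi^{2}(k-1)$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_j = [8, 10, 12]  # arbitrary group sizes, k = 3
k = len(n_j)

h_values = []
for _ in range(10_000):
    # under H0 every group is drawn from the same continuous distribution
    samples = [rng.normal(size=m) for m in n_j]
    h, _ = stats.kruskal(*samples)
    h_values.append(h)

h_values = np.array(h_values)
print(h_values.mean())              # close to k - 1 = 2, the mean of chi^2(2)
print(np.quantile(h_values, 0.95))  # close to stats.chi2.ppf(0.95, k - 1) ≈ 5.99
```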
Derivation
For the record, I searched the available sources but could not find a mathematically clean derivation, so I proved almost every step myself; as far as I know, there is no simpler or friendlier write-up out there. I hope it is helpful to readers of this note.
Without loss of generality, assume there are no ties among the ranks.
Part 1. Definition of $H$
Denote the mean of the ranks computed from the sample of the $j = 1 , \cdots , k$-th population by $\overline{R}_{j}$, and denote the overall mean rank by $\overline{R}$. If $n_{j}$ is not too small, $\overline{R}_{j}$ is approximately normally distributed by the central limit theorem.
Central limit theorem: If $\left\{ X_{k} \right\}_{k=1}^{n}$ are iid random variables with mean $\mu$ and variance $\sigma^{2}$, then as $n \to \infty$, $$ \sqrt{n} {{ \overline{X}_n - \mu } \over {\sigma}} \overset{D}{\to} N (0,1) $$
$$ \begin{align*} \overline{R}_{j} :=& {\frac{ R_{j} }{ n_{j} }} \\ \overline{R} :=& {\frac{ n + 1 }{ 2 }} \end{align*} $$
Weighted sum of squared deviations when the population mean is known: In an experimental design with $k$ treatments and $n_{j}$ observations from the $j$-th treatment, for a total of $n = n_{1} + \cdots + n_{k}$ observations, suppose the samples from the $j = 1 , \cdots , k$-th treatments are independent random samples following normal distributions $N \left( \mu_{j} , \sigma_{j}^{2} \right)$, and assume all population variances are equal: $\sigma^{2} = \sigma_{1}^{2} = \cdots = \sigma_{k}^{2}$. Then the following weighted sum of squared deviations has a chi-square distribution with $(k-1)$ degrees of freedom. $$ \sum_{j=1}^{k} \frac{ \left( \bar{x}_{j} - \bar{x} \right)^{2} }{ \sigma^{2} / n_{j} } \sim \chi^{2} \left( k - 1 \right) $$ This holds even if the samples themselves are not normal, provided $\left( \bar{x}_{j} - \bar{x} \right)$ is normally distributed.
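As a sanity check on this result (reading $\bar{x}$ as the pooled sample mean of all $n$ observations), the following simulation uses arbitrary group sizes and an arbitrary common normal distribution and confirms that the weighted sum behaves like $\chi^{2}(k-1)$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_j = np.array([5, 7, 9, 6])  # arbitrary group sizes, k = 4
k, sigma = len(n_j), 2.0
n = n_j.sum()

values = []
for _ in range(20_000):
    # all groups share the same normal distribution (equal means and variances)
    xbar_j = np.array([rng.normal(10.0, sigma, size=m).mean() for m in n_j])
    xbar = np.sum(n_j * xbar_j) / n  # pooled (grand) sample mean
    values.append(np.sum(n_j * (xbar_j - xbar) ** 2) / sigma**2)

values = np.array(values)
print(np.mean(values))                                         # close to k - 1 = 3
print(np.quantile(values, 0.95), stats.chi2.ppf(0.95, k - 1))  # both ≈ 7.81
```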
If the null hypothesis is true, applying this result to the mean ranks therefore gives a statistic with a chi-square distribution: $$ \sum_{j=1}^{k} {\frac{ \left( \overline{R}_{j} - \overline{R} \right)^{2} }{ \sigma^{2} / n_{j} }} \sim \chi^{2} \left( k - 1 \right) $$ In practice, however, we use a corrected $H$ obtained by multiplying by the correction factor $(n-1)/n$: $$ H = {\frac{ n-1 }{ n }} \cdot \sum_{j=1}^{k} {\frac{ \left( \overline{R}_{j} - \overline{R} \right)^{2} }{ \sigma^{2} / n_{j} }} $$ The justification for this correction is given in Part 3, so there is no need for concern.
Part 2. Expansion of $H$
Mean and variance of ranks: Let $X_{1} , \cdots , X_{n}$ be $n$ iid continuous random variables, and denote the rank of each sample by $R \left( X_{1} \right) , \cdots , R \left( X_{n} \right)$. The distribution of each rank is the discrete uniform distribution $U (1, n)$, and the expectation and variance of $R$ are $$ \begin{align*} E \left( R \right) =& {\frac{ n + 1 }{ 2 }} \\ \Var \left( R \right) =& {\frac{ n^{2} - 1 }{ 12 }} \end{align*} $$
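Since, in the absence of ties, the $n$ ranks are exactly the integers $1 , \cdots , n$, both moments can be verified in a couple of lines:

```python
import numpy as np

n = 10
ranks = np.arange(1, n + 1)  # with no ties, every sample of size n uses exactly these ranks

print(ranks.mean(), (n + 1) / 2)     # E(R)   = (n+1)/2    = 5.5
print(ranks.var(), (n**2 - 1) / 12)  # Var(R) = (n^2-1)/12 = 8.25
```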
Since $\overline{R} = (n+1)/2$ and $R_{j} = n_{j} \overline{R}_{j}$, multiply the definition of $H$ by $n \sigma^{2} / (n-1)$ and expand, using $n_{j} \overline{R}_{j}^{2} = R_{j} \overline{R}_{j} = R_{j}^{2} / n_{j}$, $\sum_{j} R_{j} = n(n+1)/2$, and $\sum_{j} n_{j} = n$: $$ \begin{align*} & {\frac{ n \sigma^{2} }{ n - 1 }} H \\ =& \sum_{j=1}^{k} n_{j} \left( \overline{R}_{j} - \overline{R} \right)^{2} \\ =& \sum_{j=1}^{k} \left[ R_{j} \overline{R}_{j} - 2 R_{j} \overline{R} + n_{j} \overline{R}^{2} \right] \\ =& \sum_{j=1}^{k} {\frac{ R_{j}^{2} }{ n_{j} }} - 2 \overline{R} \sum_{j=1}^{k} R_{j} + \overline{R}^{2} \sum_{j=1}^{k} n_{j} \\ =& \sum_{j=1}^{k} {\frac{ R_{j}^{2} }{ n_{j} }} - 2 \overline{R} {\frac{ n(n+1) }{ 2 }} + \overline{R}^{2} n \\ =& \sum_{j=1}^{k} {\frac{ R_{j}^{2} }{ n_{j} }} - n \overline{R} \left[ (n+1) - \overline{R} \right] \\ =& \sum_{j=1}^{k} {\frac{ R_{j}^{2} }{ n_{j} }} - n \overline{R} \left[ 2 \overline{R} - \overline{R} \right] \\ =& \sum_{j=1}^{k} {\frac{ R_{j}^{2} }{ n_{j} }} - n \overline{R}^{2} \\ =& \sum_{j=1}^{k} {\frac{ R_{j}^{2} }{ n_{j} }} - n {\frac{ (n+1)^{2} }{ 4 }} \end{align*} $$ Because $\sigma^{2} = (n^{2}-1)/12$, solving for $H$ yields $$ \begin{align*} {\frac{ n \sigma^{2} }{ n - 1 }} H =& \sum_{j=1}^{k} {\frac{ R_{j}^{2} }{ n_{j} }} - n {\frac{ (n+1)^{2} }{ 4 }} \\ \implies {\frac{ n }{ n - 1 }} {\frac{ (n-1)(n+1) }{ 12 }} H =& \sum_{j=1}^{k} {\frac{ R_{j}^{2} }{ n_{j} }} - n {\frac{ (n+1)^{2} }{ 4 }} \\ \implies H =& {\frac{ 12 }{ n (n+1) }} \sum_{j=1}^{k} {\frac{ R_{j}^{2} }{ n_{j} }} - 3 (n+1) \end{align*} $$
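A quick numerical check with a randomly generated, tie-free ranking (hypothetical data) confirms that the corrected statistic from Part 1 and the closed form just derived coincide:

```python
import numpy as np

rng = np.random.default_rng(2)
n_j = np.array([6, 8, 5])  # arbitrary group sizes
n = n_j.sum()

ranks = rng.permutation(np.arange(1, n + 1))   # a tie-free ranking of all n observations
groups = np.split(ranks, np.cumsum(n_j)[:-1])
R_j = np.array([g.sum() for g in groups])      # rank sums
Rbar_j = R_j / n_j                             # mean ranks
Rbar = (n + 1) / 2
sigma2 = (n**2 - 1) / 12

# corrected weighted sum of squared deviations (Part 1)
H_weighted = (n - 1) / n * np.sum(n_j * (Rbar_j - Rbar) ** 2) / sigma2
# closed form just derived (Part 2)
H_closed = 12 / (n * (n + 1)) * np.sum(R_j**2 / n_j) - 3 * (n + 1)

print(H_weighted, H_closed)  # identical up to floating-point error
```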
Part 3. Justification of the correction
The statistic $\sum n_{j} \left( \overline{R}_{j} - \overline{R} \right)^{2} / \sigma^{2}$ is the one we would like to treat as chi-square, but its actual expectation is slightly inflated relative to the theory, so it is multiplied by $(n-1)/n$ to obtain the corrected $H$. This correction is not chosen arbitrarily: as computed below, the uncorrected statistic has expectation $\frac{n}{n-1}(k-1)$, so multiplying by $(n-1)/n$ makes the expectation of $H$ exactly $(k-1)$, matching the mean of $\chi^{2}(k-1)$ in the spirit of the method of moments; and for sufficiently large $n$ the factor $(n-1)/n \approx 1$, so nothing is lost asymptotically.
Finite population correction factor: Consider a finite population of $N$ units $X_{1} , \cdots , X_{N}$ with population variance $\sigma^{2}$. If a sample of $n \le N$ units is drawn without replacement, the variance of the sample mean is not $\sigma^{2} / n$ but $$ \Var \left( \overline{X}_{n} \right) = {\frac{ \sigma^{2} }{ n }} \cdot {\frac{ N - n }{ N - 1 }} $$ The factor $\text{FPC} = \left( N - n \right) / \left( N - 1 \right)$ multiplying the with-replacement variance $\sigma^{2} / n$ is called the finite population correction factor.
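For a small population this can be verified exhaustively; the sizes $N$ and $n$ below are arbitrary, and every without-replacement sample is enumerated with `itertools.combinations`.

```python
import numpy as np
from itertools import combinations

N, n = 8, 3                        # arbitrary small population and sample sizes
population = np.arange(1, N + 1)   # e.g. the ranks 1, ..., N
sigma2 = population.var()          # population variance, here (N^2 - 1)/12 = 5.25

# exact sampling distribution: every without-replacement sample is equally likely
means = [np.mean(c) for c in combinations(population, n)]
empirical = np.var(means)
formula = sigma2 / n * (N - n) / (N - 1)

print(empirical, formula)          # both equal 1.25
```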
The expectation of $\sum n_{j} \left( \overline{R}_{j} - \overline{R} \right)^{2}$ can now be computed. Two facts are used: $E \left( \overline{R}_{j} \right) = (n+1)/2 = \overline{R}$, since every rank has mean $(n+1)/2$; and the $n_{j}$ ranks of the $j$-th sample are drawn without replacement from the $N = n$ ranks $1 , \cdots , n$, so the finite population correction gives $\Var \left( \overline{R}_{j} \right) = \frac{\sigma^{2}}{n_{j}} \cdot \frac{n - n_{j}}{n - 1}$. Then $$ \begin{align*} & E \left( \sum_{j=1}^{k} n_{j} \left( \overline{R}_{j} - \overline{R} \right)^{2} \right) \\ =& E \left( \sum_{j=1}^{k} n_{j} \left( \overline{R}_{j}^{2} - 2 \overline{R}_{j} \overline{R} + \overline{R}^{2} \right) \right) \\ =& \sum_{j=1}^{k} n_{j} \left[ E \left( \overline{R}_{j}^{2} \right) - 2 \overline{R} E \left( \overline{R}_{j} \right) + \overline{R}^{2} \right] \\ =& \sum_{j=1}^{k} n_{j} \left[ E \left( \overline{R}_{j}^{2} \right) - E \left( \overline{R}_{j} \right)^{2} \right] \\ =& \sum_{j=1}^{k} n_{j} \Var \left( \overline{R}_{j} \right) \\ =& \sum_{j=1}^{k} n_{j} {\frac{ \sigma^{2} }{ n_{j} }} {\frac{ n - n_{j} }{ n - 1 }} \\ =& \sum_{j=1}^{k} {\frac{ n^{2} - 1 }{ 12 }} {\frac{ n - n_{j} }{ n - 1 }} \\ =& {\frac{ n + 1 }{ 12 }} \sum_{j=1}^{k} \left( n - n_{j} \right) \\ =& {\frac{ n + 1 }{ 12 }} \left( n k - n \right) \end{align*} $$ Dividing by $\sigma^{2} = (n^{2}-1)/12$, the expectation of $\sum n_{j} \left( \overline{R}_{j} - \overline{R} \right)^{2} / \sigma^{2}$ is $$ {\frac{ 1 }{ \sigma^{2} }} E \left( \sum n_{j} \left( \overline{R}_{j} - \overline{R} \right)^{2} \right) = {\frac{ 12 }{ n^{2} - 1 }} \cdot {\frac{ n + 1 }{ 12 }} \cdot n \left( k - 1 \right) = {\frac{ n }{ n - 1 }} \left( k - 1 \right) $$
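This expectation can also be checked by brute force: under the null hypothesis the ranks are a uniformly random permutation of $1 , \cdots , n$, so averaging the statistic over many random permutations should reproduce $\frac{n+1}{12} (nk - n)$. The group sizes below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n_j = np.array([4, 5, 6])  # arbitrary group sizes
k, n = len(n_j), n_j.sum()
Rbar = (n + 1) / 2

values = []
for _ in range(100_000):
    # under H0 every assignment of the ranks 1, ..., n to the groups is equally likely
    ranks = rng.permutation(np.arange(1, n + 1))
    groups = np.split(ranks, np.cumsum(n_j)[:-1])
    Rbar_j = np.array([g.mean() for g in groups])
    values.append(np.sum(n_j * (Rbar_j - Rbar) ** 2))

print(np.mean(values))             # Monte Carlo estimate
print((n + 1) / 12 * (n * k - n))  # (n+1)/12 * (nk - n) = 16/12 * 30 = 40
```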
Mean and variance of the chi-square distribution: If $X \sim \chi^{2} (r)$, then $$ \begin{align*} E(X) =& r \\ \Var (X) =& 2r \end{align*} $$
With the correction factor in place, the expectation of $H$ is exactly $(k-1)$, which matches the mean of the chi-square distribution $\chi^{2} (k-1)$ that $H$ follows for sufficiently large samples: $$ E \left( H \right) = E \left( {\frac{ n-1 }{ n }} \cdot \sum_{j=1}^{k} {\frac{ \left( \overline{R}_{j} - \overline{R} \right)^{2} }{ \sigma^{2} / n_{j} }} \right) = {\frac{ n-1 }{ n }} \cdot {\frac{ n }{ n-1 }} \left( k - 1 \right) = k - 1 $$
■
See also
| Experimental design | Parametric methods | Nonparametric methods |
|---|---|---|
| Completely randomized design | One-way ANOVA | Kruskal–Wallis $H$ test |
| Randomized block design | Two-way ANOVA | Friedman $F_{r}$ test |
[1] Kruskal, W. H., & Wallis, W. A. (1952). Use of Ranks in One-Criterion Variance Analysis. Journal of the American Statistical Association, 47(260), 583–621. https://doi.org/10.2307/2280779 (PDF: https://medstatistic.ru/articles/Kruskal%20and%20Wallis%201952.pdf)