Hellinger Distance of Probability Distributions

Definition

The following distance function, defined directly on probability distributions, is called the Hellinger distance.

Discrete [1]

Let $p, q$ be probability mass functions. The Hellinger distance between $p$ and $q$ is defined as:

$$
H(p, q) := \sqrt{\frac{1}{2} \sum_{k} \left( \sqrt{p_{k}} - \sqrt{q_{k}} \right)^{2}}
$$
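
As a quick numerical illustration of this definition, here is a minimal Python sketch; the function name `hellinger_discrete` and the example probability vectors are illustrative choices of this write-up, not taken from the cited sources.

```python
import numpy as np

def hellinger_discrete(p, q):
    """Hellinger distance between two discrete distributions given as
    probability vectors over the same support."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

# Identical distributions give 0; disjoint supports give 1.
p = [0.2, 0.5, 0.3, 0.0]
q = [0.0, 0.0, 0.0, 1.0]  # hypothetical example masses
print(hellinger_distance := hellinger_discrete(p, p))  # 0.0
print(hellinger_discrete(p, q))                        # 1.0
```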

Continuous [2]

Let $f, g$ be probability density functions. The Hellinger distance between $f$ and $g$ is defined as:

$$
\begin{align*}
H^{2}(f, g) :=&\ \frac{1}{2} \int_{\mathbb{R}} \left( \sqrt{f(x)} - \sqrt{g(x)} \right)^{2} dx \\
=&\ 1 - \int_{\mathbb{R}} \sqrt{f(x) g(x)}\, dx
\end{align*}
$$
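
For densities, the integral can be approximated numerically. The sketch below assumes SciPy is available; `hellinger_continuous` is a hypothetical helper, and the two unit-variance normal densities are purely an illustrative example.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def hellinger_continuous(f, g, lo=-np.inf, hi=np.inf):
    """Squared Hellinger distance 1 - integral of sqrt(f * g) for two densities."""
    overlap, _ = quad(lambda x: np.sqrt(f(x) * g(x)), lo, hi)
    return 1.0 - overlap

# Illustrative densities: two unit-variance normals with different means.
f = lambda x: norm.pdf(x, loc=0.0, scale=1.0)
g = lambda x: norm.pdf(x, loc=1.0, scale=1.0)

H2 = hellinger_continuous(f, g)
print(H2, np.sqrt(H2))  # squared Hellinger distance and the distance itself
```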

Explanation

The Hellinger distance is, by definition, a distance function that directly compares probability mass functions or probability density functions. It takes values in $[0, 1]$: it equals $0$ when the two distributions are identical and $1$ when their supports are disjoint, that is, when they place no mass on any common region. While the Kullback-Leibler divergence is widely used to compare probability distributions, the Hellinger distance has the distinguishing feature of being a genuine metric, which makes it possible to discuss metric spaces of probability distributions.
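
To see why it is bounded and why it is a genuine metric, note that in the discrete case the definition can be rewritten in terms of the Euclidean norm of the square-root vectors:

$$
\begin{align*}
H(p, q) &= \frac{1}{\sqrt{2}} \left\| \sqrt{p} - \sqrt{q} \right\|_{2}, \qquad \left\| \sqrt{p} \right\|_{2} = \sqrt{\sum_{k} p_{k}} = 1 \\
H^{2}(p, q) &= \frac{1}{2} \sum_{k} \left( p_{k} - 2\sqrt{p_{k} q_{k}} + q_{k} \right) = 1 - \sum_{k} \sqrt{p_{k} q_{k}} \le 1
\end{align*}
$$

Nonnegativity, symmetry, and the triangle inequality are therefore inherited from the Euclidean norm, and the upper bound $1$ is attained exactly when $\sum_{k} \sqrt{p_{k} q_{k}} = 0$, i.e., when $p$ and $q$ have disjoint supports.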

  1. Gingold, J.A., Coakley, E.S., Su, J., et al. (2015). Distribution Analyzer, a methodology for identifying and clustering outlier conditions from single-cell distributions, and its application to a Nanog reporter RNAi screen. BMC Bioinformatics 16, 225. https://doi.org/10.1186/s12859-015-0636-7

  2. Wibisono. (2024). Optimal score estimation via empirical Bayes smoothing. https://doi.org/10.48550/arXiv.2402.07747