Another Definition of Mean and Variance

Definition

Euclidean Space $\mathbb{R}$ [1]

For a random variable $X : \Omega \to \mathbb{R}$, the infimum of the expectation of squared deviations, $\sigma^{2} (X) \in \mathbb{R}$, is defined as the variance of $X$.

$$
\sigma^{2} \left( X \right) := \inf_{a \in \mathbb{R}} E \left[ \left( X - a \right)^{2} \right]
$$

The value that minimizes the expectation of squared deviations, $\mu (X) \in \mathbb{R}$, is defined as the mean.

$$
\mu \left( X \right) := \argmin_{a \in \mathbb{R}} E \left[ \left( X - a \right)^{2} \right]
$$
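As a rough numerical illustration of this definition, the sketch below replaces the expectation with a sample average and the infimum with a search over a finite grid of candidates. The function names, the Gaussian sample, and the grid resolution are all assumptions made for the example, not part of the definition.

```python
import random
import statistics


def expected_squared_deviation(sample, a):
    """Empirical stand-in for E[(X - a)^2] at a candidate center a."""
    return sum((x - a) ** 2 for x in sample) / len(sample)


def variance_and_mean_by_minimization(sample, grid_size=1001):
    """Grid-search the candidate a in [min, max] that minimizes the
    empirical expected squared deviation.

    Returns (minimal value, minimizing a), i.e. approximations of
    sigma^2(X) and mu(X) in the sense of the definition above.
    """
    lo, hi = min(sample), max(sample)
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    best_a = min(grid, key=lambda a: expected_squared_deviation(sample, a))
    return expected_squared_deviation(sample, best_a), best_a


# Hypothetical data: 500 draws from a normal distribution (seeded for repeatability).
random.seed(0)
sample = [random.gauss(3.0, 2.0) for _ in range(500)]

sigma2, mu = variance_and_mean_by_minimization(sample)

# Compare with the classical sample mean and (population) variance.
print("minimizer:", mu, "classical mean:", statistics.fmean(sample))
print("minimum:  ", sigma2, "classical variance:", statistics.pvariance(sample))
```

Up to the grid resolution, the minimizer should agree with the ordinary sample mean and the minimal value with the ordinary variance. A grid search is used only because it mirrors the $\inf$ / $\argmin$ wording directly; it is not an efficient way to compute either quantity.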

General Space $\mathcal{R}$

For a random element $X : \Omega \to \mathcal{R}$, the infimum of the expectation of squared deviations, $\sigma^{2} (X) \in \mathbb{R}$, is defined as the variance of $X$, and the value that minimizes the expectation of squared deviations, $\mu (X) \in \mathcal{R}$, is defined as the mean.
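The definition above does not say what a "squared deviation" is when subtraction is unavailable in $\mathcal{R}$. One common way to make it precise, assuming that $\mathcal{R}$ carries a metric $d$ (an assumption added here, not stated in the definition), is the following, often called the Fréchet variance and Fréchet mean:

$$
\sigma^{2} \left( X \right) := \inf_{a \in \mathcal{R}} E \left[ d \left( X , a \right)^{2} \right]
, \qquad
\mu \left( X \right) := \argmin_{a \in \mathcal{R}} E \left[ d \left( X , a \right)^{2} \right]
$$

With $\mathcal{R} = \mathbb{R}$ and $d (x, a) = \left| x - a \right|$, this reduces to the Euclidean definition.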

Explanation

In conventional probability theory, the mean is defined first, as the first moment, and the variance is then defined as the expected squared deviation from the mean. Whether by coincidence or not, the mean has the property of minimizing the expected squared deviation, and this can indeed be proven. In this post, however, the variance is defined first, and whatever minimizes the expectation of squared deviations is defined as the mean. This seems more natural in the sense of least squares, especially when working on manifolds rather than $\mathbb{R}^{n}$.
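For reference, the minimizing property can be checked in a few lines. The sketch below assumes $E \left[ X^{2} \right] < \infty$ and writes $m := E [X]$ for the classical mean (the symbol $m$ is introduced here only for this derivation):

$$
\begin{aligned}
E \left[ \left( X - a \right)^{2} \right]
&= E \left[ \left( (X - m) + (m - a) \right)^{2} \right]
\\ &= E \left[ \left( X - m \right)^{2} \right] + 2 (m - a) E \left[ X - m \right] + (m - a)^{2}
\\ &= E \left[ \left( X - m \right)^{2} \right] + (m - a)^{2}
\end{aligned}
$$

The last step uses $E [ X - m ] = 0$. The right-hand side is smallest exactly when $a = m$, so on $\mathbb{R}$ the minimizer is the classical mean and the minimal value is the classical variance.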

Under these definitions of variance and mean for $X : \Omega \to \mathcal{R}$, the mean may no longer be unique. Consider, for example, the Fisher distribution on the sphere $S^{2}$. Write $-\mu$ for the point directly opposite $\mu$ on the sphere. Then, for $X \sim \text{vMF}_{3} \left( \mu , \kappa \right)$ and $Y \sim \text{vMF}_{3} \left( -\mu , \kappa \right)$, both $\mu$ and $-\mu$ can serve as the mean of $X + Y$ without any problem.
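Non-uniqueness is easy to observe numerically in a setting simpler than the sphere example above. The sketch below is a separate toy case, not the vMF example: it assumes the unit circle $S^{1}$ with arc length as the distance, and a distribution that puts equal mass on two antipodal points. The function names and the one-degree grid are illustrative choices.

```python
import math


def arc_distance(theta, phi):
    """Arc-length (geodesic) distance between two angles on the unit circle."""
    diff = abs(theta - phi) % (2 * math.pi)
    return min(diff, 2 * math.pi - diff)


def expected_squared_distance(points, a):
    """Empirical E[d(X, a)^2] for equally weighted points given as angles."""
    return sum(arc_distance(x, a) ** 2 for x in points) / len(points)


def minimizers_on_grid(points, grid_size=360, tol=1e-9):
    """Return (minimal value, all grid angles attaining it within tol)."""
    grid = [2 * math.pi * k / grid_size for k in range(grid_size)]
    values = [expected_squared_distance(points, a) for a in grid]
    variance = min(values)
    minimizers = [a for a, v in zip(grid, values) if v - variance <= tol]
    return variance, minimizers


# Assumed toy distribution: equal mass at angle 0 and at the opposite angle pi.
points = [0.0, math.pi]

variance, means = minimizers_on_grid(points)
means_in_degrees = [round(math.degrees(a)) for a in means]
print(means_in_degrees)  # [90, 270]
print(variance)          # approximately pi^2 / 4
```

Two distinct angles attain the same minimal value, so the "mean" in the sense of the definition is a set of points rather than a single point, while the variance is still one real number. Which points turn out to be minimizers depends on the distance chosen, which is why the distance is flagged as an assumption here.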

What is intriguing in the previous example is that the word ‘mean’ itself has already lost some of its meaning. In spaces other than $\mathbb{R}$, the mean need not be a number: it may be a mean vector, a mean matrix, a mean graph, and so on. The variance, however, whatever space $\mathcal{R}$ the random element takes values in, is always a measure of dispersion and always takes a real value. From this perspective, variance is the more fundamental notion, and it is natural to define it before the mean.


  1. Gerald B. Folland, Real Analysis: Modern Techniques and Their Applications (2nd Edition, 1999): p. 314