Expectation, Mean, Variance, and Moments in Mathematical Statistics
Definition: Expectation, Mean, and Variance
Let $X$ be a given random variable.
If the probability density function $f(x)$ of a continuous random variable $X$ satisfies $\displaystyle \int_{-\infty}^{\infty} |x| f(x) dx < \infty$, then $E(X)$, defined as follows, is called the Expectation of $X$. $$ E(X) := \int_{-\infty}^{\infty} x f(x) dx $$
If the probability mass function $p(x)$ of a discrete random variable $X$ satisfies $\displaystyle \sum_{x} |x| p(x) < \infty$, then $E(X)$, defined as follows, is called the Expectation of $X$. $$ E(X) := \sum_{x} x p(x) $$
If $\mu = E(X)$ exists, it is defined as the Mean of $X$.
If $\sigma^2 = E((X - \mu)^2)$ exists, it is defined as the Variance of $X$.
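As a concrete check on the discrete definitions above, here is a minimal sketch computing the mean and variance of a fair six-sided die from its pmf. The die itself is an illustrative assumption, not something taken from the text.

```python
# Mean and variance of a fair six-sided die, straight from the definitions.
# E(X) = sum_x x p(x), and Var(X) = E((X - mu)^2).
support = [1, 2, 3, 4, 5, 6]
pmf = {x: 1 / 6 for x in support}  # illustrative pmf, assumed here

mu = sum(x * pmf[x] for x in support)                # E(X)
var = sum((x - mu) ** 2 * pmf[x] for x in support)   # E((X - mu)^2)

print(mu)   # 3.5
print(var)  # 35/12 ≈ 2.9167
```

The absolute-summability condition $\sum_x |x| p(x) < \infty$ is trivially satisfied here because the support is finite.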
The Meaning of Abstraction
If statistics had to be reduced to a single, simple question, it would be the study of “So, what’s the average?” The ‘mean’ is an intuitive statistical quantity, easy to calculate as a representative value. However, such a simple notion is insufficient for explaining the world’s many phenomena, and so it is abstracted into the ‘expectation’. Through the idea of partitioning, just as integration generalizes summation, expectation applies not only to discrete distributions but to continuous ones as well. This writing discusses that very ‘abstraction’.
Although mathematical statistics certainly contains statistical theory, its nature is closer to that of a branch of mathematics. It should therefore be approached with a mathematician’s mindset, like any other branch of mathematics. One of a mathematician’s tasks is to devise rigorous definitions that conflict neither with intuition nor with existing theory, turning objects, phenomena, and even things beyond them into symbols, so that all of them can be studied in that symbolic form.
The definitions of mean and variance retain no intuitive meaning; they simply appear as expectations of a random variable. How these definitions connect to the mean and variance of everyday experience is not the point here. Rather, learners who are so accustomed to the concepts of mean and variance that they take them for granted are expected to adapt to the symbols in reverse. Indeed, learners at the level of studying mathematical statistics will likely have little trouble accepting such definitions.
With the mean thus abstracted and the concept of expectation established, scholars discover further possibilities. Moving beyond simple summation or integration, they begin to handle expectations with the basic tools of algebra and analysis. Naturally, their interest shifts from abstraction to generalization.
Definition: Moments
For a natural number $m$, $E( X^m )$ is defined as the $m$th Moment of $X$.
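The definition translates directly into code. Below is a minimal sketch of the $m$th moment for a discrete random variable whose pmf is given as a dictionary; the fair-die pmf is an illustrative assumption, not from the text.

```python
# m-th moment E(X^m) = sum_x x^m p(x) for a discrete pmf {x: p(x)}.
def moment(pmf, m):
    """Return the m-th moment E(X^m) of the distribution described by pmf."""
    return sum(x ** m * p for x, p in pmf.items())

die = {x: 1 / 6 for x in range(1, 7)}  # illustrative pmf, assumed here
first = moment(die, 1)   # the mean: 3.5
second = moment(die, 2)  # E(X^2) = 91/6
```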
The Meaning of Generalization
Merely reading the definition of moments reveals that the first moment, $E(X^1)$, is the mean of $X$. A bit further thought reveals that the second moment, $E(X^2)$, can also become the variance through some manipulation.
In other words, the first moment is essentially the same concept as the mean, and among the moments it is the second that corresponds to the variance.
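The manipulation in question is a one-line expansion using the linearity of expectation, together with $E(X) = \mu$: $$ \sigma^{2} = E \left( (X - \mu)^{2} \right) = E(X^{2}) - 2 \mu E(X) + \mu^{2} = E(X^{2}) - \mu^{2} $$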
Naturally, scholars with intuition would speculate that the third or fourth moments might also carry important information. [ NOTE: These are referred to as Skewness and Kurtosis, respectively.] The direction of research now reverses: instead of theories explaining discovered facts, hidden facts are sought through theories.
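To make the speculation concrete, here is a sketch of skewness and kurtosis computed as standardized third and fourth central moments of a discrete pmf. The fair-die pmf is again an illustrative assumption; real libraries may apply bias corrections or report excess kurtosis, which this sketch does not.

```python
import math

# Standardized central moments: skewness = E((X-mu)^3)/sigma^3,
# kurtosis = E((X-mu)^4)/sigma^4, for a pmf given as {x: p(x)}.
pmf = {x: 1 / 6 for x in range(1, 7)}  # illustrative pmf, assumed here

def central_moment(pmf, m):
    """Return E((X - mu)^m) for the distribution described by pmf."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** m * p for x, p in pmf.items())

sigma = math.sqrt(central_moment(pmf, 2))
skewness = central_moment(pmf, 3) / sigma ** 3  # 0: the die pmf is symmetric
kurtosis = central_moment(pmf, 4) / sigma ** 4
```

That the skewness vanishes here reflects exactly the kind of “hidden fact” a moment can reveal: the distribution is symmetric about its mean.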
This methodology is easily found throughout the natural sciences, not only in statistics. While mathematical statistics leans more toward mathematics than statistics, as it approaches the essential questions of the discipline it regains the character of statistics. The term moment, though also used in physics and other fields, need not carry any particular meaning in statistics. It suffices to know it as a term used in the supporting theories of mathematical statistics.
On the other hand, one might wonder why the absolute value of $x$ must be taken in $\displaystyle \int_{-\infty}^{\infty} |x| f(x) dx < \infty$, the existence condition of the expectation. Even granting that guaranteeing existence and computing a specific value are different tasks, there seems to be no need for a formula different from that of the expectation $E(X)$ itself.
This is somewhat clarified by observing the following theorem. If one considers not just the simplest case of $X$ itself but the generalization to $g(X)$, then $E(X)$ is simply the special case obtained for the identity function $g(x) = x$. Requiring absolute convergence also guarantees that the value of the sum or integral does not depend on the order of summation, something a merely conditionally convergent series cannot promise.
Theorem
For a random variable $X$, suppose $Y$ is given in the form $Y := g(X)$ for some function $g$.
- [1]: If $X$ is a continuous random variable with probability density function $f_{X}$ and satisfies $\displaystyle \int_{-\infty}^{\infty} |g(x)| f_{X} (x) dx < \infty$, then $$ E (Y) = \int_{-\infty}^{\infty} g(x) f_{X} (x) dx $$
- [2]: If $X$ is a discrete random variable with probability mass function $p_{X}$ and satisfies $\displaystyle \sum_{x} |g(x)| p_{X} (x) < \infty$, then $$ E (Y) = \sum_{x} g(x) p_{X} (x) $$
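A minimal sketch of case [2]: the expectation of $Y = g(X)$ is computed directly from the pmf of $X$, without ever deriving the distribution of $Y$, and then cross-checked against the expectation taken over an explicitly constructed pmf of $Y$. The fair-die pmf and $g(x) = x^2$ are illustrative assumptions.

```python
# Theorem [2]: E(Y) = sum_x g(x) p_X(x) -- no pmf of Y needed.
pmf_X = {x: 1 / 6 for x in range(1, 7)}  # illustrative pmf, assumed here

def g(x):
    return x ** 2

e_y = sum(g(x) * p for x, p in pmf_X.items())

# Cross-check: build the pmf of Y explicitly and take its expectation.
pmf_Y = {}
for x, p in pmf_X.items():
    pmf_Y[g(x)] = pmf_Y.get(g(x), 0.0) + p
e_y_check = sum(y * p for y, p in pmf_Y.items())
```

Both routes give $E(Y) = 91/6$, which illustrates why the theorem is convenient: the left-hand computation never touches the distribution of $Y$ at all.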