Random Variables and Probability Distributions Defined by Measure Theory
Definition 1
Let’s assume a Probability Space $( \Omega , \mathcal{F} , P)$ is given.
- A function $X : \Omega \to \mathbb{R}$ that satisfies $X^{-1} (B) \in \mathcal{F}$ for every Borel Set $B \in \mathcal{B} (\mathbb{R})$ is called a Random Variable.
- $\mathcal{F}_{X}$, defined as follows, is called the Sigma Field generated by $X$. $$ \mathcal{F}_{X} := X^{-1} ( \mathcal{B} ) = \sigma (X) = \left\{ X^{-1} (B) \subset \Omega : B \in \mathcal{B}( \mathbb{R} ) \right\} $$
- The measure $P_{X}$ on $\mathcal{B} (\mathbb{R})$, defined as follows, is called the Probability Distribution of $X$. $$ P_{X} (B) := P ( X^{-1} (B) ) $$
- If you haven’t encountered measure theory yet, you can ignore the term probability space.
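As a concrete illustration of the three definitions above, here is a minimal Python sketch on a finite probability space. The two-coin-toss space, the choice of $X$ as the number of heads, and the helper names `preimage`, `P`, `P_X`, and `subsets` are all assumptions made for this example and do not come from the source. Because this $X$ takes only finitely many values, a Borel set $B$ is represented simply by a finite set of values.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical finite probability space: two fair coin tosses.
omega = ["HH", "HT", "TH", "TT"]
prob = {w: Fraction(1, 4) for w in omega}  # P on single outcomes

def X(w):
    """Random variable: number of heads in the outcome w."""
    return w.count("H")

def preimage(B):
    """X^{-1}(B) = {w in Omega : X(w) in B}, returned as a frozenset."""
    return frozenset(w for w in omega if X(w) in B)

def P(event):
    """Probability of an event (a subset of Omega)."""
    return sum((prob[w] for w in event), Fraction(0))

def P_X(B):
    """Distribution of X: P_X(B) = P(X^{-1}(B))."""
    return P(preimage(B))

def subsets(values):
    """All subsets of a finite collection of values, as tuples."""
    values = list(values)
    return chain.from_iterable(
        combinations(values, r) for r in range(len(values) + 1)
    )

# X^{-1}(B) depends only on which values of X lie in B, so on a finite
# range sigma(X) can be enumerated from the subsets of that range.
range_X = sorted({X(w) for w in omega})
sigma_X = {preimage(set(B)) for B in subsets(range_X)}

print(sorted(preimage({1})))  # ['HT', 'TH']
print(P_X({1}))               # 1/2
print(len(sigma_X))           # 8
```

Note that $\sigma (X)$ here has 8 elements while the power set of $\Omega$ has 16: the event $\{ \text{HT} \}$ is not in $\sigma (X)$, because $X$ cannot distinguish HT from TH.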
Explanation
Just like the probability space, a random variable can also be rigorously defined within Measure Theory.
- The condition $X^{-1} (B) \in \mathcal{F}$ says that the pre-image of every Borel set belongs to the Sigma Field $\mathcal{F}$. Since $X$ maps elements of $\Omega$ to real numbers, this is exactly what makes expressions like $P(a \le X \le b)$ well defined: the set $\left\{ \omega \in \Omega : a \le X(\omega) \le b \right\} = X^{-1} ( [a,b] )$ is in $\mathcal{F}$, so $P$ can be applied to it. It also limits what is considered an Event to reasonable sets only. At first glance this might seem overly abstract, but paradoxically, its goal is to counter excessive abstraction. According to the definition, a random variable $X$ is not only a real-valued function but also a Measurable Function, and in the special case $\Omega = \mathbb{R}$ with $\mathcal{F} = \mathcal{B} \left( \mathbb{R} \right)$, it is a Borel function $X : \mathbb{R} \to \mathbb{R}$. This level of rigor is sufficient for the basic theorems of mathematical statistics. Beyond this, the generalization to multivariate random variables is done simply by defining $X : \Omega \to \mathbb{R}^{p}$ that satisfies $X^{-1} (B) \in \mathcal{F}$ for every Borel set $B \in \mathcal{B} (\mathbb{R}^{p})$. Such an $X$ can be written as a vector $X = ( X_{1}, \cdots , X_{p})$ of random variables $X_{i} : \Omega \to \mathbb{R}$ and is called a Random Vector. Extending this to a sequence of random variables gives a Stochastic Process, and the most general notion is called a Random Element.
- For a Sigma Field $\mathcal{G}$, if $Y^{-1} ( \mathcal{B} ) \subset \mathcal{G}$, that is, if $Y^{-1} (B) \in \mathcal{G}$ for every Borel set $B$, then $Y$ is said to be $\mathcal{G}$-measurable. By the definition of $\mathcal{F}_{X}$, $X$ is naturally $\mathcal{F}_{X}$-measurable.
- All these definitions might be confusing, but taken step by step they are not difficult at all. Since $X^{-1} (B) \in \mathcal{F}$ for every Borel set $B$, taking pre-images can be viewed as a map between collections of sets, $X^{-1} : \mathcal{B} (\mathbb{R}) \to \mathcal{F}$. This way, $P_{X} := ( P \circ X^{-1} )$ can be understood as $$ P_{X} : \mathcal{B} (\mathbb{R}) \to \mathcal{F} \to [0,1] $$ and is merely a composite function that assigns to each Borel set $B$ a value between $0$ and $1$. For example, $[-3,-2]$ is naturally a Borel set of $\mathbb{R}$, and depending on how the random variable $Y$ is defined, this enables calculations like $P_{Y} ( [-3,-2] ) = 0.7$.
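The composition $P \circ Y^{-1}$ and the notion of $\mathcal{G}$-measurability can be sketched in Python on a finite space. Everything concrete here is an assumption chosen for illustration: the ten-outcome uniform space, the particular $Y$ (picked so that $P_{Y} ( [-3,-2] )$ comes out to $7/10$, matching the value $0.7$ used in the example above), and the helper names `inverse_image`, `distribution`, and `is_measurable`. A Borel set is represented by its indicator predicate.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical finite probability space: ten equally likely outcomes.
omega = frozenset(range(10))

def P(event):
    """Uniform probability measure on subsets of omega."""
    return Fraction(len(event), len(omega))

def Y(w):
    """Hypothetical random variable taking the values -2.5 and 1.0."""
    return -2.5 if w < 7 else 1.0

def inverse_image(f, B):
    """f^{-1}(B), where the set B is given by its indicator predicate."""
    return frozenset(w for w in omega if B(f(w)))

def distribution(f):
    """P_f = P o f^{-1}: sends a set B (as a predicate) to a number in [0, 1]."""
    return lambda B: P(inverse_image(f, B))

def is_measurable(f, G):
    """Check that f^{-1}(B) is in G for every B. On a finite range it is
    enough to test every subset of the values that f actually takes."""
    values = sorted({f(w) for w in omega})
    subsets = chain.from_iterable(
        combinations(values, r) for r in range(len(values) + 1)
    )
    return all(inverse_image(f, lambda y, S=S: y in S) in G for S in subsets)

P_Y = distribution(Y)
interval = lambda y: -3 <= y <= -2   # the Borel set [-3, -2]

# sigma(Y) for this particular Y, and the trivial sigma field.
sigma_Y = {frozenset(), frozenset(range(7)), frozenset({7, 8, 9}), omega}
trivial = {frozenset(), omega}

print(P_Y(interval))               # 7/10
print(is_measurable(Y, sigma_Y))   # True
print(is_measurable(Y, trivial))   # False
```

The last two lines illustrate that $Y$ is measurable with respect to the Sigma Field it generates, but not with respect to the trivial Sigma Field $\left\{ \emptyset , \Omega \right\}$, which is too coarse to contain $Y^{-1} ( [-3,-2] )$.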
See Also
Capinski. (1999). Measure, Integral and Probability: p66~68.