Dirac Measure and Discrete Probability Distribution Defined by Measure Theory
Overview
In elementary probability theory, a probability distribution is classified as either discrete or continuous, and the explanation has to rely partly on intuition. With measure theory, discrete probability distributions can be defined cleanly, without any mathematical ambiguity.
Discrete Probability Distribution
Assume that a probability space $( \Omega , \mathcal{F} , P)$ is given.
Step 1. When the random variable $X$ takes only one value
When $X = a$ with probability $1$, its probability distribution $P_{X}$ is called the Dirac measure $\delta_{a}$.
$$ P_{X} (B) = \delta_{a} (B) := \begin{cases} 1 &, a \in B \\ 0 &, a \notin B \end{cases} $$
As the definition shows, the probability distribution $P_{X}$ depends only on whether $a$ belongs to the Borel set $B \in \mathcal{B} ( \mathbb{R} )$. This is fundamentally different from classifying probability distributions by first asking whether they are discrete or continuous. Redefining probability distributions with measure theory does not stop us from using the terms (absolutely) continuous probability distribution and discrete probability distribution. Both kinds are now probability distributions defined in the same manner, as measures on Borel sets, that merely have different properties.
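The definition of $\delta_{a}$ can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption: a Borel set $B$ is modelled by its membership test, which says nothing about measurability. The names `dirac`, `interval`, and `BorelSet` are hypothetical helpers, not part of any library.

```python
from typing import Callable

# Assumption for illustration: a Borel set B is represented only by its
# membership test x -> (x in B). Measurability is not modelled.
BorelSet = Callable[[float], bool]


def dirac(a: float) -> Callable[[BorelSet], float]:
    """Return the Dirac measure delta_a as a function of a set B."""
    def measure(B: BorelSet) -> float:
        # delta_a(B) is 1 when a is in B and 0 otherwise.
        return 1.0 if B(a) else 0.0
    return measure


def interval(lo: float, hi: float) -> BorelSet:
    """Closed interval [lo, hi] as a membership test."""
    return lambda x: lo <= x <= hi


delta_3 = dirac(3.0)
print(delta_3(interval(0.0, 5.0)))  # 3 lies in [0, 5]
print(delta_3(interval(4.0, 5.0)))  # 3 does not lie in [4, 5]
```

The measure never looks at anything except whether the point $a$ is in the set, which is exactly what the case distinction in the definition says.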
Step 2. When the random variable $X$ takes two values
Consider a random variable with $X = a$ with probability $p$ and $X = b$ with probability $1-p$, where $a \ne b$. Its probability distribution can be written with the Dirac measures from Step 1 as follows.
$$ P_{X} (B) = p \delta_{a} (B) + (1-p) \delta_{b} (B) $$
Written out case by case,
$$ P_{X} (B) = \begin{cases} 1 &, a \in B \land b \in B \\ p &, a \in B \land b \notin B \\ 1-p &, a \notin B \land b \in B \\ 0 &, \text{otherwise}\end{cases} $$
From this development, the general form of a discrete probability distribution suggests itself.
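The four cases of the two-point distribution can be checked with a small Python sketch. It assumes that a set $B$ is represented by a membership test, and the values of `a`, `b`, and `p` are hypothetical, chosen only for the demonstration.

```python
from typing import Callable

BorelSet = Callable[[float], bool]  # membership test standing in for B


def dirac(a: float, B: BorelSet) -> float:
    """delta_a(B): 1 if a is in B, else 0."""
    return 1.0 if B(a) else 0.0


def two_point(a: float, b: float, p: float, B: BorelSet) -> float:
    """P_X(B) = p * delta_a(B) + (1 - p) * delta_b(B)."""
    return p * dirac(a, B) + (1 - p) * dirac(b, B)


a, b, p = 0.0, 1.0, 0.25  # hypothetical values for illustration
cases = {
    "a and b in B": lambda x: True,
    "only a in B": lambda x: x < 0.5,
    "only b in B": lambda x: x > 0.5,
    "neither in B": lambda x: x > 2.0,
}
for label, B in cases.items():
    print(label, two_point(a, b, p, B))
```

Each set picks out one row of the case table, so the printed values should be $1$, $p$, $1-p$, and $0$ in that order.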
Step 3. General discrete random variable $X$
Given points $a_{i} \in \mathbb{R}$ and weights $p_{i} > 0$ for $i \in \mathbb{N}$ with $\sum_{i} p_{i} = 1$, one can consider the following probability distribution $P_{X}$.
$$ P_{X} (B) = \sum_{i \in \mathbb{N}} p_{i} \delta_{a_{i}} (B) $$
Defining distributions in this new way may seem unfamiliar and counterintuitive, but it is precisely this 'lack of intuitiveness' that justifies the introduction of measure theory. Anyone who has studied probability theory far enough to meet measure theory should get used to these expressions as quickly as possible.
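The general formula $P_{X} (B) = \sum_{i} p_{i} \delta_{a_{i}} (B)$ can also be sketched in Python. The sketch assumes a finite support so that the sum can be evaluated exactly with `fractions.Fraction`; the function name `discrete_measure` and the fair-die example are hypothetical and serve only as an illustration.

```python
from fractions import Fraction
from typing import Callable, Dict

BorelSet = Callable[[int], bool]  # membership test standing in for B


def discrete_measure(weights: Dict[int, Fraction]) -> Callable[[BorelSet], Fraction]:
    """Build P_X(B) = sum_i p_i * delta_{a_i}(B) from a map a_i -> p_i."""
    if sum(weights.values()) != 1 or any(p <= 0 for p in weights.values()):
        raise ValueError("weights must be positive and sum to 1")

    def measure(B: BorelSet) -> Fraction:
        # delta_{a_i}(B) is 1 exactly when a_i is in B, so only those
        # terms contribute to the sum.
        return sum((p for a, p in weights.items() if B(a)), Fraction(0))

    return measure


# Hypothetical example: a fair six-sided die, a_i = i and p_i = 1/6.
die = discrete_measure({face: Fraction(1, 6) for face in range(1, 7)})
print(die(lambda x: x % 2 == 0))  # probability of an even face
print(die(lambda x: x >= 5))      # probability of a face of at least 5
```

A countably infinite support works the same way in principle, but the sum then has to be truncated or evaluated in closed form.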
Neither Discrete Nor Continuous Probability Distributions
The Dirac measure was introduced above only to define discrete probability distributions, but using the Dirac measure does not by itself make a distribution discrete. As mentioned before, continuous and discrete probability distributions still exist after measure theory is introduced. The Dirac measure, however, is nothing more than an ordinary measure on $\mathcal{B} ( \mathbb{R} )$, so we are free to combine it with other measures as we like. Consider the following example:
Q. Assume cities $A$ and $B$ are 50 km apart. A car that can only travel at 100 km/h is driven from $A$ to $B$, and the departure time is uniformly distributed between 1 PM and 2 PM. If the car arrives at $B$ before 2 PM, it is parked there. What probability distribution does the distance between the car and $B$ follow at exactly 2 PM?
A. The trip takes 30 minutes, so if the car departs at or before 1:30 PM, it reaches $B$ by 2 PM and is parked. If it departs $t > 30$ minutes after 1 PM, it is still on the road at 2 PM, and the later the departure, the farther it is from $B$. With $t$ denoting the departure time in minutes after 1 PM, the distance is the following random variable.
$$ X(t) = \begin{cases} 0 &, t \in [0,30] \\ \left| 50 - 100 {{t} \over {60}} \right| &, t \in (30, 60] \end{cases} $$
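The car is parked at $B$ with probability ${{1} \over {2}}$, and otherwise its distance is spread uniformly over $(0, 50]$. The probability distribution can therefore be expressed using the Dirac measure $\delta_{0}$ and the Lebesgue measure restricted to $[0, 50]$, $m_{[0,50]} (B) := m ( B \cap [0,50] )$, as follows.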
$$ P_{X} = {{1} \over {2}} \delta_{0} + {{1} \over {2}} \cdot {{1} \over {50}} m_{[ 0, 50]} $$
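As the example shows, $P_{X}$ is neither a discrete nor a continuous probability distribution: it places mass ${{1} \over {2}}$ on the single point $0$ and spreads the remaining mass ${{1} \over {2}}$ uniformly over $[0, 50]$.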
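The mixed distribution in the example can be checked numerically. The following Python sketch evaluates the formula for $P_{X}$ on closed intervals $[c, d]$ and compares it with the fraction of evenly spaced departure times whose distance lands in the same interval. The function names and the grid size are assumptions made for the illustration, and the grid is only a crude stand-in for the uniform departure time.

```python
def distance_at_2pm(t: float) -> float:
    """X(t): distance to B in km at 2 PM when departing t minutes after 1 PM."""
    if 0 <= t <= 30:
        return 0.0  # the car has already arrived and is parked
    return abs(50 - 100 * t / 60)


def p_x(c: float, d: float) -> float:
    """P_X([c, d]) = 1/2 * delta_0([c, d]) + 1/2 * (1/50) * m([c, d] ∩ [0, 50])."""
    atom = 0.5 if c <= 0 <= d else 0.0
    overlap = max(0.0, min(d, 50.0) - max(c, 0.0))  # length of [c, d] ∩ [0, 50]
    return atom + 0.5 * overlap / 50.0


def empirical(c: float, d: float, n: int = 6000) -> float:
    """Fraction of n evenly spaced departure times with distance in [c, d]."""
    hits = sum(c <= distance_at_2pm(60 * (k + 0.5) / n) <= d for k in range(n))
    return hits / n


for c, d in [(0.0, 0.0), (10.0, 20.0), (0.0, 50.0)]:
    print((c, d), p_x(c, d), empirical(c, d))
```

The single point $\{0\}$ carries positive probability, which rules out an absolutely continuous distribution, while intervals away from $0$ also carry probability, which rules out a discrete one.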