
Conditional Expectation of Random Variables Defined by Measure Theory

Definition

Let’s assume a probability space $( \Omega , \mathcal{F} , P )$ is given.

If $\mathcal{G}$ is a sub sigma-field of $\mathcal{F}$ and the random variable $X \in \mathcal{L}^{1} ( \Omega )$ is integrable, then there uniquely exists (up to $P$-almost sure equality) a $\mathcal{G}$-measurable random variable $Y$ satisfying
$$ \int_{A} Y \, dP = \int_{A} X \, dP $$
for all $A \in \mathcal{G}$. This $Y := E ( X | \mathcal{G} )$ is defined as the conditional expectation of $X$ given $\mathcal{G}$.


  • I would like to say that you can ignore the term probability space if you have not yet encountered measure theory, but understanding the content of this post without any knowledge of measure theory is nearly impossible.
  • That $\mathcal{G}$ is a sub sigma-field of $\mathcal{F}$ means that both are sigma-fields on $\Omega$ and that $\mathcal{G} \subset \mathcal{F}$. That $Y$ is a $\mathcal{G}$-measurable function means that $Y^{-1} (B) \in \mathcal{G}$ for every Borel set $B \in \mathcal{B}(\mathbb{R})$.
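
To make the defining property concrete, here is a minimal numerical sketch on a finite probability space; all values and names (omega, partition, integral) are hypothetical choices for illustration. On a finite $\Omega$, a sub sigma-field $\mathcal{G}$ corresponds to a partition, a candidate $Y$ that averages $X$ over each cell is $\mathcal{G}$-measurable, and the identity $\int_{A} Y \, dP = \int_{A} X \, dP$ can be checked for every $A \in \mathcal{G}$.

```python
from itertools import combinations

# Finite probability space: Omega = {0,...,5} with point masses summing to 1.
omega = list(range(6))
P = {0: 0.1, 1: 0.2, 2: 0.1, 3: 0.25, 4: 0.15, 5: 0.2}

# On a finite space, a sub sigma-field G is generated by a partition of Omega.
partition = [{0, 1, 2}, {3, 4, 5}]

# An integrable random variable X (any real values work on a finite space).
X = {0: 1.0, 1: 4.0, 2: -2.0, 3: 0.5, 4: 3.0, 5: 1.5}

def integral(f, A):
    """Integral of f over the event A with respect to P."""
    return sum(f[w] * P[w] for w in A)

# Candidate Y = E(X|G): on each cell, the P-weighted average of X,
# so Y is constant on cells and hence G-measurable.
Y = {}
for cell in partition:
    avg = integral(X, cell) / sum(P[w] for w in cell)
    Y.update({w: avg for w in cell})

# Every A in G is a union of partition cells; check the defining property.
for r in range(len(partition) + 1):
    for cells in combinations(partition, r):
        A = set().union(*cells) if cells else set()
        assert abs(integral(Y, A) - integral(X, A)) < 1e-12
```

Running this, every $A \in \mathcal{G}$ passes the check, which is exactly the defining identity from the definition above.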

Explanation

In the mathematical definition, $\mathcal{G}$ yields a probability space $( \Omega , \mathcal{G} , P )$, which is not as vast as the original probability space $( \Omega , \mathcal{F} , P )$ but in which a bit more information is given. Therefore,
$$ \int_{A} X \, dP = \int_{A} Y \, dP = \int_{A} E ( X | \mathcal{G} ) \, dP $$
indicates that the calculation remains the same within that reduced space, which means we’ve successfully pulled the probability $P$ down from $\mathcal{F}$ to $\mathcal{G}$ while keeping its properties intact.

Also, from the form of the definition, since $E ( X | \mathcal{G} )$ exists as a $\mathcal{G}$-measurable random variable, it should be naturally accepted that the conditional expectation is itself a random variable, measurable with respect to the given sigma-field.

While the definition itself may not be difficult to grasp intuitively, its expression might feel somewhat unfamiliar. For a random variable $X$, the collection $\sigma (X) := \left\{ X^{-1} (B) : B \in \mathcal{B}(\mathbb{R}) \right\}$ is the smallest sigma-field $\sigma (X) \subset \mathcal{F}$ generated by $X$, and with it the familiar notation can be restated as
$$ E(Y|X) = E \left( Y | \sigma (X) \right) $$
Although it can still be written this way, getting accustomed to the new expression is much better if one intends to continue studying measure-theoretic probability. Come to think of it, $E(Y|X)$ was conceptually intuitive but a painfully cumbersome notation for handling formulas or making direct calculations. It’s time to let it go without regret.
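
As a small sketch of $\sigma (X)$ in the same finite setting (values again made up): the level sets $X^{-1}(\{x\})$ partition $\Omega$ and generate $\sigma (X)$, so $E(Y|X) = E(Y | \sigma(X))$ is just the $P$-weighted average of $Y$ over each level set of $X$.

```python
# Sketch: sigma(X) on a finite space is generated by the level sets of X.
omega = list(range(6))
P = {w: 1 / 6 for w in omega}               # uniform probability
X = {0: 1, 1: 1, 2: 2, 3: 2, 4: 2, 5: 3}    # X takes the values 1, 2, 3
Y = {0: 10.0, 1: 2.0, 2: 4.0, 3: 6.0, 4: 8.0, 5: 5.0}

# Level sets X^{-1}({x}) partition Omega and generate sigma(X).
level_sets = {x: {w for w in omega if X[w] == x} for x in set(X.values())}

# E(Y | sigma(X)) is constant on each level set: the P-weighted average.
E_Y_given_X = {}
for cell in level_sets.values():
    avg = sum(Y[w] * P[w] for w in cell) / sum(P[w] for w in cell)
    E_Y_given_X.update({w: avg for w in cell})

print(E_Y_given_X)  # omega in {0,1} -> 6.0, {2,3,4} -> 6.0, {5} -> 5.0
```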

The existence of the conditional expectation is guaranteed by the Radon-Nikodym Theorem. Understanding that theorem is crucial, but the proof itself is not difficult.

Proof

Case 1. $X \ge 0$

For all $A \in \mathcal{G}$, define
$$ P_{\mathcal{G}}(A) := \int_{A} X \, dP $$
Then $P_{\mathcal{G}}$ becomes a measure on $\mathcal{G}$, and $P_{\mathcal{G}} \ll P$ holds, since $P(A) = 0$ forces $\int_{A} X \, dP = 0$.
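
As a quick aside, the two claims about $P_{\mathcal{G}}$ (additivity and absolute continuity) can be sanity-checked on a toy finite space; the values below are assumptions for illustration, not part of the proof.

```python
# Sketch: P_G(A) = integral of X over A is additive and kills P-null sets.
P = {0: 0.5, 1: 0.5, 2: 0.0, 3: 0.0}   # points 2 and 3 are P-null
X = {0: 2.0, 1: 3.0, 2: 7.0, 3: 1.0}   # X >= 0, as in Case 1

def P_G(A):
    return sum(X[w] * P[w] for w in A)

# Additivity on disjoint sets, as required of a measure on G.
assert P_G({0} | {1}) == P_G({0}) + P_G({1})

# Absolute continuity P_G << P: P(A) = 0 forces P_G(A) = 0.
assert sum(P[w] for w in {2, 3}) == 0 and P_G({2, 3}) == 0
```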

If two sigma-finite measures $\nu$, $\mu$ on a measurable space $( \Omega , \mathcal{F} )$ satisfy $\nu \ll \mu$, then according to the Radon-Nikodym Theorem there uniquely exists, $\mu$-almost everywhere, an $\mathcal{F}$-measurable function $h \ge 0$ such that
$$ \nu (A) = \int_{A} h \, d \mu $$
for all $A \in \mathcal{F}$.
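
On a finite space, the function $h$ promised by the theorem is simply the pointwise ratio of the point masses wherever $\mu$ is positive; here is a toy sketch with assumed values.

```python
from itertools import chain, combinations

# Sketch: on a finite space with nu << mu, h = dnu/dmu is a pointwise ratio.
omega = list(range(4))
mu = {0: 0.4, 1: 0.3, 2: 0.3, 3: 0.0}
nu = {0: 0.2, 1: 0.6, 2: 0.2, 3: 0.0}  # nu vanishes wherever mu does

# Radon-Nikodym derivative; its value on the mu-null set {3} is arbitrary.
h = {w: (nu[w] / mu[w] if mu[w] > 0 else 0.0) for w in omega}

# Verify nu(A) = integral of h over A with respect to mu, for every A.
for A in chain.from_iterable(combinations(omega, r) for r in range(5)):
    assert abs(sum(nu[w] for w in A) - sum(h[w] * mu[w] for w in A)) < 1e-12
```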

Following the theorem with $\nu = P_{\mathcal{G}}$ and $\mu = P$ on $( \Omega , \mathcal{G} )$, there uniquely exists a $\mathcal{G}$-measurable $Y \ge 0$ satisfying
$$ P_{\mathcal{G}} (A) = \int_{A} Y \, dP $$
for all $A \in \mathcal{G}$. By the initial definition of $P_{\mathcal{G}}$, this $Y$ is exactly the conditional expectation of $X$ given $\mathcal{G}$.


Case 2. General case

You can decompose $X$ into two nonnegative parts $X = X^{+} - X^{-}$, where $X^{+} := \max (X, 0) \ge 0$ and $X^{-} := \max (-X, 0) \ge 0$, apply the method of Case 1. to each part, and set $E(X | \mathcal{G}) := E(X^{+} | \mathcal{G}) - E(X^{-} | \mathcal{G})$.
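
A final sketch of this decomposition step, reusing the partition-averaging idea from the first snippet (hypothetical values): the two nonnegative parts are handled by Case 1. separately and the results subtracted.

```python
# Sketch of Case 2: split X into nonnegative parts and subtract the results.
omega = list(range(4))
P = {w: 0.25 for w in omega}
partition = [{0, 1}, {2, 3}]
X = {0: 3.0, 1: -1.0, 2: -4.0, 3: 2.0}

X_plus = {w: max(X[w], 0.0) for w in omega}    # X^+ >= 0
X_minus = {w: max(-X[w], 0.0) for w in omega}  # X^- >= 0, X = X^+ - X^-

def cond_exp(f):
    """E(f|G) for a partition-generated G: P-weighted cell averages."""
    out = {}
    for cell in partition:
        avg = sum(f[w] * P[w] for w in cell) / sum(P[w] for w in cell)
        out.update({w: avg for w in cell})
    return out

# E(X|G) = E(X^+|G) - E(X^-|G); it agrees with averaging X directly.
E_X = {w: cond_exp(X_plus)[w] - cond_exp(X_minus)[w] for w in omega}
assert E_X == cond_exp(X)
print(E_X)  # {0: 1.0, 1: 1.0, 2: -1.0, 3: -1.0}
```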