Smoothing Properties of Conditional Expectation

Theorem

Given a probability space $( \Omega , \mathcal{F} , P )$ and sub-sigma fields $\mathcal{G}, \mathcal{G} ' \subset \mathcal{F}$, assume $X$ and $Y$ are random variables.

  • [1]: If $X$ is $\mathcal{G}$-measurable, then $E(XY | \mathcal{G}) = X E (Y | \mathcal{G}) \text{ a.s.}$
  • [2]: If $\mathcal{G} ' \subset \mathcal{G}$, then $E (X | \mathcal{G} ') = E \left( E ( X | \mathcal{G}) | \mathcal{G} ' \right) = E \left( E ( X | \mathcal{G} ') | \mathcal{G} \right)$

  • $\mathcal{G}$ being a sub-sigma field of $\mathcal{F}$ means both are sigma fields of $\Omega$, but $\mathcal{G} \subset \mathcal{F}$.
  • $X$ being a $\mathcal{G}$-measurable function means that for every Borel set $B \in \mathcal{B}(\mathbb{R})$, we have $X^{-1} (B) \in \mathcal{G}$.

Description

When dealing with conditional expectations, a sigma field can be viewed as ‘information’ about a random variable. In particular, the smoothing properties are best understood intuitively rather than by fixating on the mathematical proof:

  • [1]: The fact that $X$, which is not a scalar, can pass through $E$ is not only surprising but also makes the identity convenient to use everywhere. That $X$ is $\mathcal{G}$-measurable means the sigma field $\mathcal{G}$ has all the information about $X$. Since it already knows $X$ itself, there is no need to compute an expectation; $X$ simply passes through $E$. Mathematically, $X$ is not a scalar, but once $\mathcal{G}$ is given, $X$ becomes a determined value, in effect a scalar.
  • [2]: That $\mathcal{G}$ is a sub-sigma field of $\mathcal{F}$ can be understood as meaning $\mathcal{G}$ has less information than $\mathcal{F}$. Looking at the formulas, one sees that regardless of the order in which the expectations are taken, the result is that of the sigma field with less information. To interpret this intuitively (a numerical sketch follows this list):
    • $E (X | \mathcal{G} ') = E \left( E ( X | \mathcal{G}) | \mathcal{G} ' \right)$: Even if $\mathcal{G}$ provides a lot of information about $X$, the lack of information in $\mathcal{G} '$ means the result is an expected value at the level of $\mathcal{G} '$.
    • $E (X | \mathcal{G} ') = E \left( E ( X | \mathcal{G} ') | \mathcal{G} \right)$: $\mathcal{G} ' \subset \mathcal{G}$ means that whatever information $\mathcal{G} '$ has about $X$ is already known to $\mathcal{G}$, so the expectation obtained is again at the level of $\mathcal{G} '$.
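
Both properties can be checked concretely on a finite probability space, where a sigma field is just a partition of $\Omega$ and conditional expectation is the block-wise weighted average. Below is a minimal sketch of that check; the helper `cond_exp` and the particular partitions are illustrative choices, not anything fixed by the theorem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Finite probability space: Omega = {0, ..., 7} with uniform measure P.
P = np.full(8, 1 / 8)

# Sigma fields as partitions of Omega; every block of G' is a union of blocks of G,
# so G' is coarser (less information) and G is finer (more information).
G_prime = [[0, 1, 2, 3], [4, 5, 6, 7]]
G = [[0, 1], [2, 3], [4, 5], [6, 7]]

def cond_exp(Z, partition):
    """E(Z | partition): on each block, replace Z by its P-weighted block average."""
    out = np.empty_like(Z, dtype=float)
    for block in partition:
        out[block] = np.average(Z[block], weights=P[block])
    return out

# X is G-measurable (constant on every block of G); Y and W are arbitrary.
X = np.array([2.0, 2.0, -1.0, -1.0, 0.5, 0.5, 3.0, 3.0])
Y = rng.normal(size=8)
W = rng.normal(size=8)

# [1]: since G already "knows" X, it factors out of the conditional expectation.
assert np.allclose(cond_exp(X * Y, G), X * cond_exp(Y, G))

# [2]: in either order, iterated conditioning lands at the coarser level G'.
assert np.allclose(cond_exp(W, G_prime), cond_exp(cond_exp(W, G), G_prime))
assert np.allclose(cond_exp(W, G_prime), cond_exp(cond_exp(W, G_prime), G))
```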

Proof

[1]

Strategy: start with indicator functions, generalize to simple functions, extend to non-negative random variables by monotone convergence, and finally handle arbitrary random variables by splitting them into positive and negative parts.


Part 1. $M \in \mathcal{G}$, $X = \mathbb{1}_{M}$

For all $A \in \mathcal{G}$,
$$\begin{align*} \int_{A} E ( XY | \mathcal{G} ) dP =& \int_{A} XY dP \\ =& \int_{A} \mathbb{1}_{M} Y dP \\ =& \int_{A \cap M} Y dP \\ =& \int_{A \cap M} E(Y | \mathcal{G}) dP \\ =& \int_{A} \mathbb{1}_{M} E(Y | \mathcal{G}) dP \\ =& \int_{A} X E(Y | \mathcal{G}) dP \end{align*}$$
where the fourth equality uses the defining property of $E(Y | \mathcal{G})$, valid since $A \cap M \in \mathcal{G}$. Since $\displaystyle \forall A \in \mathcal{F}, \int_{A} f dm = 0 \iff f = 0 \text{ a.e.}$,
$$E ( XY | \mathcal{G} ) = X E(Y | \mathcal{G}) \text{ a.s.}$$


Part 2. $M_{i} \in \mathcal{G}$, $\displaystyle X = \sum_{i=1}^{n} a_{i} \mathbb{1}_{M_{i}}$

By linearity of conditional expectation,
$$\begin{align*} E(XY | \mathcal{G} ) =& E \left( \sum_{i=1}^{n} a_{i} \mathbb{1}_{M_{i}} Y | \mathcal{G} \right) \\ =& \sum_{i=1}^{n} a_{i} E( \mathbb{1}_{M_{i}} Y | \mathcal{G} ) \end{align*}$$
Here, by Part 1, $E( \mathbb{1}_{M_{i}} Y | \mathcal{G} ) = \mathbb{1}_{M_{i}} E( Y | \mathcal{G} )$, so
$$\begin{align*} E(XY | \mathcal{G} ) =& \sum_{i=1}^{n} a_{i} E( \mathbb{1}_{M_{i}} Y | \mathcal{G} ) \\ =& \sum_{i=1}^{n} a_{i} \mathbb{1}_{M_{i}} E( Y | \mathcal{G} ) \\ =& X E(Y | \mathcal{G}) \text{ a.s.} \end{align*}$$


Part 3. $X \ge 0$, $Y \ge 0$

Define a sequence of simple functions $\left\{ X_{n} \right\}_{n \in \mathbb{N}}$ satisfying $X_{n} \nearrow X$ as follows.
$$X_{n} := \sum_{k=1}^{n 2^{n}} {{k-1} \over {2^{n}}} \mathbb{1}_{ \left( {{k-1} \over {2^{n}}} \le X < {{k} \over {2^{n}}} \right)}$$
Then each $X_{n}$ is also $\mathcal{G}$-measurable, since $X$ is, and $X_{n} Y \nearrow XY$. Since $X_{n}$ can pass through $E$ by Part 2, the conditional monotone convergence theorem gives (a short code illustration follows)
$$\begin{align*} E(XY | \mathcal{G} ) =& E \left( \lim_{n \to \infty} X_{n} Y | \mathcal{G} \right) \\ =& \lim_{n \to \infty} E \left( X_{n} Y | \mathcal{G} \right) \\ =& \lim_{n \to \infty} X_{n} E \left( Y | \mathcal{G} \right) \\ =& X E(Y | \mathcal{G}) \text{ a.s.} \end{align*}$$
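
The dyadic construction above is simple to see in code. Below is a small sketch (the function name `dyadic_approx` is mine): on $\left\{ X < n \right\}$ the sum collapses to $\lfloor 2^{n} X \rfloor / 2^{n}$, and no indicator fires where $X \ge n$, so the value there stays $0$ until $n$ grows past $X$.

```python
import numpy as np

def dyadic_approx(X, n):
    """X_n from Part 3: equals (k-1)/2^n on {(k-1)/2^n <= X < k/2^n}, k = 1, ..., n*2^n."""
    return np.where(X < n, np.floor(2.0**n * X) / 2.0**n, 0.0)

X = np.array([0.3, 1.7, 2.5])
for n in (1, 2, 4, 8):
    print(n, dyadic_approx(X, n))
# n=1: [0.       1.5      0.  ]   (2.5 >= 1, so its value is still 0)
# n=2: [0.25     1.5      0.  ]
# n=4: [0.25     1.6875   2.5 ]   (each coordinate climbs monotonically toward X)
# n=8: [0.296875 1.69921875 2.5]
```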


Part 4. $X \ge 0$

Write $Y = Y^{+} - Y^{-}$; then, according to Part 3,
$$\begin{align*} E(XY | \mathcal{G} ) =& E(XY^{+} | \mathcal{G} ) - E(XY^{-} | \mathcal{G} ) \\ =& XE(Y^{+} | \mathcal{G} ) - XE(Y^{-} | \mathcal{G} ) \\ =& X E(Y | \mathcal{G}) \text{ a.s.} \end{align*}$$

Part 5. General case

Write $X = X^{+} - X^{-}$; then, according to Part 4,
$$\begin{align*} E(XY | \mathcal{G} ) =& E(X^{+}Y | \mathcal{G} ) - E(X^{-}Y | \mathcal{G} ) \\ =& X^{+}E(Y | \mathcal{G} ) - X^{-}E(Y | \mathcal{G} ) \\ =& X E(Y | \mathcal{G}) \text{ a.s.} \end{align*}$$
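
With the general case established, [1] can also be sanity-checked by simulation with signed $X$ and $Y$, mirroring Parts 4 and 5. A sketch under assumed toy distributions (the four-level $Z$ generating $\mathcal{G}$ and the values of $f$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000

# G is generated by a discrete Z; X = f(Z) is G-measurable and takes both signs.
Z = rng.integers(0, 4, size=N)
f = np.array([2.0, -1.0, 0.5, -3.0])
X = f[Z]
Y = rng.normal(loc=Z - 1.5, size=N)   # Y also takes both signs

# On {Z = z}, X is the constant f(z), so E(XY | Z=z) = f(z) E(Y | Z=z):
for z in range(4):
    mask = Z == z
    lhs = (X[mask] * Y[mask]).mean()   # sample version of E(XY | Z=z)
    rhs = f[z] * Y[mask].mean()        # sample version of X E(Y | Z=z)
    assert np.isclose(lhs, rhs)        # exact up to floating-point roundoff
```

This is the content of the intuition above: conditioned on $\mathcal{G}$, the $\mathcal{G}$-measurable $X$ behaves as a constant and factors out.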

[2]

Part 1. $E (X | \mathcal{G} ') = E \left( E ( X | \mathcal{G}) | \mathcal{G} ' \right)$

For all $A \in \mathcal{G} '$,
$$\begin{align*} \int_{A} E (X | \mathcal{G} ') dP =& \int_{A} X dP \\ =& \int_{A} E(X | \mathcal{G}) dP \\ =& \int_{A} E \left( E(X | \mathcal{G}) | \mathcal{G} ' \right) dP \end{align*}$$
where the second equality holds because $A \in \mathcal{G} ' \subset \mathcal{G}$ and the third because $A \in \mathcal{G} '$. Since $\displaystyle \forall A \in \mathcal{F}, \int_{A} f dm = 0 \iff f = 0 \text{ a.e.}$,
$$E (X | \mathcal{G} ') = E \left( E ( X | \mathcal{G}) | \mathcal{G} ' \right)$$


Part 2. $E (X | \mathcal{G} ') = E \left( E ( X | \mathcal{G} ') | \mathcal{G} \right)$

Since $\mathcal{G} ' \subset \mathcal{G}$, the random variable $E (X | \mathcal{G} ')$ is $\mathcal{G}$-measurable, and $E(1 | \mathcal{G}) = 1$. Hence, by [1],
$$\begin{align*} E (X | \mathcal{G} ') =& E (X | \mathcal{G} ') \cdot E (1 | \mathcal{G}) \\ =& E \left( E ( X | \mathcal{G} ') \cdot 1 | \mathcal{G} \right) \end{align*}$$