Proof of the Continuous Mapping Theorem

Theorem 1

The following is a measure-theoretic description of the continuous mapping theorem.

For metric spaces $(S, d)$ and $(S', d')$, let $g : S \to S'$ be continuous on $C_{g} \subset S$. For a random element $X$ in $S$ and a sequence of random elements $\left\{ X_{n} \right\}_{n \in \mathbb{N}}$, the following implications hold whenever $P \left( X \in C_{g} \right) = 1$:
$$
X_{n} \overset{D}{\to} X \implies g \left( X_{n} \right) \overset{D}{\to} g(X)
$$
$$
X_{n} \overset{P}{\to} X \implies g \left( X_{n} \right) \overset{P}{\to} g(X)
$$
$$
X_{n} \overset{\text{a.s.}}{\to} X \implies g \left( X_{n} \right) \overset{\text{a.s.}}{\to} g(X)
$$
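As an illustrative sanity check (not part of the proof), the distributional case can be observed numerically. The sketch below makes the arbitrary choices $g(x) = x^{2}$ and $X_{n} =$ the standardized mean of $n$ Uniform$(0,1)$ draws, so that $X_{n} \overset{D}{\to} N(0,1)$ by the central limit theorem and $g(X_{n})$ should approach a $\chi^{2}(1)$ distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

def standardized_mean(n, size):
    # X_n: standardized mean of n Uniform(0, 1) draws;
    # by the CLT, X_n -> N(0, 1) in distribution as n grows.
    u = rng.uniform(size=(size, n))
    return (u.mean(axis=1) - 0.5) / (np.sqrt(1 / 12) / np.sqrt(n))

g = np.square  # continuous on all of R, so P(X in C_g) = 1 holds trivially

samples = g(standardized_mean(n=500, size=20_000))
# If g(X_n) -> g(X) = X^2 ~ chi-square(1) in distribution, the empirical
# mean and variance should be close to 1 and 2, the chi-square(1) moments.
print(samples.mean(), samples.var())
```

The printed moments are only a crude check of convergence in distribution, but they are cheap to compute and match the $\chi^{2}(1)$ limit here.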


Explanation

The fact that convergence is preserved under continuous functions is a phenomenon seen throughout mathematics, however convergence happens to be defined; the name Continuous Mapping Theorem, though, is used predominantly in probability theory. A famous corollary is Slutsky's theorem, which in undergraduate mathematical statistics is usually introduced only as a statement, without proof.

Slutsky's Theorem²: For constants $a, b$ and random variables $A_{n}, B_{n}, X_{n}, X$, if $A_{n} \overset{P}{\to} a$, $B_{n} \overset{P}{\to} b$, and $X_{n} \overset{D}{\to} X$, then
$$
A_{n} + B_{n} X_{n} \overset{D}{\to} a + b X
$$

Although the fact itself is freely used in basic courses such as undergraduate mathematical statistics, it is hard to find a proof that is accessible without background knowledge, so a proof involving measure theory is given here. If you are an undergraduate who has not yet studied real analysis, it is normal not to follow the proof, and there is no need to be discouraged. It is enough to note that more advanced mathematics is required and, for the time being, to use the result as a fact.
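Slutsky's theorem itself is easy to check numerically. The sketch below is illustrative only (the Exponential$(1)$ choice and the sample sizes are arbitrary): taking $A_{n}$ to be the sample mean, $B_{n}$ the sample standard deviation, and $X_{n}$ the standardized mean, we have $a = b = 1$ and expect $A_{n} + B_{n} X_{n} \overset{D}{\to} 1 + 1 \cdot N(0,1) = N(1, 1)$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 500, 20_000

# Exponential(1) draws: mean 1 and variance 1 (an arbitrary illustrative choice).
x = rng.exponential(scale=1.0, size=(reps, n))

A_n = x.mean(axis=1)             # -> a = 1 in probability (law of large numbers)
B_n = x.std(axis=1, ddof=1)      # -> b = 1 in probability
X_n = np.sqrt(n) * (A_n - 1.0)   # -> N(0, 1) in distribution (CLT)

# Slutsky: A_n + B_n * X_n -> a + b * X = 1 + N(0, 1), i.e. N(1, 1).
z = A_n + B_n * X_n
print(z.mean(), z.var())
```

The empirical mean and variance of $z$ land near $1$ and $1$, consistent with the $N(1,1)$ limit (up to finite-$n$ bias and Monte Carlo noise).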

Proof

Convergence in Distribution

It can be obtained as a corollary to the portmanteau theorem.

Convergence in Probability

Fix $\varepsilon > 0$ and, for each $\delta > 0$, define the following set $C_{g}^{\delta} \subset C_{g}$.
$$
C_{g}^{\delta} := \left\{ x \in C_{g} \mid \exists y : y \in B \left( x ; \delta \right) \land g(y) \notin B' \left( g(x) ; \varepsilon \right) \right\}
$$
This set collects the points $x$ at which $g$ is continuous but near which one can still pick some $y$ within radius $\delta$ of $x$ whose image $g(y)$ lies at least $\varepsilon$ away from $g(x)$. Naturally, as $\delta > 0$ shrinks it becomes harder and harder for such a $y$ to exist within the radius, and since $g$ is continuous at every point of $C_{g}$ we get $\displaystyle \lim_{\delta \to 0} C_{g}^{\delta} = \emptyset$. Now suppose, for the sake of argument, that $d' \left( g(X), g \left( X_{n} \right) \right) \ge \varepsilon$. Then at least one of the following three must hold:

  • (1): $d \left( X, X_{n} \right) \ge \delta$: $X$ and $X_{n}$ are simply far apart, so regardless of whether $g$ is continuous, $g(X)$ and $g \left( X_{n} \right)$ can be far apart as well.
  • (2): $X \in C_{g}^{\delta}$: although $g$ is continuous at $X$, there is a point within radius $\delta$ of $X$ (here, $X_{n}$) whose image is far from $g(X)$.
  • (3): $X \notin C_{g}$: $g$ is not continuous at $X$, so $g(X)$ and $g \left( X_{n} \right)$ can be far apart.

Expressing this in terms of probabilities,
$$
P \left( d' \left( g \left( X_{n} \right), g(X) \right) \ge \varepsilon \right) \le P \left( d \left( X_{n}, X \right) \ge \delta \right) + P \left( X \in C_{g}^{\delta} \right) + P \left( X \notin C_{g} \right)
$$
and the terms on the right-hand side behave as follows:

  • (1): since $X_{n} \overset{P}{\to} X$ by assumption, for all $\delta > 0$
$$
\lim_{n \to \infty} P \left( d \left( X_{n}, X \right) \ge \delta \right) = 0
$$
  • (2): since $\displaystyle \lim_{\delta \to 0} C_{g}^{\delta} = \emptyset$ as noted above,
$$
\lim_{\delta \to 0} P \left( X \in C_{g}^{\delta} \right) = 0
$$
  • (3): since $P \left( X \in C_{g} \right) = 1$ by assumption,
$$
P \left( X \notin C_{g} \right) = P \left( X \in C_{g}^{c} \right) = 0
$$

Letting $n \to \infty$ and then $\delta \to 0$, we conclude that for every $\varepsilon > 0$
$$
\lim_{n \to \infty} P \left( d' \left( g \left( X_{n} \right), g(X) \right) \ge \varepsilon \right) = 0
$$
that is, $g \left( X_{n} \right) \overset{P}{\to} g(X)$.
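This shrinking probability can also be observed numerically. A minimal sketch, with the purely illustrative choices $X \sim N(0,1)$, $X_{n} = X + Z_{n} / \sqrt{n}$ for independent standard normal noise $Z_{n}$, $g = \tanh$ (continuous everywhere, so $P(X \in C_{g}) = 1$), and $\varepsilon = 0.1$:

```python
import numpy as np

rng = np.random.default_rng(2)
reps, eps = 200_000, 0.1

X = rng.normal(size=reps)
g = np.tanh  # continuous on all of R, so C_g = R and P(X in C_g) = 1

fracs = []
for n in [1, 10, 100, 1000]:
    X_n = X + rng.normal(size=reps) / np.sqrt(n)  # X_n -> X in probability
    # Monte Carlo estimate of P( d'(g(X_n), g(X)) >= eps )
    frac = np.mean(np.abs(g(X_n) - g(X)) >= eps)
    fracs.append(frac)
    print(n, frac)
```

The estimated probabilities drop toward $0$ as $n$ grows, exactly as the bound above predicts.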

Almost Sure Convergence

For any point $\omega$ with $X(\omega) \in C_{g}$, i.e., at which $g$ is continuous,
$$
\lim_{n \to \infty} X_{n} (\omega) = X (\omega) \implies \lim_{n \to \infty} g \left( X_{n} (\omega) \right) = g \left( X (\omega) \right)
$$
Viewing these as events, this yields the inclusion
$$
\left[ \lim_{n \to \infty} X_{n} = X , X \in C_{g} \right] \subset \left[ \lim_{n \to \infty} g \left( X_{n} \right) = g \left( X \right) \right]
$$
Since $X_{n} \overset{\text{a.s.}}{\to} X$ and $P \left( X \in C_{g} \right) = 1$ by assumption, we have $\displaystyle P \left( \lim_{n \to \infty} X_{n} = X , X \in C_{g} \right) = 1$, and therefore
$$
\begin{align*}
P \left[ \lim_{n \to \infty} g \left( X_{n} \right) = g \left( X \right) \right] \ge & P \left[ \lim_{n \to \infty} g \left( X_{n} \right) = g \left( X \right) , X \in C_{g} \right] \\
\ge & P \left[ \lim_{n \to \infty} X_{n} = X , X \in C_{g} \right] \\
= & 1
\end{align*}
$$
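The pointwise implication above is nothing more than ordinary continuity along a single sample path. A tiny sketch, with the arbitrary illustrative choices $X_{n}(\omega) = \omega + 1/n$ and $g = \exp$:

```python
import math

# One fixed sample path: X_n(omega) = omega + 1/n converges to X(omega) = omega,
# so continuity of g alone forces g(X_n(omega)) -> g(X(omega)) along this path.
g = math.exp
omega = 0.3
gaps = [abs(g(omega + 1 / n) - g(omega)) for n in (1, 10, 100, 1000)]
print(gaps)  # the gaps shrink toward 0
```

Almost sure convergence then just says this pathwise argument applies to a set of $\omega$ of probability one.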


  1. https://en.wikipedia.org/wiki/Continuous_mapping_theorem ↩︎

  2. Hogg et al. (2013). Introduction to Mathematical Statistics (7th Edition): p306. ↩︎