Convergence of Distributions Defined by Measure Theory

Definition

Let $S$ be a metric space with Borel sigma field $\mathcal{S} := \mathcal{B}(S)$, so that $(S, \mathcal{S})$ is a measurable space.

Let a random variable $X$ and a stochastic process $\left\{ X_{n} \right\}_{n \in \mathbb{N}}$ be defined on a probability space $(\Omega, \mathcal{F}, P)$. If the following holds as $n \to \infty$ for all $f \in C_{b}(S)$, then $X_{n}$ is said to converge in distribution to $X$, denoted $X_{n} \overset{D}{\to} X$. $$ \int_{\Omega} f(X_{n}) dP \to \int_{\Omega} f(X) dP $$


  • $C_{b}(S)$ denotes the set of bounded continuous functions defined on $S$. $$ C_{b}(S) := \left\{ f:S \to \mathbb{R} \mid f\text{ is bounded and continuous} \right\} $$
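As a concrete instance of the definition, the following minimal sketch takes $X_{n}$ uniform on the grid $\left\{ 1/n, 2/n, \dots, 1 \right\}$ and $X$ uniform on $[0,1]$. Both choices are illustrative assumptions, and the function name `expect_f_Xn` is hypothetical. In this case $\int_{\Omega} f(X_{n}) dP$ is a Riemann sum of $f$, so it should approach $\int_{0}^{1} f(x) dx$ for the bounded continuous test function $f = \cos$. Checking a single $f$ only illustrates the definition; it does not verify it for all of $C_{b}(S)$.

```python
import math

def expect_f_Xn(f, n):
    """E[f(X_n)] for X_n uniform on the grid {1/n, 2/n, ..., 1}."""
    return sum(f(k / n) for k in range(1, n + 1)) / n

# E[f(X)] for X uniform on [0, 1] and f = cos is the integral of cos over [0, 1].
limit = math.sin(1.0)

for n in (10, 100, 1000):
    gap = abs(expect_f_Xn(math.cos, n) - limit)
    print(n, gap)  # the gap shrinks as n grows
```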

Theorem

  • [1]: If the sequence $\left\{ P_{n} \right\}_{n \in \mathbb{N}}$ of probability measures defined on $(S,\mathcal{S})$ satisfies $$ P_{n} \left( X^{-1} (B) \right) := P \left( X_{n}^{-1} (B) \right) $$ for all Borel sets $B \in \mathcal{B} \left( \mathbb{R} \right)$, then the following holds. $$ X_{n} \overset{D}{\to} X \iff P_{n} \overset{W}{\to} P $$
  • [2]: $X_{n} \overset{D}{\to} X$ is equivalent to the statement that every subsequence $\left\{ X_{n '} \right\} \subset \left\{ X_{n} \right\}$ has a further subsequence $\left\{ X_{n ''} \right\} \subset \left\{ X_{n '} \right\}$ satisfying $X_{n ''} \overset{D}{\to} X$. In formulas: $$ X_{n} \overset{D}{\to} X \iff \forall \left\{ X_{n '} \right\} \subset \left\{ X_{n} \right\}, \exists \left\{ X_{n ''} \right\} \subset \left\{ X_{n '} \right\} : X_{n ''} \overset{D}{\to} X $$
  • [3] Continuous Mapping Theorem: For a measurable function $h : (S , \mathcal{S}) \to (S ' , \mathcal{S} ')$, let $C_{h} := \left\{ x \in S : h \text{ is continuous at } x \right\}$ be the set of points at which $h$ is continuous. If $X_{n} \overset{D}{\to} X$ and $P(X \in C_{h}) = 1$, then $h(X_{n}) \overset{D}{\to} h(X)$. In formulas: $$ X_{n} \overset{D}{\to} X \land P(X \in C_{h}) = 1 \implies h(X_{n}) \overset{D}{\to} h(X) $$

Description

  • [1]: A $P_{n}$ defined as in the theorem is called an induced probability measure. Note the distinction: $X_{n} \overset{D}{\to} X$ is a convergence of random 'variables', whereas $P_{n} \overset{W}{\to} P$ is a convergence of probability 'measures'.
  • [2]: At first glance, this theorem might seem forced, but it becomes an important property when considered alongside the concept of relative compactness.
  • [3]: The continuous mapping theorem in fact also holds for convergence in probability and for almost sure convergence. Since $h$ is a function and random variables are functions as well, it is natural to consider the composite function $h \circ X$. It is worth checking that the following formula makes sense for $A \in \mathcal{S} '$ and working through why it holds. $$ P \left( h(X)^{-1} (A) \right) = P \left( X \in h^{-1}(A) \right) $$ The notation $\overset{D}{=}$ is sometimes used to state that distributions are the same for all $f \in C_{b}(S)$. It is defined as follows for all $A \in \mathcal{S} '$ and continuous functions $h:S \to S'$. $$ h(X) \overset{D}{=} h(Y) \overset{\text{def}}{\iff} P \left( h(X)^{-1}(A) \right) = P \left( h(Y)^{-1}(A) \right) $$ Carried over to convergence, this reads as follows. $$ h\left( X_{n} \right) \overset{D}{\to} h(X) \iff P \left( h\left( X_{n} \right)^{-1}(A) \right) \to P \left( h(X)^{-1}(A) \right) $$
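To see why the condition $P(X \in C_{h}) = 1$ in [3] cannot be dropped, here is a small sketch under illustrative assumptions: $X_{n} = 1/n$ and $X = 0$ are constant random variables, so $X_{n} \overset{D}{\to} X$, and the two maps `h_cont` and `h_jump` are hypothetical examples. For constants, $\int_{\Omega} f(h(X_{n})) dP$ is simply $f(h(1/n))$.

```python
import math

def h_cont(x):
    return x * x                      # continuous everywhere

def h_jump(x):
    return 1.0 if x > 0 else 0.0      # discontinuous only at 0

def expect(f, h, point):
    """E[f(h(Y))] when Y is the constant random variable equal to `point`."""
    return f(h(point))

f = math.atan  # a bounded continuous test function

for h in (h_cont, h_jump):
    target = expect(f, h, 0.0)        # E[f(h(X))] with X = 0
    gaps = [abs(expect(f, h, 1.0 / n) - target) for n in (10, 100, 1000)]
    print(h.__name__, gaps)
```

For `h_cont` the gaps shrink toward zero. For `h_jump` the limit $X = 0$ sits exactly at the discontinuity, so $P(X \in C_{h}) = 0$ and the gap stays at $\arctan(1) = \pi/4$.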

Proof

[1]

For all $f \in C_{b}(S)$, $$ \begin{align*} P_{n} \overset{W}{\to} P \iff & \int_{S} f dP_{n} \to \int_{S} f dP \\ \iff & \int_{\Omega} f(X_{n}) dP \to \int_{\Omega} f(X) dP \\ \iff & X_{n} \overset{D}{\to} X \end{align*} $$
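The middle step above rests on the change of variables between an integral over $\Omega$ and an integral over $S$ against the induced measure. A minimal sketch on a finite sample space, assuming a fair die for $\Omega$ and the hypothetical choices $X(\omega) = \omega \bmod 3$ and $f(s) = s^{2}$, checks that both sides agree exactly:

```python
from fractions import Fraction

omega = range(1, 7)                      # outcomes of a fair die
P = {w: Fraction(1, 6) for w in omega}   # probability measure on Omega

def X(w):
    return w % 3                         # random variable with values in S = {0, 1, 2}

def f(s):
    return s * s                         # a test function on S

# Induced measure: P_X({s}) = P(X^{-1}({s}))
P_X = {}
for w in omega:
    P_X[X(w)] = P_X.get(X(w), Fraction(0)) + P[w]

lhs = sum(f(X(w)) * P[w] for w in omega)       # integral of f(X) over Omega w.r.t. P
rhs = sum(f(s) * p for s, p in P_X.items())    # integral of f over S w.r.t. P_X
print(lhs, rhs)
```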

[2]

Suppose that $X_{n} \overset{D}{\to} X$ does not hold, that is, there exists $f \in C_{b}(S)$ for which $$ \int_{\Omega} f(X_{n}) dP \to \int_{\Omega} f(X) dP $$ fails. Then there exist $\varepsilon > 0$ and a subsequence of indices $\left\{ n' \right\}$ satisfying $$ \left| \int_{\Omega} f(X_{n '}) dP - \int_{\Omega} f(X) dP \right| > \varepsilon $$ for every $n'$. However, this contradicts the assumption that there always exists a further subsequence of indices $\left\{ n'' \right\} \subset \left\{ n' \right\}$ satisfying $$ \int_{\Omega} f(X_{n ''}) dP \to \int_{\Omega} f(X) dP $$

The direction $(\implies)$ is trivially true: every subsequence of a convergent sequence converges to the same limit, so one can take $\left\{ n'' \right\} = \left\{ n' \right\}$.
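The criterion in [2] can also be read in the negative. As a sketch, assume the deterministic sequence $X_{n} = (-1)^{n}$ (an illustrative choice, with the hypothetical helper `expect_f`). The even and odd subsequences converge in distribution to different limits, so no single $X$ serves every subsequence and $X_{n}$ does not converge in distribution.

```python
import math

def expect_f(n, f=math.atan):
    """E[f(X_n)] for the deterministic sequence X_n = (-1)^n."""
    return f((-1) ** n)

even = [expect_f(n) for n in range(2, 12, 2)]   # subsequence n' = 2, 4, ...
odd = [expect_f(n) for n in range(1, 11, 2)]    # subsequence n' = 1, 3, ...

print(even)  # constant, equal to f(1)
print(odd)   # constant, equal to f(-1)
```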

[3]

Denote the probability measure induced by $X$ by $P_{X}(A) := P \left( X^{-1}(A) \right) = P(X \in A)$. For every closed set $B$ in $S'$, the following inclusion holds. $$ \overline{h^{-1}(B)} \subset h^{-1}(B) \cup C_{h}^{c} $$ Indeed, take an arbitrary $x \in \overline{h^{-1}(B)}$. If $h$ is continuous at $x$, choose a sequence $x_{k} \in h^{-1}(B)$ with $x_{k} \to x$; then $h(x_{k}) \to h(x)$, and since $B$ is closed, $h(x) \in B$, that is, $x \in h^{-1}(B)$. Otherwise $x \in C_{h}^{c}$. Note that the closure $\overline{h^{-1}(B)}$ is a closed set in $S$, and $$ \begin{align*} & \limsup_{n \to \infty} P \left( h ( X_{n} ) \in B \right) \\ =& \limsup_{n \to \infty} P \left( X_{n} \in h^{-1} (B) \right) \\ =& \limsup_{n \to \infty} P_{X} \left( h ( X_{n} )^{-1}(B) \right) \\ =& \limsup_{n \to \infty} P_{X} \left( \left[ X_{n}^{-1} \circ h^{-1} \right] (B) \right) \\ =& \limsup_{n \to \infty} P_{X} \left( X_{n}^{-1} \left( h^{-1} (B) \right) \right) \\ =& \limsup_{n \to \infty} P_{n} \left( h^{-1} (B)\right) \\ \le & \limsup_{n \to \infty} P_{n} \left( \overline{h^{-1} (B)} \right) \end{align*} $$

Portmanteau Theorem: Let the space $S$ be both a metric space $( S , \rho)$ and a measurable space $(S,\mathcal{B}(S))$. The following are all equivalent.

  • (1): $P_{n} \overset{W}{\to} P$
  • (2): For all bounded, uniformly continuous functions $f$, $\displaystyle \int_{S} f dP_{n} \to \int_{S}f d P$
  • (3): For all closed sets $F$, $\displaystyle \limsup_{n\to\infty} P_{n}(F) \le P(F)$
  • (4): For all open sets $G$, $\displaystyle P(G) \le \liminf_{n\to\infty} P_{n}(G)$
  • (5): For every $A$ with $P(\partial A) = 0$, $\displaystyle \lim_{n\to\infty} P_{n}(A) = P(A)$
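Conditions (3) and (4) of the Portmanteau theorem can be strict inequalities. As a sketch, assume $P_{n}$ is the point mass at $1/n$ and $P$ the point mass at $0$, so that $P_{n} \overset{W}{\to} P$. The sets $F = \left\{ 0 \right\}$ and $G = (0,2)$ are illustrative choices, sets are passed as indicator functions, and a finite tail of indices stands in for the $\limsup$ and $\liminf$.

```python
def P_n(event, n):
    """P_n(A) for the point mass at 1/n; `event` is the indicator of A."""
    return 1.0 if event(1.0 / n) else 0.0

def P(event):
    """P(A) for the point mass at 0."""
    return 1.0 if event(0.0) else 0.0

closed_F = lambda x: x == 0.0        # F = {0}, a closed set
open_G = lambda x: 0.0 < x < 2.0     # G = (0, 2), an open set

tail = range(501, 1001)              # large n, a finite stand-in for the limit
limsup_F = max(P_n(closed_F, n) for n in tail)
liminf_G = min(P_n(open_G, n) for n in tail)

print(limsup_F, P(closed_F))   # condition (3): limsup P_n(F) <= P(F)
print(P(open_G), liminf_G)     # condition (4): P(G) <= liminf P_n(G)
```

Here $A = \left\{ 0 \right\}$ has $P(\partial A) = 1$, so condition (5) does not apply to it, which is consistent with $P_{n}(A) = 0$ for every $n$ while $P(A) = 1$.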

By [1], if $X_{n} \overset{D}{\to} X$ then $P_{n} \overset{W}{\to} P_{X}$. By the assumption $P_{X}(X \in C_{h}^{c}) = 0$ and $(1) \implies (3)$ of the Portmanteau theorem, $$ \begin{align*} \limsup_{n \to \infty} P_{X} \left( h ( X_{n} )^{-1}(B) \right) \le & \limsup_{n \to \infty} P_{n} \left( \overline{h^{-1} (B)} \right) \\ \le & P_{X} \left( \overline{h^{-1} (B)} \right) \\ \le & P_{X} \left( h^{-1} (B) \cup C_{h}^{c} \right) \\ \le & P _{X}\left( h^{-1} (B) \right) + P_{X} \left( C_{h}^{c} \right) \\ \le & P_{X} \left( h^{-1} (B) \right) \\ \le & P_{X} \left( X^{-1} \left( h^{-1} (B) \right) \right) \\ \le & P_{X} \left( \left( h(X) \right)^{-1} (B) \right) \end{align*} $$ Since $B$ was an arbitrary closed set in $S'$, this is condition (3) of the Portmanteau theorem for the distributions of $h(X_{n})$ and $h(X)$, so $(3) \implies (1)$ together with [1] gives $h(X_{n}) \overset{D}{\to} h(X)$.

Renewed

  • August 19, 2023, by Daeshik Ryu, corrected a typo in statement [1] ($P_{n} (A) := P \left( X_{n}^{-1} (A)\right)$ → $P_{n} \left( X^{-1} (B) \right) := P \left( X_{n}^{-1} (B)\right)$)