Let $X$ be a random variable and $\{X_n\}_{n \in \mathbb{N}}$ a sequence of random variables defined on a probability space $(\Omega, \mathcal{F}, P)$, taking values in a metric space $S$. If the following holds as $n \to \infty$ for all $f \in C_b(S)$, then $\{X_n\}$ is said to converge in distribution to $X$, denoted $X_n \overset{D}{\to} X$.
$$\int_\Omega f(X_n)\, dP \to \int_\Omega f(X)\, dP$$
Here $C_b(S)$ denotes the set of bounded continuous functions defined on $S$.
$$C_b(S) := \{ f : S \to \mathbb{R} \mid f \text{ is bounded and continuous} \}$$
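As a quick illustration of the definition, here is a minimal numerical sketch of my own (not from the original text). It assumes $S = \mathbb{R}$, takes $X_n$ to be the standardized mean of $n$ i.i.d. $\mathrm{Uniform}(0,1)$ variables, so that $X_n \overset{D}{\to} X \sim N(0,1)$ by the central limit theorem, and uses the bounded continuous test function $f(x) = 1/(1+x^2)$. The Monte Carlo estimates of $\int_\Omega f(X_n)\, dP$ approach $\int_\Omega f(X)\, dP$.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # A bounded continuous test function: f ∈ C_b(R).
    return 1 / (1 + x**2)

def sample_X_n(n, size):
    # Standardized mean of n i.i.d. Uniform(0,1) variables;
    # by the CLT, X_n converges in distribution to X ~ N(0,1).
    u = rng.uniform(size=(size, n))
    return (u.mean(axis=1) - 0.5) * np.sqrt(12 * n)

# Monte Carlo estimate of ∫_Ω f(X) dP = E[f(X)] for the limit X ~ N(0,1).
X = rng.standard_normal(1_000_000)
print(f"E[f(X)]   ≈ {f(X).mean():.4f}")

# E[f(X_n)] approaches E[f(X)] as n grows.
for n in [1, 2, 10, 100]:
    print(f"E[f(X_{n})] ≈ {f(sample_X_n(n, 1_000_000)).mean():.4f}")
```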
Theorem
[1]: Let the sequence $\{P_n\}_{n \in \mathbb{N}}$ of probability measures on $(S, \mathcal{S})$ be defined by
$$P_n(B) := P\left( X_n^{-1}(B) \right)$$
for all Borel sets $B \in \mathcal{B}(S)$, and likewise let $P_X(B) := P\left( X^{-1}(B) \right)$. Then the following holds.
$$X_n \overset{D}{\to} X \iff P_n \overset{W}{\to} P_X$$
[2]: $X_n \overset{D}{\to} X$ is equivalent to the statement that every subsequence $\{X_{n'}\} \subset \{X_n\}$ has a further subsequence $\{X_{n''}\} \subset \{X_{n'}\}$ satisfying $X_{n''} \overset{D}{\to} X$. In formula, it is expressed as follows.
$$X_n \overset{D}{\to} X \iff \forall \{X_{n'}\} \subset \{X_n\}, \ \exists \{X_{n''}\} \subset \{X_{n'}\} : X_{n''} \overset{D}{\to} X$$
[3] Continuous Mapping Theorem: For a measurable function $h : (S, \mathcal{S}) \to (S', \mathcal{S}')$, define $C_h := \{ x \in S : h \text{ is continuous at } x \}$, the set of points at which $h$ is continuous. If $X_n \overset{D}{\to} X$ and $P(X \in C_h) = 1$, then $h(X_n) \overset{D}{\to} h(X)$. In formula, it is expressed as follows.
$$X_n \overset{D}{\to} X \ \land \ P(X \in C_h) = 1 \implies h(X_n) \overset{D}{\to} h(X)$$
Description
[1]: A $P_n$ defined as in the theorem is called an Induced Probability Measure. It is important to distinguish $X_n \overset{D}{\to} X$, the convergence of random 'variables', from $P_n \overset{W}{\to} P_X$, the convergence of probability 'measures'.
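To see the two sides of this distinction concretely, here is a small sketch of my own (the same hypothetical CLT setup as above): the induced measures $P_n(B) = P(X_n^{-1}(B))$ of the Borel set $B = (-\infty, 1]$, estimated empirically, approach $P_X(B) = \Phi(1)$.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)

def sample_X_n(n, size):
    # Standardized mean of n i.i.d. Uniform(0,1): X_n →_D X ~ N(0,1).
    u = rng.uniform(size=(size, n))
    return (u.mean(axis=1) - 0.5) * np.sqrt(12 * n)

# B = (-∞, 1] is a Borel set of S = R; its boundary {1} is P_X-null.
# P_X(B) = P(X^{-1}(B)) = Φ(1) for the limit X ~ N(0,1).
P_X_B = 0.5 * (1 + erf(1 / sqrt(2)))
print(f"P_X(B) = Φ(1) ≈ {P_X_B:.4f}")

for n in [1, 5, 50, 500]:
    # P_n(B) = P(X_n^{-1}(B)): a measure on S, estimated from samples.
    P_n_B = (sample_X_n(n, 500_000) <= 1.0).mean()
    print(f"n={n:>3}: P_n(B) ≈ {P_n_B:.4f}")
```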
[2]: At first glance this theorem may seem contrived, but it becomes an important property when considered alongside the concept of relative compactness. An example of a sequence that fails the criterion is sketched below.
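For intuition about why [2] has teeth, consider this sketch (my own example, not from the text) of a sequence that fails the criterion: $X_n \sim N(0,1)$ for even $n$ and $X_n \sim N(3,1)$ for odd $n$. The estimates of $E[f(X_n)]$ oscillate, so the odd-indexed subsequence has no further subsequence converging in distribution to the even-indexed limit, and [2] correctly rules out $X_n \overset{D}{\to} X$.

```python
import numpy as np

rng = np.random.default_rng(2)

def f(x):
    # Bounded continuous test function.
    return 1 / (1 + x**2)

def sample_X_n(n, size):
    # Alternating sequence: N(0,1) for even n, N(3,1) for odd n.
    # No single X can be the distributional limit of both subsequences.
    mean = 0.0 if n % 2 == 0 else 3.0
    return rng.normal(mean, 1.0, size)

for n in range(1, 9):
    print(f"n={n}: E[f(X_n)] ≈ {f(sample_X_n(n, 500_000)).mean():.4f}")
```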
[3]: The Continuous Mapping Theorem in fact generalizes to almost sure convergence and convergence in probability as well. Since $h$ is a function and random variables are themselves functions, it is natural to consider the composite $h \circ X$. It is worth pausing over whether the following formula makes sense for $A \in \mathcal{S}'$, and working through why it does.
$$P\left( (h(X))^{-1}(A) \right) = P\left( X \in h^{-1}(A) \right)$$
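A quick empirical sanity check of this identity (my own sketch, taking $h(x) = x^2$ and $A = [0,1]$, so that $h^{-1}(A) = [-1,1]$): the two events coincide pointwise as subsets of $\Omega$, so the two sample proportions agree exactly, not merely in the limit.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal(1_000_000)  # X ~ N(0,1) on a simulated Ω

h = lambda x: x**2
# For A = [0, 1] ⊂ S', the preimage is h^{-1}(A) = [-1, 1] ⊂ S.
event_lhs = (0 <= h(X)) & (h(X) <= 1)  # ω ∈ (h(X))^{-1}(A)
event_rhs = (-1 <= X) & (X <= 1)       # ω ∈ X^{-1}(h^{-1}(A))

# The events are literally the same subset of Ω, so probabilities match.
print(f"P(h(X) in A)    ≈ {event_lhs.mean():.4f}")
print(f"P(X in h^-1(A)) ≈ {event_rhs.mean():.4f}")
print(f"events identical: {np.array_equal(event_lhs, event_rhs)}")
```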
The notation $\overset{D}{=}$ is sometimes used to indicate that two random variables have the same distribution. For all $A \in \mathcal{S}'$ and continuous functions $h : S \to S'$, it is defined as follows.
$$h(X) \overset{D}{=} h(Y) \overset{\text{def}}{\iff} P\left( (h(X))^{-1}(A) \right) = P\left( (h(Y))^{-1}(A) \right)$$
Rewriting the expression of convergence in distribution in these terms (for sets $A$ whose boundary is a null set, by condition (5) of the Portmanteau theorem), it proceeds as follows.
$$h(X_n) \overset{D}{\to} h(X) \iff P\left( (h(X_n))^{-1}(A) \right) \to P\left( (h(X))^{-1}(A) \right)$$
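Putting the pieces together, here is a sketch of the continuous mapping theorem in action (my own example, not from the text: the CLT setup $X_n \overset{D}{\to} X \sim N(0,1)$ and $h(x) = x^2$, which is continuous everywhere so $P(X \in C_h) = 1$; then $h(X) \sim \chi^2(1)$, and scipy is used only for the reference CDF).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

def sample_X_n(n, size):
    # Standardized mean of n i.i.d. Uniform(0,1): X_n →_D X ~ N(0,1).
    u = rng.uniform(size=(size, n))
    return (u.mean(axis=1) - 0.5) * np.sqrt(12 * n)

h = lambda x: x**2  # continuous on all of S = R, so C_h = R

# By the continuous mapping theorem, h(X_n) →_D h(X) ~ chi-squared(1).
t = 1.5
print(f"P(h(X) <= {t}) = {stats.chi2(df=1).cdf(t):.4f}")
for n in [1, 5, 50, 500]:
    p = (h(sample_X_n(n, 500_000)) <= t).mean()
    print(f"n={n:>3}: P(h(X_n) <= {t}) ≈ {p:.4f}")
```

The continuity-set hypothesis matters here: with the deterministic sequence $X_n = 1/n \overset{D}{\to} X = 0$ and the discontinuous $h = \mathbb{1}_{(0,\infty)}$, we get $h(X_n) = 1$ for all $n$ while $h(X) = 0$, precisely because $P(X \in C_h) = 0$.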
Proof
[1]
For all $f \in C_b(S)$,
$$\begin{aligned} P_n \overset{W}{\to} P_X &\iff \int_S f\, dP_n \to \int_S f\, dP_X \\ &\iff \int_\Omega f(X_n)\, dP \to \int_\Omega f(X)\, dP \\ &\iff X_n \overset{D}{\to} X \end{aligned}$$
The first equivalence is the definition of weak convergence, and the second follows from the change-of-variables formula for induced measures, $\int_S f\, dP_n = \int_\Omega f(X_n)\, dP$.
■
[2]
$(\Leftarrow)$ Suppose, for contradiction, that $X_n \overset{D}{\to} X$ does not hold, that is, there exists $f \in C_b(S)$ for which $\int_\Omega f(X_n)\, dP \to \int_\Omega f(X)\, dP$ fails. Then there exist $\varepsilon > 0$ and a subsequence index set $\{n'\}$ satisfying
$$\left| \int_\Omega f(X_{n'})\, dP - \int_\Omega f(X)\, dP \right| > \varepsilon$$
But this is a contradiction, because by assumption there always exists a further subsequence of indices $\{n''\} \subset \{n'\}$ satisfying
$$\int_\Omega f(X_{n''})\, dP \to \int_\Omega f(X)\, dP$$
$(\Rightarrow)$ This direction is trivially true by taking $\{n''\} = \{n'\}$, since every subsequence of a convergent sequence converges to the same limit.
■
[3]
Let's denote the probability measure induced by $X$ as $P_X(A) := P\left( X^{-1}(A) \right) = P(X \in A)$.
$$\overline{h^{-1}(B)} \subset h^{-1}(B) \cup C_h^c$$
For every closed set $B$ in $S'$, the above inclusion holds: for an arbitrary $x \in \overline{h^{-1}(B)}$, if $h$ is continuous at $x$, then continuity carries limits of points of $h^{-1}(B)$ into the closed set $B$, so $x \in h^{-1}(B)$; otherwise, $x \in C_h^c$. Since the closure $\overline{h^{-1}(B)}$ is a closed set in $S$,
$$\begin{aligned} \limsup_{n\to\infty} P\left( h(X_n) \in B \right) &= \limsup_{n\to\infty} P\left( X_n \in h^{-1}(B) \right) \\ &= \limsup_{n\to\infty} P\left( (h(X_n))^{-1}(B) \right) \\ &= \limsup_{n\to\infty} P\left( [X_n^{-1} \circ h^{-1}](B) \right) \\ &= \limsup_{n\to\infty} P\left( X_n^{-1}(h^{-1}(B)) \right) \\ &= \limsup_{n\to\infty} P_n\left( h^{-1}(B) \right) \\ &\le \limsup_{n\to\infty} P_n\left( \overline{h^{-1}(B)} \right) \end{aligned}$$
Here (1), (3), and (5) refer to the conditions of the Portmanteau theorem: (1) $P_n \overset{W}{\to} P$; (3) $\limsup_{n\to\infty} P_n(F) \le P(F)$ for every closed set $F$; (5) $\lim_{n\to\infty} P_n(A) = P(A)$ for every $A$ with $P(\partial A) = 0$.
Following [1], if $X_n \overset{D}{\to} X$, then $P_n \overset{W}{\to} P_X$, and by the assumption $P_X(C_h^c) = P(X \in C_h^c) = 0$ and by $(1) \implies (3)$ of the Portmanteau theorem,
$$\begin{aligned} \limsup_{n\to\infty} P\left( (h(X_n))^{-1}(B) \right) &\le \limsup_{n\to\infty} P_n\left( \overline{h^{-1}(B)} \right) \\ &\le P_X\left( \overline{h^{-1}(B)} \right) \\ &\le P_X\left( h^{-1}(B) \cup C_h^c \right) \\ &\le P_X\left( h^{-1}(B) \right) + P_X\left( C_h^c \right) \\ &= P_X\left( h^{-1}(B) \right) \\ &= P\left( X^{-1}(h^{-1}(B)) \right) \\ &= P\left( (h(X))^{-1}(B) \right) \end{aligned}$$
Showing $P\left( (h(X))^{-1}(B) \right) \le \liminf_{n\to\infty} P\left( (h(X_n))^{-1}(B) \right)$ by the same method applied to open sets, we obtain
$$\lim_{n\to\infty} P\left( (h(X_n))^{-1}(B) \right) = P\left( (h(X))^{-1}(B) \right)$$
which is exactly $h(X_n) \overset{D}{\to} h(X)$.
■