
Probability Convergence in Mathematical Statistics 📂Mathematical Statistics


Definition 1

A random variable $X$ and a sequence of random variables $\left\{ X_{n} \right\}$ are said to converge in probability to $X$ as $n \to \infty$ if they satisfy the following, denoted by $X_{n} \overset{P}{\to} X$.
$$\forall \varepsilon > 0 , \quad \lim_{n \to \infty} P \left[ \left| X_{n} - X \right| < \varepsilon \right] = 1$$

Explanation

The condition for convergence in probability is stated directly in terms of probabilities: as $n$ increases, $X_{n}$ and $X$ are, with probability approaching $1$, equal up to an arbitrarily small error $\varepsilon$. That is precisely what convergence in probability means. In computations, the following equivalent form is often more convenient.
$$\forall \varepsilon > 0 , \quad \lim_{n \to \infty} P \left[ \left| X_{n} - X \right| \ge \varepsilon \right] = 0$$
As is well known, a random variable is a function from a sample space to the real numbers, and comparing two functions by a tolerance $\varepsilon$ on their difference makes this analogous to uniform convergence of functions in analysis. The analogy extends further: just as uniform convergence implies pointwise convergence, convergence in probability implies convergence in distribution. If the sudden appearance of an epsilon is unwelcome, it is time to get familiar with it or give up on mathematical statistics.

In statistics, letting $n$ increase is not merely sending some number to infinity; it is the mathematical expression of assuming a sufficiently large sample size, and if one cannot discuss sample size in theoretical statistics through probability theory, then there is essentially nothing to be done. However awkward analysis may seem to the reader, an effort should be made to at least read and understand Part 1 of the proof of [3] presented in this post. The theorem below collects some intuitive properties of convergence in probability.
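A minimal simulation can make the definition concrete. The example below is hypothetical (not from the post): by the weak law of large numbers, the sample mean of i.i.d. Uniform(0,1) variables converges in probability to $1/2$, so the estimated tail probability $P(|\bar{X}_n - 1/2| \ge \varepsilon)$ should shrink toward $0$ as $n$ grows.

```python
# Hypothetical illustration: the sample mean of i.i.d. Uniform(0,1) draws
# converges in probability to 1/2, so P(|Xbar_n - 1/2| >= eps) -> 0.
import random

def tail_prob(n, eps, trials=1000, seed=0):
    """Monte Carlo estimate of P(|mean of n Uniform(0,1) draws - 1/2| >= eps)."""
    rng = random.Random(seed)
    hits = sum(abs(sum(rng.random() for _ in range(n)) / n - 0.5) >= eps
               for _ in range(trials))
    return hits / trials

probs = [tail_prob(n, eps=0.05) for n in (10, 100, 1000)]
print(probs)  # the estimates decrease toward 0 as n grows
```

The choice of Uniform(0,1), the fixed tolerance $\varepsilon = 0.05$, and the trial counts are arbitrary; any distribution with a finite mean would exhibit the same behavior.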

Theorem

Let’s assume $X_{n} \overset{P}{\to} X$.

  • [1] Continuous Mapping Theorem: for a continuous function $g$, $g\left( X_{n} \right) \overset{P}{\to} g (X)$.
  • [2]: Convergence in probability implies convergence in distribution. That is, $X_{n} \overset{P}{\to} X \implies X_{n} \overset{D}{\to} X$.
  • [3]: If $a \in \mathbb{R}$ is a constant and $Y_{n} \overset{P}{\to} Y$, then
$$a X_{n} \overset{P}{\to} a X, \qquad X_{n} + Y_{n} \overset{P}{\to} X + Y, \qquad X_{n} Y_{n} \overset{P}{\to} XY$$
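Before the proofs, the three claims can be sanity-checked numerically. The following sketch is hypothetical (not from the post): it takes $X_{n}$ and $Y_{n}$ to be sample means of independent Uniform(0,1) samples, so both converge in probability to $1/2$, and estimates the tail probabilities for $g(X_{n}) = X_{n}^{2}$, the sum, and the product.

```python
# Hypothetical numerical check of [1] and [3]: X_n and Y_n are sample means
# of independent Uniform(0,1) samples, so X_n ->P 1/2 and Y_n ->P 1/2.
import random

def mean_unif(rng, n):
    """Sample mean of n Uniform(0,1) draws; converges in probability to 1/2."""
    return sum(rng.random() for _ in range(n)) / n

def tail(event, trials=500, seed=1):
    """Monte Carlo estimate of P(event)."""
    rng = random.Random(seed)
    return sum(event(rng) for _ in range(trials)) / trials

n, eps = 1000, 0.05
p_sq   = tail(lambda r: abs(mean_unif(r, n) ** 2 - 0.25) >= eps)              # [1], g(x) = x^2
p_sum  = tail(lambda r: abs(mean_unif(r, n) + mean_unif(r, n) - 1.0) >= eps)  # [3], sum
p_prod = tail(lambda r: abs(mean_unif(r, n) * mean_unif(r, n) - 0.25) >= eps) # [3], product
print(p_sq, p_sum, p_prod)  # all three tail probabilities are near 0
```

At $n = 1000$ all three estimated tail probabilities should be essentially zero, consistent with the theorem.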

Proof

[1]

The known proofs require tools beyond the undergraduate level, and delving into them is not necessary at the depth of mathematical statistics treated here. It is acceptable to accept the result and move on.

[2]

Direct deduction: for any $\varepsilon > 0$ and any continuity point $x$ of $F_{X}$, the inclusions $\left( X_{n} \le x \right) \subset \left( X \le x + \varepsilon \right) \cup \left( \left| X_{n} - X \right| \ge \varepsilon \right)$ and $\left( X \le x - \varepsilon \right) \subset \left( X_{n} \le x \right) \cup \left( \left| X_{n} - X \right| \ge \varepsilon \right)$ give
$$P \left( X \le x - \varepsilon \right) - P \left( \left| X_{n} - X \right| \ge \varepsilon \right) \le P \left( X_{n} \le x \right) \le P \left( X \le x + \varepsilon \right) + P \left( \left| X_{n} - X \right| \ge \varepsilon \right)$$
Letting $n \to \infty$ and then $\varepsilon \to 0$ yields $F_{X_{n}}(x) \to F_{X}(x)$.

[3]

Part 1. $a X_{n} \overset{P}{\to} a X$

Although this can also be concluded directly from the Continuous Mapping Theorem, we give a direct deduction as an example of an analytical proof. The case $a = 0$ is trivially true, so let us assume $a \ne 0$.

Fix $\varepsilon > 0$. Dividing by $|a|$ inside the probability gives the following.
$$\begin{align*} P \left( \left| a X_{n} - aX \right| \ge \varepsilon \right) &= P \left( |a| \left| X_{n} - X \right| \ge \varepsilon \right) \\ &= P \left( \left| X_{n} - X \right| \ge {{ \varepsilon } \over { |a| }} \right) \end{align*}$$
By the assumption $X_{n} \overset{P}{\to} X$, the last term converges to $0$ as $n \to \infty$, hence taking the limit of the first term yields the following.
$$\lim_{n \to \infty} P \left( \left| a X_{n} - aX \right| \ge \varepsilon \right) = 0$$
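The only step in this argument is that, for $a \ne 0$, the events $\left\{ \left| a X_{n} - aX \right| \ge \varepsilon \right\}$ and $\left\{ \left| X_{n} - X \right| \ge \varepsilon / |a| \right\}$ are identical. The small check below is a hypothetical example, not from the post; $a = -4$ and $\varepsilon = 0.5$ are arbitrary choices (powers of two, so floating-point scaling is exact).

```python
# Hypothetical check: for a != 0, the events {|aX_n - aX| >= eps} and
# {|X_n - X| >= eps/|a|} coincide sample by sample.
import random

rng = random.Random(2)
a, eps = -4.0, 0.5  # arbitrary; powers of two keep the float scaling exact
mismatches = 0
for _ in range(10_000):
    xn, x = rng.gauss(0, 1), rng.gauss(0, 1)
    lhs = abs(a * xn - a * x) >= eps
    rhs = abs(xn - x) >= eps / abs(a)
    mismatches += lhs != rhs
print(mismatches)  # the two events agree on every sample, so this is 0
```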


Part 2. $X_{n} + Y_{n} \overset{P}{\to} X + Y$

It’s not too difficult as long as you don’t confuse the direction of the inequalities. By the Triangle Inequality,
$$\left| \left( X_{n} - X \right) + \left( Y_{n} - Y \right) \right| \le \left| X_{n} - X \right| + \left| Y_{n} - Y \right|$$
Moreover, if $\left| X_{n} - X \right| < \varepsilon / 2$ and $\left| Y_{n} - Y \right| < \varepsilon / 2$ both hold, then their sum is less than $\varepsilon$; taking complements, the inclusion of the two events
$$\left( \left| X_{n} - X \right| + \left| Y_{n} - Y \right| \ge \varepsilon \right) \subset \left[ \left( \left| X_{n} - X \right| \ge \varepsilon / 2 \right) \cup \left( \left| Y_{n} - Y \right| \ge \varepsilon / 2 \right) \right]$$
is evident. Therefore, for every $\varepsilon > 0$,
$$\begin{align*} P \left[ \left| \left( X_{n} + Y_{n} \right) - \left( X + Y \right) \right| \ge \varepsilon \right] &= P \left[ \left| \left( X_{n} - X \right) + \left( Y_{n} - Y \right) \right| \ge \varepsilon \right] \\ &\le P \left[ \left| X_{n} - X \right| + \left| Y_{n} - Y \right| \ge \varepsilon \right] \\ &\le P \left[ \left( \left| X_{n} - X \right| \ge \varepsilon / 2 \right) \cup \left( \left| Y_{n} - Y \right| \ge \varepsilon / 2 \right) \right] \\ &\le P \left[ \left| X_{n} - X \right| \ge \varepsilon / 2 \right] + P \left[ \left| Y_{n} - Y \right| \ge \varepsilon / 2 \right] \end{align*}$$
Each term on the right converges to $0$ as $n \to \infty$, and since probabilities are nonnegative, we obtain the following.
$$\lim_{n \to \infty} P \left[ \left| \left( X_{n} + Y_{n} \right) - \left( X + Y \right) \right| \ge \varepsilon \right] = 0$$
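The key event inclusion in this proof can be verified numerically. The following is a hypothetical check, not from the post: it draws random stand-ins for $X_{n} - X$ and $Y_{n} - Y$ and confirms that whenever $|A| + |B| \ge \varepsilon$, at least one of $|A| \ge \varepsilon/2$, $|B| \ge \varepsilon/2$ holds.

```python
# Hypothetical check of the event inclusion used in Part 2:
# if |A| + |B| >= eps, then |A| >= eps/2 or |B| >= eps/2.
import random

rng = random.Random(3)
eps = 0.5  # arbitrary tolerance
violations = 0
for _ in range(10_000):
    a = rng.uniform(-1.0, 1.0)  # stand-in for X_n - X
    b = rng.uniform(-1.0, 1.0)  # stand-in for Y_n - Y
    left = abs(a) + abs(b) >= eps
    right = abs(a) >= eps / 2 or abs(b) >= eps / 2
    violations += left and not right
print(violations)  # the inclusion guarantees this count stays 0
```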


Part 3. $X_{n} Y_{n} \overset{P}{\to} XY$

Since $g(x) := x^{2}$ is a continuous function, theorem [1] gives $X_{n}^{2} \overset{P}{\to} X^{2}$, and likewise $\left( X_{n} - Y_{n} \right)^{2} \overset{P}{\to} \left( X - Y \right)^{2}$ by Parts 1 and 2. Hence
$$\begin{align*} X_{n} Y_{n} &= {{ 1 } \over { 2 }} X_{n}^{2} + {{ 1 } \over { 2 }} Y_{n}^{2} - {{ 1 } \over { 2 }} \left( X_{n} - Y_{n} \right)^{2} \\ &\overset{P}{\to} {{ 1 } \over { 2 }} X^{2} + {{ 1 } \over { 2 }} Y^{2} - {{ 1 } \over { 2 }}\left( X - Y \right)^{2} \\ &= XY \end{align*}$$

  1. Hogg et al. (2013). Introduction to Mathematical Statistics (7th Edition): p295. ↩︎