Inequalities of Martingales 📂Probability Theory

Theorem

Let $\left\{ (X_{n} , \mathcal{F}_{n}) \right\}$ be a supermartingale.

  • [1]: For all $\lambda > 0$, $$\begin{align*} \lambda P \left( \max_{n \le N} X_{n} \ge \lambda \right) \le & E X_{1} - \int_{(\max_{n \le N} X_{n} < \lambda)} X_{N} dP \\ \le & E X_{1} + E X_{N}^{-} \text{ a.s.} \end{align*}$$
  • [2]: For all $\lambda > 0$, $$\begin{align*} \lambda P \left( \min_{n \le N} X_{n} \le - \lambda \right) \le & - \int_{(\min_{n \le N} X_{n} \le - \lambda)} X_{N} dP \\ \le & E X_{N}^{-} \text{ a.s.} \end{align*}$$

Let $\left\{ (X_{n} , \mathcal{F}_{n}) \right\}$ be a submartingale.

  • [3]: For all $\lambda > 0$, $$\begin{align*} \lambda P \left( \max_{n \le N} X_{n} \ge \lambda \right) \le & \int_{(\max_{n \le N} X_{n} \ge \lambda)} X_{N} dP \\ \le & E \left| X_{N} \right| \text{ a.s.} \end{align*}$$
  • [4]: For all $\lambda > 0$, $$\lambda P \left( \min_{n \le N} X_{n} \le - \lambda \right) \le E X_{N}^{+} - E X_{1} \text{ a.s.}$$

Let $\left\{ (X_{n} , \mathcal{F}_{n}) \right\}$ be a martingale.

  • [5]: For all $\lambda > 0$ and $p \ge 1$ with $E \left| X_{N} \right|^{p} < \infty$, $$P \left( \max_{n \le N} | X_{n} | \ge \lambda \right) \le {{1} \over {\lambda^{p}}} E |X_{N}|^{p} \text{ a.s.}$$
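The bound in [5] can be sanity-checked numerically. The sketch below is my own illustration, not part of the source: it takes the symmetric simple random walk, a standard example of a martingale, and compares both sides of [5] with $p = 2$ by Monte Carlo.

```python
import random

random.seed(0)

def doob_check(N=50, lam=10.0, p=2, trials=20000):
    """Estimate both sides of [5] for the symmetric simple random walk."""
    exceed = 0      # number of paths with max_{n<=N} |X_n| >= lam
    moment = 0.0    # running sum of |X_N|^p
    for _ in range(trials):
        x, running_max = 0, 0
        for _ in range(N):
            x += random.choice((-1, 1))
            running_max = max(running_max, abs(x))
        exceed += running_max >= lam
        moment += abs(x) ** p
    lhs = exceed / trials             # P(max_{n<=N} |X_n| >= lam)
    rhs = moment / trials / lam ** p  # E|X_N|^p / lam^p
    return lhs, rhs

lhs, rhs = doob_check()
print(lhs, rhs)  # the estimated probability should not exceed the bound
```

With $N = 50$ and $\lambda = 10$ the right-hand side is roughly $E|X_{50}|^{2}/100 = 0.5$, comfortably above the estimated probability, as the inequality demands.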

Explanation

The inequalities introduced here are used just like the Markov inequality: dividing both sides by $\lambda$ gives explicit bounds on the probability. Notably, since a martingale is both a supermartingale and a submartingale, all of the above inequalities apply to martingales.

  • [3]: By the Markov inequality we already know $\displaystyle P(\max_{n \le N} X_{n} \ge \lambda ) \le {{ 1 } \over { \lambda }} E \left| \max_{n \le N} X_{n} \right|$, but the submartingale condition adds information: the bound only needs the terminal index $N$, as in $\displaystyle P(\max_{n \le N} X_{n} \ge \lambda ) \le {{ 1 } \over { \lambda }} E \left| X_{N} \right|$.
  • [4]: The submartingale condition here contrasts with the supermartingale condition in [1], so the two can be viewed as a pair.
  • [5]: This formula, similar to the generalized Markov inequality $\displaystyle P(|X|^{p} \ge C^{p}) \le {{ 1 } \over { C^{p} }} E | X |^{p}$, allows the probability bound to shrink as $p$ increases when $\lambda>1$, but one must remember that the $p$th moment of $X_{N}$ must exist beforehand.
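The remark on [3] can also be seen numerically. The following sketch is my own illustration, not from the source: for the submartingale $X_{n} = |S_{n}|$ built from a symmetric random walk $S_{n}$, the quantity $E|X_{N}|$ used by [3] is pathwise no larger than the $E \left[ \max_{n \le N} X_{n} \right]$ that the plain Markov bound would use.

```python
import random

random.seed(1)

def compare_bounds(N=30, trials=20000):
    """Estimate E[max_n |S_n|] (Markov bound) and E|S_N| (bound from [3])."""
    e_max, e_last = 0.0, 0.0
    for _ in range(trials):
        s, running_max = 0, 0
        for _ in range(N):
            s += random.choice((-1, 1))
            running_max = max(running_max, abs(s))
        e_max += running_max
        e_last += abs(s)
    return e_max / trials, e_last / trials

e_max, e_last = compare_bounds()
print(e_last, e_max)  # E|X_N| <= E[max_n X_n]: [3] uses the smaller quantity
```

Since the running maximum includes the final value on every path, $E|X_{N}| \le E[\max_{n \le N} X_{n}]$ holds exactly, not just in simulation.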

Proof

Strategy for [1], [2]: Since $N$ is bounded, we can start from the following theorem.

Optional Sampling Theorem: If $\left\{ ( X_{n} , \mathcal{F}_{n} ) \right\}$ is a supermartingale and $\tau$ and $\sigma$ are bounded stopping times with respect to $\left\{ \mathcal{F}_{n} \right\}$ such that $\sigma \le \tau$, then $$E \left( X_{\tau} | \mathcal{F}_{\sigma} \right) \le X_{\sigma} \text{ a.s.}$$

In the proof, $T$ is defined as the first moment at which $X_{n}$ reaches $\lambda$; if this never happens by time $N$, we abandon the count and simply take $T = N$. Understanding this definition is the crux, since it is the basis for the trick of splitting the integration domain.

[1]

Part 1. Definition of stopping time TT

Define the random variable $$T:= \begin{cases} \inf \left\{ n \le N: X_{n} \ge \lambda \right\} &, \text{if such } n \text{ exists} \\ N &, \text{otherwise} \end{cases}$$ Now for all $k \le N$, check that $(T = k) \in \mathcal{F}_{k}$ to show that $T$ is a stopping time. For $k = n < N$, $$(T = n) = \left( X_{1} < \lambda , \cdots , X_{n-1} < \lambda, X_{n} \ge \lambda \right) \in \mathcal{F}_{n}$$ and for $k = N$, it suffices to check whether $X_{k} < \lambda$ up to time $N-1$, so $$(T = N) = \left( X_{1} < \lambda , \cdots , X_{N-1} < \lambda \right) \in \mathcal{F}_{N-1} \subset \mathcal{F}_{N}$$
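The stopping time $T$ above is just a first-passage rule. A minimal sketch (function name is mine, not from the source), using 1-based indices to match the proof:

```python
def first_passage(path, lam):
    """T = inf{ n <= N : X_n >= lam }, or N if the path never reaches lam.

    path is (X_1, ..., X_N); returns a 1-based index as in the proof."""
    N = len(path)
    for n, x in enumerate(path, start=1):
        if x >= lam:
            return n
    return N

# On the event (max_n X_n >= lam) we get X_T >= lam; otherwise T = N.
print(first_passage([1, 3, 7, 2], 5))  # -> 3
print(first_passage([1, 3, 4, 2], 5))  # -> 4  (never reaches 5, so T = N)
```

Deciding $T = n$ only needs $X_{1}, \dots, X_{n}$, which is exactly the measurability condition $(T = n) \in \mathcal{F}_{n}$.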


Part 2.

According to the optional sampling theorem with $\sigma = 1$ and $\tau = T$, almost surely $E \left( X_{T} | \mathcal{F}_{1} \right) \le X_{1}$, hence $$\begin{align*} \int_{\Omega} X_{1} dP \ge& \int_{\Omega} E \left( X_{T} | \mathcal{F}_{1} \right) dP \\ =& \int_{\Omega} X_{T} dP \\ =& \int_{(\max_{n \le N} X_{n} \ge \lambda)} X_{T} dP + \int_{(\max_{n \le N} X_{n} < \lambda)} X_{T} dP \text{ a.s.} \end{align*}$$ The last line splits the integration domain into $(\max_{n \le N} X_{n} \ge \lambda)$ and $(\max_{n \le N} X_{n} < \lambda)$. By the definition of $T$, on $(\max_{n \le N} X_{n} \ge \lambda)$ we have $X_{T} \ge \lambda$, and on $(\max_{n \le N} X_{n} < \lambda)$ we have $X_{T} = X_{N}$, so $$\begin{align*} \int_{\Omega} X_{1} dP \ge& \int_{(\max_{n \le N} X_{n} \ge \lambda)} X_{T} dP + \int_{(\max_{n \le N} X_{n} < \lambda)} X_{T} dP \\ \ge& \int_{(\max_{n \le N} X_{n} \ge \lambda)} \lambda dP + \int_{(\max_{n \le N} X_{n} < \lambda)} X_{N} dP \\ =& \lambda P \left( \max_{n \le N} X_{n} \ge \lambda \right) + \int_{(\max_{n \le N} X_{n} < \lambda)} X_{N} dP \text{ a.s.} \end{align*}$$


Part 3.

Rearranging the terms yields $$\begin{align*} \lambda P \left( \max_{n \le N} X_{n} \ge \lambda \right) \le & \int_{\Omega} X_{1} dP - \int_{(\max_{n \le N} X_{n} < \lambda)} X_{N} dP \\ =& E X_{1} - \int_{(\max_{n \le N} X_{n} < \lambda)} X_{N}^{+} dP + \int_{(\max_{n \le N} X_{n} < \lambda)} X_{N}^{-} dP \text{ a.s.} \end{align*}$$ Since $X_{N}^{+} \ge 0$, the term $\displaystyle - \int_{(\max_{n \le N} X_{n} < \lambda)} X_{N}^{+} dP$ can be dropped, and $$\int_{(\max_{n \le N} X_{n} < \lambda)} X_{N}^{-} dP \le \int_{\Omega} X_{N}^{-} dP = E X_{N}^{-}$$ thus $$\lambda P \left( \max_{n \le N} X_{n} \ge \lambda \right) \le E X_{1} + E X_{N}^{-} \text{ a.s.}$$
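Inequality [1] can be checked by simulation as well. The sketch below is my own illustration, assuming a random walk with steps $+1$ w.p. $0.4$ and $-1$ w.p. $0.6$ (negative drift, hence a supermartingale) and comparing $\lambda P(\max_{n \le N} X_{n} \ge \lambda)$ against $E X_{1} + E X_{N}^{-}$.

```python
import random

random.seed(2)

def check_ineq1(N=30, lam=5.0, trials=20000):
    """Monte Carlo check of [1] for a negative-drift random walk."""
    exceed, sum_x1, sum_neg = 0, 0.0, 0.0
    for _ in range(trials):
        x = 1 if random.random() < 0.4 else -1  # X_1 = first step
        sum_x1 += x
        running_max = x
        for _ in range(N - 1):
            x += 1 if random.random() < 0.4 else -1
            running_max = max(running_max, x)
        exceed += running_max >= lam
        sum_neg += max(-x, 0.0)                 # X_N^-
    lhs = lam * exceed / trials                 # lam * P(max_n X_n >= lam)
    rhs = sum_x1 / trials + sum_neg / trials    # E X_1 + E X_N^-
    return lhs, rhs

lhs, rhs = check_ineq1()
print(lhs, rhs)  # lhs should not exceed rhs
```

Note that $E X_{1}$ is negative here, so the bound is carried almost entirely by $E X_{N}^{-}$, which the negative drift makes large.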

[2]

Part 1. Definition of stopping time TT

Define the random variable $$T:= \begin{cases} \inf \left\{ n \le N: X_{n} \le - \lambda \right\} &, \text{if such } n \text{ exists} \\ N &, \text{otherwise} \end{cases}$$ Now for all $k \le N$, check that $(T = k) \in \mathcal{F}_{k}$ to show that $T$ is a stopping time. For $k = n < N$, $$(T = n) = \left( X_{1} > - \lambda , \cdots , X_{n-1} > - \lambda, X_{n} \le - \lambda \right) \in \mathcal{F}_{n}$$ and for $k = N$, it suffices to check whether $X_{k} > - \lambda$ up to time $N-1$, so $$(T = N) = \left( X_{1} > - \lambda , \cdots , X_{N-1} > - \lambda \right) \in \mathcal{F}_{N-1} \subset \mathcal{F}_{N}$$ and $T \le N$ holds.


Part 2.

Property of Conditional Expectation: For any sigma field $\mathcal{G}$, $E \left[ E ( X | \mathcal{G} ) \right] = E(X)$

According to the optional sampling theorem with $\sigma = T$ and $\tau = N$, almost surely $E \left( X_{N} | \mathcal{F}_{T} \right) \le X_{T}$, and taking the expectation $E$ on both sides as in Part 2 of the proof of [1], $$\begin{align*} E X_{N} \le & E X_{T} \\ =& \int_{\Omega} X_{T} dP \\ =& \int_{(\min_{n \le N} X_{n} \le - \lambda )} X_{T} dP + \int_{(\min_{n \le N} X_{n} > - \lambda )} X_{T} dP \\ \le & - \lambda P \left( \min_{n \le N} X_{n} \le - \lambda \right)+ \int_{(\min_{n \le N} X_{n} > - \lambda )} X_{N} dP \end{align*}$$ But since $\displaystyle E X_{N} = \int_{(\min_{n \le N} X_{n} \le - \lambda )} X_{N} dP + \int_{(\min_{n \le N} X_{n} > - \lambda )} X_{N} dP$, $$\begin{align*} & \int_{(\min_{n \le N} X_{n} \le - \lambda )} X_{N} dP + \int_{(\min_{n \le N} X_{n} > - \lambda )} X_{N} dP \\ \le & - \lambda P \left( \min_{n \le N} X_{n} \le - \lambda \right)+ \int_{(\min_{n \le N} X_{n} > - \lambda )} X_{N} dP \end{align*}$$ Eliminating $\displaystyle \int_{(\min_{n \le N} X_{n} > - \lambda )} X_{N} dP$ from both sides yields $$\int_{(\min_{n \le N} X_{n} \le - \lambda )} X_{N} dP \le - \lambda P \left( \min_{n \le N} X_{n} \le - \lambda \right)$$


Part 3.

Reversing the sign gives $$\begin{align*} \lambda P \left( \min_{n \le N} X_{n} \le - \lambda \right) \le & - \int_{(\min_{n \le N} X_{n} \le - \lambda )} X_{N} dP \\ =& - \int_{(\min_{n \le N} X_{n} \le - \lambda )} X_{N}^{+} dP + \int_{(\min_{n \le N} X_{n} \le - \lambda )} X_{N}^{-} dP \end{align*}$$ As in Part 3 of the proof of [1], since $X_{N}^{+} \ge 0$, the term $\displaystyle - \int_{(\min_{n \le N} X_{n} \le - \lambda )} X_{N}^{+} dP$ can be dropped, and $$\int_{(\min_{n \le N} X_{n} \le - \lambda )} X_{N}^{-} dP \le \int_{\Omega} X_{N}^{-} dP = E X_{N}^{-}$$ thus $$\lambda P \left( \min_{n \le N} X_{n} \le - \lambda \right) \le E X_{N}^{-} \text{ a.s.}$$
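Inequality [2] admits the same kind of numerical sanity check. The sketch below is my own illustration, again assuming a negative-drift random walk (steps $+1$ w.p. $0.4$, $-1$ w.p. $0.6$) as the supermartingale, and comparing $\lambda P(\min_{n \le N} X_{n} \le -\lambda)$ against $E X_{N}^{-}$.

```python
import random

random.seed(3)

def check_ineq2(N=30, lam=8.0, trials=20000):
    """Monte Carlo check of [2] for a negative-drift random walk."""
    below, sum_neg = 0, 0.0
    for _ in range(trials):
        x, running_min = 0, 0
        for _ in range(N):
            x += 1 if random.random() < 0.4 else -1
            running_min = min(running_min, x)
        below += running_min <= -lam
        sum_neg += max(-x, 0.0)           # X_N^-
    lhs = lam * below / trials            # lam * P(min_n X_n <= -lam)
    rhs = sum_neg / trials                # E X_N^-
    return lhs, rhs

lhs, rhs = check_ineq2()
print(lhs, rhs)  # lhs should not exceed rhs
```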

[3]

$$\begin{align*} \lambda P \left( \max_{n \le N} X_{n} \ge \lambda \right) =& \lambda P \left( - \max_{n \le N} X_{n} \le - \lambda \right) \\ =& \lambda P \left( \min_{n \le N} (-X_{n}) \le - \lambda \right) \end{align*}$$ Since $\left\{ \left( -X_{n} , \mathcal{F}_{n} \right) \right\}$ is a supermartingale, according to [2] $$\begin{align*} \lambda P \left( \min_{n \le N} (-X_{n}) \le - \lambda \right) \le & - \int_{(\min_{n \le N} (-X_{n}) \le - \lambda)} (-X_{N}) dP \\ =& \int_{(\max_{n \le N} X_{n} \ge \lambda)} X_{N} dP \\ \le & \int_{(X_{N} \ge 0)} X_{N} dP \\ =& E X_{N}^{+} \\ \le & E \left| X_{N} \right| \text{ a.s.} \end{align*}$$

[4]

Since $\left\{ \left( -X_{n} , \mathcal{F}_{n} \right) \right\}$ is a supermartingale, according to [1] $$\begin{align*} \lambda P \left( \min_{n \le N} X_{n} \le - \lambda \right) =& \lambda P \left( \max_{n \le N} \left( -X_{n} \right) \ge \lambda \right) \\ \le & E \left( -X_{1} \right) + E \left( -X_{N} \right)^{-} \\ =& E \left( -X_{N} \right)^{-} - E X_{1} \\ =& E \left( -X_{N}^{+} + X_{N}^{-} \right)^{-} - E X_{1} \\ =& E X_{N}^{+} - E X_{1} \text{ a.s.} \end{align*}$$

[5]

According to the corollary of the conditional Jensen's inequality, $\left\{ \left( |X_{n}|^{p} , \mathcal{F}_{n} \right) \right\}$ is a submartingale for $p \ge 1$, so according to [3] $$\begin{align*} & \lambda^{p} P \left( \max_{n \le N} | X_{n} |^{p} \ge \lambda^{p} \right) \le E \left| X_{N} \right|^{p} \text{ a.s.} \\ \iff & P \left( \max_{n \le N} | X_{n} |^{p} \ge \lambda^{p} \right) \le {{ 1 } \over { \lambda^{p} }} E \left| X_{N} \right|^{p} \text{ a.s.} \\ \iff & P \left( \max_{n \le N} | X_{n} | \ge \lambda \right) \le {{ 1 } \over { \lambda^{p} }} E \left| X_{N} \right|^{p} \text{ a.s.} \end{align*}$$