
Consistent Estimator

Definition 1

Let the random variable $X$ have a cumulative distribution function $F(x ; \theta)$, $\theta \in \Theta$. When $X_{1} , \cdots , X_{n}$ is a random sample drawn from $X$, a statistic $T_{n}$ that satisfies the following for the parameter $\theta$ is called a Consistent Estimator.

$$ T_{n} \overset{P}{\to} \theta \quad \text{as } n \to \infty $$


  • $\overset{P}{\to}$ denotes convergence in probability.
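
As a quick numerical illustration of this definition (a minimal sketch, not from the source; it assumes NumPy, and the exponential population with mean $\theta = 2$ and all other numbers are my own choices), the sample mean plays the role of $T_{n}$, and the estimated probability $P \left[ \left| T_{n} - \theta \right| \ge \varepsilon \right]$ should shrink toward $0$ as $n$ grows.

```python
import numpy as np

# Minimal sketch (illustrative choices): theta = 2 is the mean of an
# exponential population, T_n is the sample mean, and we estimate
# P(|T_n - theta| >= eps) by Monte Carlo for growing n.
rng = np.random.default_rng(0)
theta, eps, reps = 2.0, 0.1, 5000

for n in [10, 100, 1000, 10000]:
    samples = rng.exponential(scale=theta, size=(reps, n))
    T_n = samples.mean(axis=1)                      # the statistic T_n
    prob = np.mean(np.abs(T_n - theta) >= eps)      # estimated P(|T_n - theta| >= eps)
    print(f"n = {n:6d}   P(|T_n - theta| >= {eps}) ~ {prob:.4f}")
```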

Explanation

Whereas the unbiased estimator evaluates an estimator through the concept of expected value, the consistent estimator asks whether the statistic itself converges to the parameter in the sense of limits from [analysis](../1186), more precisely, through the convergence of a sequence of functions, namely the random variables themselves.

$$
\begin{align*}
{{ 1 } \over { n - 1 }} \sum_{k=1}^{n} \left( X_{k} - \overline{X}_{n} \right)^{2} \overset{P}{\to}& \sigma^{2} \qquad \cdots 🤔 ? \\
{{ 1 } \over { n }} \sum_{k=1}^{n} \left( X_{k} - \overline{X}_{n} \right)^{2} \overset{P}{\to}& \sigma^{2} \qquad \cdots 🤔 !
\end{align*}
$$
As a simple example, consider the following theorem: its proof shows that a sample variance whose denominator is $n$ rather than the degrees of freedom $(n-1)$ still poses no problem as a consistent estimator. This mathematically justifies the intuition that ‘after all, if $n$ grows, aren’t $n$ and $(n-1)$ essentially the same?’, but to justify that intuition with the following theorem, the existence of kurtosis is needed.
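
A small sketch of this intuition (my own illustrative example, assuming NumPy and a normal population with $\sigma^{2} = 4$): for large $n$, the $(n-1)$- and $n$-denominator versions of the sample variance are practically indistinguishable and both sit near $\sigma^{2}$.

```python
import numpy as np

# Illustrative sketch: compare dividing the sum of squared deviations
# by (n - 1) and by n; both settle near the true variance sigma^2 = 4.
rng = np.random.default_rng(1)
sigma2 = 4.0

for n in [10, 100, 10000]:
    x = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=n)
    ss = np.sum((x - x.mean()) ** 2)                 # sum of squared deviations
    print(f"n = {n:6d}   /(n-1): {ss / (n - 1):.4f}   /n: {ss / n:.4f}")
```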

Theorem

Consistency of Sample Variance

If $X_{1} , \cdots , X_{n}$ is a random sample following the probability distribution $\left( \mu, \sigma^{2} \right)$, that is, $X_{1} , \cdots , X_{n} \overset{\text{iid}}{\sim} \left( \mu, \sigma^{2} \right)$, and if kurtosis exists, then the sample variance $S_{n}^{2}$ is a consistent estimator of the population variance $\sigma^{2}$:
$$ S_{n}^{2} \overset{P}{\to} \sigma^{2} \quad \text{as } n \to \infty $$
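
Before the proof, here is a hedged Monte Carlo check of the theorem (illustrative only, assuming NumPy; the $\mathrm{Uniform}(0,1)$ population with $\sigma^{2} = 1/12$ and the numbers below are my own choices): the estimated probability that $S_{n}^{2}$ deviates from $\sigma^{2}$ by at least $\varepsilon$ should decrease as $n$ grows.

```python
import numpy as np

# Illustrative Monte Carlo check: Uniform(0, 1) has sigma^2 = 1/12 and a
# finite fourth moment, so S_n^2 should converge in probability to sigma^2.
rng = np.random.default_rng(2)
sigma2, eps, reps = 1.0 / 12.0, 0.005, 4000

for n in [50, 500, 5000]:
    x = rng.uniform(size=(reps, n))
    S2 = x.var(axis=1, ddof=1)                       # sample variance with n - 1
    prob = np.mean(np.abs(S2 - sigma2) >= eps)       # estimated deviation probability
    print(f"n = {n:5d}   P(|S_n^2 - sigma^2| >= {eps}) ~ {prob:.4f}")
```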

Proof 2

Since $X_{1} , \cdots , X_{n}$ is an iid random sample, hence independent, the sample variance $S_{n}^{2}$ can be written as follows.
$$
\begin{align*}
S_{n}^{2} =& {{ 1 } \over { n - 1 }} \sum_{k=1}^{n} \left( X_{k} - \overline{X}_{n} \right)^{2} \\
=& {{ n } \over { n - 1 }} \left[ {{ 1 } \over { n }} \sum_{k=1}^{n} X_{k}^{2} - \overline{X}_{n}^{2} \right]
\end{align*}
$$
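
The identity above is purely algebraic, so it can be verified on any data; the sketch below (assuming NumPy, with arbitrary simulated data) simply confirms that the two expressions agree.

```python
import numpy as np

# Sketch verifying the algebraic identity used in the proof:
# (1/(n-1)) * sum (X_k - Xbar)^2  ==  (n/(n-1)) * (mean(X_k^2) - Xbar^2)
rng = np.random.default_rng(3)
x = rng.normal(size=100)
n, xbar = len(x), x.mean()

lhs = np.sum((x - xbar) ** 2) / (n - 1)
rhs = (n / (n - 1)) * (np.mean(x ** 2) - xbar ** 2)
print(lhs, rhs, np.isclose(lhs, rhs))                # both values coincide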

Weak Law of Large Numbers: If $\left\{ X_{k} \right\}_{k=1}^{n}$ are iid random variables following the probability distribution $\left( \mu, \sigma^2 \right)$, then
$$ \overline{X}_n \overset{P}{\to} \mu $$

Continuous Mapping Theorem: $X_{n} \overset{P}{\to} X \implies g \left( X_{n} \right) \overset{P}{\to} g(X)$

Continuity and Limits: For a function $f : X \to Y$, the following conditions are equivalent.

  • $f : X \to Y$ is continuous.
  • $\forall p \in X$, for every sequence $\left\{ p_{n} \right\}$ in $X$, $\displaystyle \lim_{n \to \infty} p_{n} = p \implies \lim_{n \to \infty} f(p_{n}) = f(p)$

Since the polynomial $\lambda (x) = x^{2}$, which squares its argument, is a continuous function, the following holds as $n \to \infty$ by the Weak Law of Large Numbers and the Continuous Mapping Theorem.
$$ \overline{X}_{n}^{2} \overset{P}{\to} \mu^{2} $$
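
A hedged numerical sketch of this step (assuming NumPy; the normal population with $\mu = 3$ is an illustrative choice): as $n$ grows, $\overline{X}_{n}^{2}$ settles near $\mu^{2} = 9$.

```python
import numpy as np

# Illustrative sketch of the continuous mapping step: since x -> x^2 is
# continuous and Xbar_n ->P mu, the squared sample mean approaches mu^2.
rng = np.random.default_rng(4)
mu = 3.0

for n in [10, 1000, 100000]:
    xbar = rng.normal(loc=mu, scale=2.0, size=n).mean()
    print(f"n = {n:6d}   Xbar_n^2 = {xbar ** 2:.4f}   (mu^2 = {mu ** 2})")
```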

The Continuous Mapping Theorem itself can be challenging at the undergraduate level, and it is fine to gloss over it as an analogue of the properties of continuous functions covered in Introduction to Analysis.

Definition and Equivalent Condition of Convergence in Probability: When a random variable $X$ and a sequence of random variables $\left\{ X_{n} \right\}$ satisfy the following, $X_{n}$ is said to converge in probability to $X$ as $n \to \infty$, written $X_{n} \overset{P}{\to} X$.
$$ \forall \varepsilon > 0 , \lim_{n \to \infty} P \left[ \left| X_{n} - X \right| < \varepsilon \right] = 1 $$
In calculations, the following equivalent form is more often preferred.
$$ \forall \varepsilon > 0 , \lim_{n \to \infty} P \left[ \left| X_{n} - X \right| \ge \varepsilon \right] = 0 $$

Chebyshev’s Inequality: If a random variable $X$ has finite variance $\sigma^2 < \infty$, then for $\mu := E(X)$ and any positive number $K>0$,
$$ P(|X-\mu| \ge K\sigma) \le {1 \over K^2} $$
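
For intuition, a small sketch (assuming NumPy; the exponential population with $\mu = \sigma = 1$ is my own choice) compares the empirical tail probability against Chebyshev's bound $1/K^{2}$.

```python
import numpy as np

# Illustrative check of Chebyshev's inequality: for Exponential(1),
# mu = 1 and sigma = 1, and the empirical tail never exceeds 1/K^2.
rng = np.random.default_rng(5)
x = rng.exponential(scale=1.0, size=200_000)
mu, sigma = 1.0, 1.0

for K in [1.5, 2.0, 3.0]:
    empirical = np.mean(np.abs(x - mu) >= K * sigma)
    print(f"K = {K}   empirical tail {empirical:.4f}   <=   bound {1 / K**2:.4f}")
```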

Since the premise of the theorem includes the existence of kurtosis, the fourth moment $E \left( X_{1}^{4} \right) < \infty$ exists, so the variance of $X_{1}^{2}$ is finite and can be written as $c^{2} \sigma^{4}$ for some constant $c^{2} > 0$. In formula form,
$$ {{ 1 } \over { n }} \sum_{k=1}^{n} X_{k}^{2} \sim \left( E \left( X_{1}^{2} \right) , {{ c^{2} \sigma^{4} } \over { n }} \right) $$
and for any given $\varepsilon > 0$, choosing the positive number $K := \sqrt{n} \varepsilon / c \sigma^{2}$ in Chebyshev's inequality gives
$$
\begin{align*}
& \forall \varepsilon > 0, P \left( \left| {{ 1 } \over { n }} \sum_{k=1}^{n} X_{k}^{2} - E \left( X_{1}^{2} \right) \right| \ge K {{ c \sigma^{2} } \over { \sqrt{n} }} \right) \le {{ 1 } \over { K^{2} }} \\
\implies & \forall \varepsilon > 0, P \left( \left| {{ 1 } \over { n }} \sum_{k=1}^{n} X_{k}^{2} - E \left( X_{1}^{2} \right) \right| \ge \varepsilon \right) \le {{ c^{2} \sigma^{4} } \over { n \varepsilon^{2} }} \\
\implies & \forall \varepsilon > 0, \lim_{n \to \infty} P \left( \left| {{ 1 } \over { n }} \sum_{k=1}^{n} X_{k}^{2} - E \left( X_{1}^{2} \right) \right| \ge \varepsilon \right) = 0 \\
\implies & {{ 1 } \over { n }} \sum_{k=1}^{n} X_{k}^{2} \overset{P}{\to} E \left( X_{1}^{2} \right)
\end{align*}
$$
Summarizing,
$$
\begin{align*}
S_{n}^{2} =& {{ n } \over { n - 1 }} \left[ {{ 1 } \over { n }} \sum_{k=1}^{n} X_{k}^{2} - \overline{X}_{n}^{2} \right] \\
\overset{P}{\to}& 1 \cdot \left[ E \left( X_{1}^{2} \right) - \mu^{2} \right] = \sigma^{2}
\end{align*}
$$
and $S_{n}^{2}$ is a consistent estimator of the population variance $\sigma^{2}$. From the step $n / (n-1) \to 1$, one can also see that dividing the sum of squared deviations by $(n+a)$ for an appropriate fixed constant $a$, rather than by $n$ or $(n-1)$, would still pose no problem for consistency.
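
The closing remark can also be seen numerically; the following sketch (assuming NumPy; the normal population with $\sigma^{2} = 4$ and the constant $a = 7$ are arbitrary illustrative choices) divides the sum of squared deviations by $(n + a)$ and still lands on $\sigma^{2}$ for large $n$.

```python
import numpy as np

# Illustrative sketch of the final remark: since n / (n + a) -> 1 for any
# fixed a, dividing by (n + a) instead of (n - 1) still gives sigma^2 in the limit.
rng = np.random.default_rng(6)
sigma2, a = 4.0, 7

for n in [100, 10_000, 1_000_000]:
    x = rng.normal(scale=np.sqrt(sigma2), size=n)
    ss = np.sum((x - x.mean()) ** 2)
    print(f"n = {n:8d}   /(n+{a}): {ss / (n + a):.4f}   (sigma^2 = {sigma2})")
```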


  1. Hogg et al. (2013). Introduction to Mathematical Statistics (7th Edition): p298. ↩︎

  2. Hogg et al. (2018). Introduction to Mathematical Statistics (8th Edition): p325. ↩︎