Let the random variable $X$ have a cumulative distribution function $F(x; \theta)$, $\theta \in \Theta$. If, when $X_1 , \cdots , X_n$ is drawn from $X$, the statistic $T_n$ satisfies the following for the parameter $\theta$, it is said to be a Consistent Estimator.
$$ T_n \overset{P}{\to} \theta \quad \text{as } n \to \infty $$
$\overset{P}{\to}$ denotes convergence in probability.
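As a quick illustration (not part of the original statement; the exponential distribution, $\theta = 2$, and $\varepsilon = 0.1$ below are arbitrary choices), the following sketch estimates $P \left[ | T_n - \theta | \ge \varepsilon \right]$ for the sample mean of exponential data and shows it shrinking as $n$ grows:

```python
import numpy as np

# Monte Carlo sketch: the sample mean T_n of Exp(scale=θ) data should satisfy
# P[|T_n - θ| >= ε] → 0 as n → ∞, i.e. it behaves like a consistent estimator of θ.
rng = np.random.default_rng(42)
theta = 2.0      # population mean of the chosen exponential distribution
eps = 0.1
reps = 1_000     # Monte Carlo repetitions per sample size

for n in (10, 100, 1_000, 10_000):
    samples = rng.exponential(scale=theta, size=(reps, n))
    t_n = samples.mean(axis=1)                    # T_n for each repetition
    prob = np.mean(np.abs(t_n - theta) >= eps)    # estimate of P[|T_n - θ| >= ε]
    print(f"n = {n:>6}: P[|T_n - θ| >= {eps}] ≈ {prob:.4f}")
```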
Explanation
Whereas the unbiased estimator discusses an estimator through the concept of the expected value, the consistent estimator discusses whether the statistic itself converges to the parameter through the concept of limits from [analysis](../1186), more precisely through the uniform convergence of a sequence of functions.
$$ \frac{1}{n-1} \sum_{k=1}^{n} \left( X_k - \overline{X}_n \right)^2 \overset{P}{\to} \sigma^2 \cdots 🤔? $$
$$ \frac{1}{n} \sum_{k=1}^{n} \left( X_k - \overline{X}_n \right)^2 \overset{P}{\to} \sigma^2 \cdots 🤔! $$
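Before the formal argument, here is a small simulation sketch of this question (my own illustration, using standard normal data so that $\sigma^2 = 1$): as $n$ grows, the sum of squared deviations divided by $n-1$ and divided by $n$ land essentially on the same value.

```python
import numpy as np

# Compare the (n-1)- and n-denominator sample variances on standard normal data (σ² = 1).
rng = np.random.default_rng(0)

for n in (10, 100, 10_000, 1_000_000):
    x = rng.normal(size=n)
    ss = np.sum((x - x.mean()) ** 2)      # sum of squared deviations
    print(f"n = {n:>9}: divide by n-1 -> {ss / (n - 1):.5f}, divide by n -> {ss / n:.5f}")
```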
As a simple example, the following theorem shows in the course of its proof that defining the denominator of the sample variance $S_n$ not as the degrees of freedom $(n-1)$ but as $n$ still poses no problem for it being a consistent estimator. This is a mathematical justification of the intuition that 'after all, if $n$ grows, aren't $n$ and $(n-1)$ essentially the same?', but to justify this intuition with the theorem below, the existence of the kurtosis is needed. The claim being proved is that if $X_1 , \cdots , X_n$ is an iid random sample whose kurtosis exists, then the sample variance $S_n^2$ is a consistent estimator of the population variance $\sigma^2$.
Since $X_1 , \cdots , X_n$ are iid random samples, hence independent, the sample variance $S_n^2$ can be written as follows.
$$ \begin{align*} S_n^2 &= \frac{1}{n-1} \sum_{k=1}^{n} \left( X_k - \overline{X}_n \right)^2 \\ &= \frac{n}{n-1} \left[ \frac{1}{n} \sum_{k=1}^{n} X_k^2 - \overline{X}_n^2 \right] \end{align*} $$
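For readers who want the intermediate step, the second equality is just the expansion of the square together with $\sum_{k=1}^{n} X_k = n \overline{X}_n$:

$$ \sum_{k=1}^{n} \left( X_k - \overline{X}_n \right)^2 = \sum_{k=1}^{n} X_k^2 - 2 \overline{X}_n \sum_{k=1}^{n} X_k + n \overline{X}_n^2 = \sum_{k=1}^{n} X_k^2 - n \overline{X}_n^2 $$

followed by factoring $n$ out of the bracket.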
Continuous Mapping Theorem: if $X_n \overset{P}{\to} X$ and $g$ is a continuous function, then $g(X_n) \overset{P}{\to} g(X)$. The Continuous Mapping Theorem can be challenging to understand at the undergraduate level and can be accepted as an analogue of the properties of continuous functions discussed in Introduction to Analysis; in the final step it is what allows passing from $\overline{X}_n \overset{P}{\to} \mu$ to $\overline{X}_n^2 \overset{P}{\to} \mu^2$.
Definition and Equivalent Condition of Convergence in Probability: When a random variable $X$ and a sequence of random variables $\{ X_n \}$ satisfy the following, $X_n$ is said to converge in probability to $X$ as $n \to \infty$, written $X_n \overset{P}{\to} X$.
$$ \forall \varepsilon > 0 , \quad \lim_{n \to \infty} P \left[ | X_n - X | < \varepsilon \right] = 1 $$
The following equivalent form is usually preferred when working through formulas.
$$ \forall \varepsilon > 0 , \quad \lim_{n \to \infty} P \left[ | X_n - X | \ge \varepsilon \right] = 0 $$
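Note that the limit $X$ need not be a constant. A tiny sketch (hypothetical setup: $X_n := X + Z_n / n$ with independent standard normal noise $Z_n$ and $\varepsilon = 0.05$) illustrates the definition with a genuinely random limit:

```python
import numpy as np

# X_n = X + Z_n / n converges in probability to the random variable X itself.
rng = np.random.default_rng(7)
eps = 0.05
reps = 100_000

x = rng.normal(size=reps)                    # realizations of the limit X
for n in (1, 10, 100, 1_000):
    x_n = x + rng.normal(size=reps) / n      # X_n = X + Z_n / n, Z_n ~ N(0, 1)
    prob = np.mean(np.abs(x_n - x) >= eps)   # estimate of P[|X_n - X| >= ε]
    print(f"n = {n:>5}: P[|X_n - X| >= {eps}] ≈ {prob:.4f}")
```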
Chebyshev's Inequality: If the variance $\sigma^2 < \infty$ of a random variable $X$ exists, then for $\mu := E(X)$ and any positive number $K > 0$,
$$ P \left( | X - \mu | \ge K \sigma \right) \le \frac{1}{K^2} $$
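A quick numerical sanity check (arbitrary choice: $X \sim \text{Exp}(1)$, so $\mu = \sigma = 1$) confirms that the empirical left-hand side stays below $1 / K^2$:

```python
import numpy as np

# Empirically check Chebyshev's bound P(|X - μ| >= Kσ) <= 1/K² on Exp(1) data.
rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=1_000_000)   # Exp(1): μ = 1, σ = 1
mu, sigma = 1.0, 1.0

for K in (1.5, 2.0, 3.0, 5.0):
    lhs = np.mean(np.abs(x - mu) >= K * sigma)
    print(f"K = {K}: P(|X - μ| >= Kσ) ≈ {lhs:.5f} <= 1/K² = {1 / K**2:.5f}")
```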
Since the premise of the theorem includes the existence of the kurtosis, the fourth moment $E ( X_1^4 ) < \infty$ exists, so the variance of $X_1^2$ exists and can be written as $c^2 \sigma^4$ for some constant $c^2 > 0$ depending on the distribution of $X_1$. Rewritten as a formula, the mean and variance of the second sample moment are
$$ \frac{1}{n} \sum_{k=1}^{n} X_k^2 \sim \left( E ( X_1^2 ) , \frac{c^2 \sigma^4}{n} \right) $$
and for any given $\varepsilon > 0$, applying Chebyshev's inequality with the positive number $K := \sqrt{n} \varepsilon / c \sigma^2$ gives
$$ \begin{align*} & \forall \varepsilon > 0 , \quad P \left( \left| \frac{1}{n} \sum_{k=1}^{n} X_k^2 - E ( X_1^2 ) \right| \ge K \frac{c \sigma^2}{\sqrt{n}} \right) \le \frac{1}{K^2} \\ \implies & \forall \varepsilon > 0 , \quad P \left( \left| \frac{1}{n} \sum_{k=1}^{n} X_k^2 - E ( X_1^2 ) \right| \ge \varepsilon \right) \le \frac{c^2 \sigma^4}{n \varepsilon^2} \\ \implies & \forall \varepsilon > 0 , \quad \lim_{n \to \infty} P \left( \left| \frac{1}{n} \sum_{k=1}^{n} X_k^2 - E ( X_1^2 ) \right| \ge \varepsilon \right) = 0 \\ \implies & \frac{1}{n} \sum_{k=1}^{n} X_k^2 \overset{P}{\to} E ( X_1^2 ) \end{align*} $$
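To see this convergence numerically, a brief sketch (assumed distribution $\text{Exp}(1)$, for which $E ( X_1^2 ) = 2$, with $\varepsilon = 0.2$) estimates the probability bounded above:

```python
import numpy as np

# Estimate P[|(1/n)ΣX_k² - E[X₁²]| >= ε] for Exp(1) data, where E[X₁²] = 2.
rng = np.random.default_rng(3)
ex2 = 2.0
eps = 0.2
reps = 1_000     # Monte Carlo repetitions per sample size

for n in (100, 1_000, 10_000):
    m2 = (rng.exponential(scale=1.0, size=(reps, n)) ** 2).mean(axis=1)
    prob = np.mean(np.abs(m2 - ex2) >= eps)
    print(f"n = {n:>6}: P[|(1/n)ΣX_k² - E[X₁²]| >= {eps}] ≈ {prob:.4f}")
```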
Summarizing,
$$ \begin{align*} S_n^2 &= \frac{n}{n-1} \left[ \frac{1}{n} \sum_{k=1}^{n} X_k^2 - \overline{X}_n^2 \right] \\ & \overset{P}{\to} 1 \cdot \left[ E ( X_1^2 ) - \mu^2 \right] = \sigma^2 \end{align*} $$
and $S_n^2$ is a consistent estimator of the population variance $\sigma^2$. From the part where $n / (n-1) \to 1$, it can also be seen that the sample variance could in fact be divided by $(n + a)$ for an appropriate constant $a$ and would still pose no problem as a consistent estimator.
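To spell the remark out (with the hypothetical notation $S_{n,a}^2$ introduced here only for illustration), write $S_{n,a}^2 := \frac{1}{n+a} \sum_{k=1}^{n} \left( X_k - \overline{X}_n \right)^2$ for a fixed constant $a$; then

$$ S_{n,a}^2 = \frac{n-1}{n+a} S_n^2 \overset{P}{\to} 1 \cdot \sigma^2 = \sigma^2 $$

since $(n-1) / (n+a) \to 1$ for any fixed $a$.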
■
Hogg et al. (2013). Introduction to Mathematical Statistics (7th Edition): p298. ↩︎
Hogg et al. (2018). Introduction to Mathematical Statistics (8th Edition): p325. ↩︎