
Geometric Distribution

Definition 1

[pmf.gif: animation of the probability mass function of Geo(p); see the Code section below for the script that generates it.]

For $p \in (0,1]$, the discrete probability distribution $\text{Geo}(p)$ whose probability mass function is given below is called the Geometric Distribution.
$$ p(x) = p (1 - p)^{x-1} \qquad , x = 1 , 2, 3, \cdots $$


  • Take special care with the domain and the formula, as there are two definitions in use: the one above counts the trial on which the first success occurs ($x = 1, 2, 3, \cdots$), while the other counts the number of failures before the first success, with $p(x) = p(1-p)^{x}$ for $x = 0, 1, 2, \cdots$ (compare the sketch below).
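
As a rough illustration of the difference (a sketch, assuming the Distributions.jl package; the value of p is arbitrary), the pmf defined above can be evaluated by hand and compared with Distributions.jl's Geometric, which uses the failure-count convention:

using Distributions

p = 0.3
# pmf as defined above: x is the trial on which the first success occurs (x = 1, 2, 3, ...)
pmf_trials(x) = p * (1 - p)^(x - 1)

# Distributions.jl's Geometric(p) counts failures before the first success (support 0, 1, 2, ...)
d = Geometric(p)

for x in 1:5
    println(x, "  ", pmf_trials(x), "  ", pdf(d, x - 1))   # same values, index shifted by one
end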

Basic Properties

Moment Generating Function

  • [1]: $$ m(t) = {{ p e^{t} } \over { 1 - (1-p) e^{t} }} \qquad , t < -\log (1-p) $$

Mean and Variance

  • [2]: If $X \sim \text{Geo}(p)$ then $$ \begin{align*} E(X) =& {{ 1 } \over { p }} \\ \operatorname{Var}(X) =& {{ 1-p } \over { p^{2} }} \end{align*} $$
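
A minimal Monte Carlo sanity check of these formulas (a sketch, assuming Distributions.jl; the choice of p, the seed, and the sample size are arbitrary). Adding 1 converts Distributions.jl's failure count to the trial count $X$ used here.

using Distributions, Statistics, Random

Random.seed!(0)
p = 0.25
X = rand(Geometric(p), 10^6) .+ 1          # shift failure counts to trial counts

println(mean(X), "  vs  ", 1 / p)          # E(X) = 1/p
println(var(X), "  vs  ", (1 - p) / p^2)   # Var(X) = (1-p)/p²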

Sufficient Statistic and Maximum Likelihood Estimator

  • [3]: Suppose a random sample $\mathbf{X} := \left( X_{1} , \cdots , X_{n} \right) \sim \text{Geo} \left( p \right)$ is given. The sufficient statistic $T$ and maximum likelihood estimator $\hat{p}$ for $p$ are as follows. $$ \begin{align*} T =& \sum_{k=1}^{n} X_{k} \\ \hat{p} =& {{ n } \over { \sum_{k=1}^{n} X_{k} }} \end{align*} $$
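
The estimator is easy to compute from data; below is a small simulated example (a sketch, assuming Distributions.jl; p, n, and the seed are arbitrary).

using Distributions, Random

Random.seed!(0)
p, n = 0.4, 10_000
X = rand(Geometric(p), n) .+ 1   # sample in the trial-count convention used above

T = sum(X)       # sufficient statistic
p̂ = n / T        # maximum likelihood estimator, close to the true p = 0.4
println("T = $T,  p̂ = $p̂")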

Theorems

Memorylessness

  • [a]: If $X \sim \text{Geo}(p)$ then $$ P(X \ge s + t \mid X \ge s) = P(X \ge t) $$
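
In the failure-count convention (the second definition mentioned above, and the one Distributions.jl's Geometric uses), $P(X \ge t) = (1-p)^{t}$ and the identity can be checked by simulation (a sketch; p, s, t, the seed, and the sample size are arbitrary):

using Distributions, Random

Random.seed!(0)
p, s, t = 0.2, 3, 5
X = rand(Geometric(p), 10^6)   # failure-count convention, support 0, 1, 2, ...

lhs = count(x -> x ≥ s + t, X) / count(x -> x ≥ s, X)   # P(X ≥ s+t | X ≥ s)
rhs = count(x -> x ≥ t, X) / length(X)                  # P(X ≥ t)
println(lhs, "  ≈  ", rhs, "  ≈  ", (1 - p)^t)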

Generalization to the Negative Binomial Distribution

  • [b]: If $Y = X_{1} + \cdots + X_{r}$ and $X_{i} \overset{\text{iid}}{\sim} \text{Geo}(p)$ then $Y \sim \text{NB}(r,p)$
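
This can also be checked by simulation. In Distributions.jl both Geometric(p) and NegativeBinomial(r, p) count failures rather than trials, so the conventions match and the sum of r independent geometric draws can be compared with NegativeBinomial(r, p) directly (a sketch; p, r, the seed, and the sample size are arbitrary):

using Distributions, Statistics, Random

Random.seed!(0)
p, r = 0.3, 4
Y = [sum(rand(Geometric(p), r)) for _ in 1:10^5]   # sum of r iid geometric draws

nb = NegativeBinomial(r, p)
println(mean(Y), "  vs  ", mean(nb))               # both close to r(1-p)/p
for k in 0:4
    println(k, "  ", count(==(k), Y) / length(Y), "  ", pdf(nb, k))
end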

Explanation

Relation with Exponential Distribution

The geometric distribution concerns how many trials are needed to obtain the first success when each trial succeeds with probability $0 < p \le 1$. Its probability mass function is the probability of failing $x-1$ times, each with probability $(1-p)$, and then succeeding with probability $p$ on trial $x$. This characteristic allows it to be seen as a discretization of the exponential distribution.
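
To make the discretization concrete (a sketch, assuming Distributions.jl; the rate λ, the seed, and the sample size are arbitrary): rounding an exponential waiting time with rate λ up to the next integer gives a trial count that follows the geometric distribution with $p = 1 - e^{-\lambda}$, which the simulation below illustrates.

using Distributions, Random

Random.seed!(0)
λ = 0.5
p = 1 - exp(-λ)                       # success probability of the matching geometric
T = rand(Exponential(1 / λ), 10^6)    # Distributions.jl parameterizes Exponential by its scale 1/λ
X = ceil.(Int, T)                     # round waiting times up to whole trials

for x in 1:5
    println(x, "  ", count(==(x), X) / length(X), "  ", p * (1 - p)^(x - 1))
end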

Naming

The distribution is called the geometric distribution because its probability mass function has the form of a geometric sequence. Setting $a := p$ and $r := (1-p)$ gives the familiar formula $p(x) = a r^{x-1}$. Indeed, the formula for a geometric series appears when computing the moment generating function.

Proof

[1]

$$ \begin{align*} m(t) =& \sum_{x=1}^{\infty} e^{tx} p(x) \\ =& \sum_{x=1}^{\infty} e^{tx} p (1-p)^{x-1} \\ =& p e^{t} \sum_{x=1}^{\infty} \left[ e^{t}(1-p) \right]^{x-1} \end{align*} $$ When $t < -\log (1-p)$, the common ratio satisfies $e^{t}(1-p) < 1$, so by the formula for a geometric series $$ p e^{t} \sum_{x=1}^{\infty} \left[ e^{t}(1-p) \right]^{x-1} = {{ p e^{t} } \over { 1 - (1-p) e^{t} }} $$
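
A quick numerical check of the closed form by truncating the series (a sketch; p and t are arbitrary, with t chosen so that $t < -\log(1-p)$):

p, t = 0.3, 0.2                     # -log(1 - 0.3) ≈ 0.357, so t = 0.2 is in range
series = sum(exp(t * x) * p * (1 - p)^(x - 1) for x in 1:10^4)   # truncated series for m(t)
closed = p * exp(t) / (1 - (1 - p) * exp(t))
println(series, "  ≈  ", closed)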

[2]

There are two methods: compute the sums directly from the definition, or differentiate the moment generating function.

[3]

Direct deduction.

[a]

Deduced using conditional probability.

[b]

Deduced using the moment generating function.

Code

Below is Julia code that renders the probability mass function of the geometric distribution as a gif.

@time using LaTeXStrings
@time using Distributions
@time using Plots

cd(@__DIR__)   # save the gif next to this script

x = 0:20
P = collect(0.01:0.01:0.5); append!(P, reverse(P))   # sweep p up to 0.5 and back down

# Note: Distributions.jl's Geometric(p) counts failures before the first success,
# so its support starts at 0 (the second of the two definitions mentioned above).
animation = @animate for p ∈ P
    scatter(x, pdf.(Geometric(p), x),
     color = :black, markerstrokecolor = :black,
     label = "p = $(rpad(p, 4, '0'))", size = (400,300))
    xlims!(0,20); ylims!(0,0.3); title!(L"\mathrm{pmf\,of\,Geo}(p)")
end
gif(animation, "pmf.gif")

  1. Hogg et al. (2013). Introduction to Mathematical Statistics (7th Edition): p145.