Differences Between the Two Definitions of the Geometric Distribution

Description

When studying the geometric distribution, one of the most confusing aspects is that explanations differ across textbooks, blogs, and wikis. Some sources give the mean as $\displaystyle {{1} \over {p}} $, while others give $\displaystyle {{1-p} \over {p}}$.

This discrepancy arises because there are two ways to define the geometric distribution. The probability mass function of the geometric distribution $\text{Geo}(p)$ is defined either as $$ p_{1}(x) = p(1-p)^{x-1} , \qquad x = 1,2,3,\cdots $$ or as $$ p_{2}(x) = p(1-p)^{x} , \qquad x = 0,1,2,\cdots $$ The expectation is determined by the probability mass function: using $p_{1}$ gives $\displaystyle {{1} \over {p}}$, and using $p_{2}$ gives $\displaystyle {{1-p} \over {p}}$.
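The two means can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names, the value $p = 0.25$, and the truncation at 10,000 terms are illustrative assumptions, not part of either definition.

```python
def pmf_trials(x, p):
    """p_1: probability that the first success occurs on trial x (x = 1, 2, 3, ...)."""
    return p * (1 - p) ** (x - 1)


def pmf_failures(x, p):
    """p_2: probability of x failures before the first success (x = 0, 1, 2, ...)."""
    return p * (1 - p) ** x


def truncated_mean(pmf, p, start, n_terms=10_000):
    """Approximate E[X] by summing x * pmf(x) over the first n_terms support points."""
    return sum(x * pmf(x, p) for x in range(start, start + n_terms))


p = 0.25  # illustrative success probability (an assumption for this example)

mean_trials = truncated_mean(pmf_trials, p, start=1)      # should be close to 1 / p
mean_failures = truncated_mean(pmf_failures, p, start=0)  # should be close to (1 - p) / p

print(mean_trials, 1 / p)
print(mean_failures, (1 - p) / p)
```

The truncated sums agree with the closed forms up to floating-point error, and the two means differ by exactly $1$.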

Upon examining the probability mass functions, one can see there is no fundamental difference between the two definitions; it merely depends on whether one starts counting from $1$ or from $0$. In terms of the intuitive definition of the geometric distribution, the two correspond to two perspectives: counting the number of trials up to and including the first ‘success’, or counting the number of failures before the first success. If success occurs on the first trial, the number of trials is $1$ and the number of failures is $0$.
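The shift by one can be written out explicitly. Here $X_{1}$ and $X_{2}$ are notation introduced only for this illustration, denoting random variables with probability mass functions $p_{1}$ and $p_{2}$ respectively. Since $p_{1}(x) = p_{2}(x-1)$, the variable $X_{1}$ has the same distribution as $X_{2} + 1$, so

$$
\begin{aligned}
E(X_{1}) &= E(X_{2}) + 1 = {{1-p} \over {p}} + 1 = {{1} \over {p}} \\
\text{Var}(X_{1}) &= \text{Var}(X_{2}) = {{1-p} \over {p^{2}}}
\end{aligned}
$$

In other words, the means differ by exactly $1$, while the variance is the same under both definitions.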

Moreover, because the geometric distribution is memoryless, it can be used in survival analysis. When the event of interest is itself a breakdown, one is interested in how many times the subject holds out before the event occurs. In such cases, it makes sense to count the number of ‘failures’, that is, the trials on which the event did not occur.
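The memoryless property holds under both definitions, as long as the tail event is stated to match the counting convention. Below is a minimal sketch, assuming illustrative values $p = 0.3$, $m = 4$, $n = 6$ and hypothetical function names; it compares the conditional tail probability with the unconditional one.

```python
def survival_trials(n, p):
    """P(X > n) for X on {1, 2, ...}: one minus the pmf p_1 summed over x = 1..n."""
    return 1 - sum(p * (1 - p) ** (x - 1) for x in range(1, n + 1))


def survival_failures(n, p):
    """P(Y >= n) for Y on {0, 1, ...}: one minus the pmf p_2 summed over x = 0..n-1."""
    return 1 - sum(p * (1 - p) ** x for x in range(0, n))


p, m, n = 0.3, 4, 6  # illustrative values (assumptions for this example)

# Trials convention: P(X > m + n | X > m) should equal P(X > n).
cond_trials = survival_trials(m + n, p) / survival_trials(m, p)

# Failures convention: P(Y >= m + n | Y >= m) should equal P(Y >= n).
cond_failures = survival_failures(m + n, p) / survival_failures(m, p)

print(cond_trials, survival_trials(n, p))
print(cond_failures, survival_failures(n, p))
```

In both conventions the conditional probability matches the unconditional one, both being $(1-p)^{n}$: having already survived $m$ trials does not change the distribution of the remaining wait.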

Ultimately, choosing a particular probability mass function depends on the subject of interest, convenience, and convention. It’s best not to overthink it and simply use whichever you prefer.