Mean and Variance of the Binomial Distribution 📂Probability Distribution

Formulas

If $X \sim \text{Bin}(n,p)$, then
$$
\begin{align*}
E(X) &= np \\
\operatorname{Var}(X) &= npq
\end{align*}
$$


  • Here $q := 1-p$.
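As a quick sanity check of the two formulas, the sketch below computes the mean and variance by brute force from the probability mass function and compares them with $np$ and $npq$. The function names and the values $n = 10$, $p = 3/10$ are illustrative assumptions, not part of the original text. Exact fractions are used so the comparison does not depend on floating-point rounding.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Return [P(X=0), ..., P(X=n)] for X ~ Bin(n, p)."""
    q = 1 - p
    return [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]


def mean_and_variance(n, p):
    """Compute E(X) and Var(X) directly from the pmf, without using np or npq."""
    pmf = binomial_pmf(n, p)
    mean = sum(k * w for k, w in enumerate(pmf))
    second_moment = sum(k * k * w for k, w in enumerate(pmf))
    return mean, second_moment - mean**2


# Illustrative parameters (an assumption chosen for this example only).
n, p = 10, Fraction(3, 10)
q = 1 - p

mean, variance = mean_and_variance(n, p)
print("brute-force mean:", mean, " formula np:", n * p)
print("brute-force variance:", variance, " formula npq:", n * p * q)
```

Both pairs of numbers should agree for any valid choice of $n$ and $p$; this is only a check on specific values, not a substitute for the derivation.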

Derivation

Strategy: Expand the binomial coefficients directly. The expressions get messy, but everything stays within high-school mathematics, so it is worth working through at least once. In mathematical statistics there is a much shorter and simpler proof. For both the mean and the variance, start from the following probability mass function of the binomial distribution.

Definition of Binomial Distribution: For $n \in \mathbb{N}$ and $p \in [0,1]$, the discrete probability distribution $\text{Bin}(n,p)$ with the following probability mass function is called the binomial distribution.
$$
p(x) = \binom{n}{x} p^{x} (1-p)^{n-x}, \qquad x = 0, 1, \cdots, n
$$
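To make the definition concrete, here is a minimal sketch that tabulates the probability mass function for one small case. The helper name `pmf` and the choice $n = 4$, $p = 1/2$ are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb


def pmf(x, n, p):
    """p(x) = C(n, x) * p^x * (1 - p)^(n - x) for x = 0, 1, ..., n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Illustrative case: four fair coin flips.
n, p = 4, Fraction(1, 2)
table = [pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(table):
    print(f"p({x}) = {prob}")

# A probability mass function must sum to 1 over its support.
total = sum(table)
print("total =", total)
```

The coefficients 1, 4, 6, 4, 1 of the fourth row of Pascal's triangle appear in the numerators before the fractions are reduced.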

Mean

Since the probability mass function of the binomial distribution $\text{Bin}(n,p)$ is $p(k) = {}_n C_k \, p^{k} (1-p)^{n-k}$, writing $q = 1-p$ gives
$$
E(X) = \sum_{k=0}^{n} k \, {}_n C_k \, p^{k} q^{n-k}
$$
When $k=0$, the term $k \, {}_n C_k \, p^{k} q^{n-k}$ is $0$, so
$$
\begin{align*}
E(X) &= \sum_{k=1}^{n} k \, {}_n C_k \, p^{k} q^{n-k} \\
&= \sum_{k=1}^{n} k \frac{n!}{(n-k)! \, k!} p^{k} q^{n-k} \\
&= np \sum_{k=1}^{n} \frac{(n-1)!}{(n-k)! \, (k-1)!} p^{k-1} q^{n-k}
\end{align*}
$$
Setting $m = n-1$ and $s = k-1$, so that $n-k = m-s$,
$$
\begin{align*}
E(X) &= np \sum_{s=0}^{m} \frac{m!}{(m-s)! \, s!} p^{s} q^{m-s} \\
&= np \cdot 1 \\
&= np
\end{align*}
$$
The sum equals $1$ because it is the total probability of $\text{Bin}(m,p)$, that is, the binomial expansion of $(p+q)^{m}$ with $p+q=1$.
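The two steps that carry the mean derivation are the identity $k \, {}_n C_k = n \, {}_{n-1} C_{k-1}$, which lets $np$ be factored out, and the fact that the reindexed sum is a full binomial expansion. The sketch below checks both numerically for one case; the values $n = 6$, $p = 1/3$ and the variable names are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb

# Illustrative parameters (an assumption for this example only).
n, p = 6, Fraction(1, 3)
q = 1 - p
m = n - 1

# Step 1: k * C(n, k) == n * C(n-1, k-1) for 1 <= k <= n,
# which is what allows the factor n to be pulled out of the sum.
identity_holds = all(
    k * comb(n, k) == n * comb(m, k - 1) for k in range(1, n + 1)
)

# Step 2: with s = k - 1, the remaining sum is the total probability of Bin(m, p).
reindexed_sum = sum(comb(m, s) * p**s * q ** (m - s) for s in range(m + 1))

# Direct evaluation of E(X) from the definition, for comparison with n * p.
direct_mean = sum(k * comb(n, k) * p**k * q ** (n - k) for k in range(n + 1))

print("identity holds:", identity_holds)
print("reindexed sum:", reindexed_sum)
print("direct mean:", direct_mean, " n*p:", n * p)
```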

Variance

From the properties of variance,
$$
\operatorname{Var}(X) = E(X^{2}) - \{E(X)\}^{2} = E(X^{2}) - (np)^{2}
$$
As with the mean, the $k=0$ term vanishes, so
$$
\begin{align*}
E(X^{2}) &= \sum_{k=1}^{n} k^{2} \frac{n!}{(n-k)! \, k!} p^{k} q^{n-k} \\
&= np \sum_{k=1}^{n} k \frac{(n-1)!}{(n-k)! \, (k-1)!} p^{k-1} q^{n-k}
\end{align*}
$$
Setting $m = n-1$ and $s = k-1$,
$$
\begin{align*}
E(X^{2}) &= np \sum_{s=0}^{m} (s+1) \frac{m!}{(m-s)! \, s!} p^{s} q^{m-s} \\
&= np \left( \sum_{s=0}^{m} s \frac{m!}{(m-s)! \, s!} p^{s} q^{m-s} + \sum_{s=0}^{m} \frac{m!}{(m-s)! \, s!} p^{s} q^{m-s} \right) \\
&= np \left( \sum_{s=0}^{m} s \frac{m!}{(m-s)! \, s!} p^{s} q^{m-s} + 1 \right)
\end{align*}
$$
The remaining sum is the expected value of $S \sim \text{Bin}(m,p)$, so by the result for the mean, $\displaystyle \sum_{s=0}^{m} s \frac{m!}{(m-s)! \, s!} p^{s} q^{m-s} = mp$. Therefore
$$
\begin{align*}
E(X^{2}) &= np(mp+1) \\
&= np\{(n-1)p+1\} \\
&= np(np-p+1) \\
&= np(np+q) \\
&= (np)^{2} + npq
\end{align*}
$$
Thus,
$$
\begin{align*}
\operatorname{Var}(X) &= E(X^{2}) - (np)^{2} \\
&= (np)^{2} + npq - (np)^{2} \\
&= npq
\end{align*}
$$
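The intermediate result $E(X^{2}) = np(mp+1) = (np)^{2} + npq$ can be checked the same way. The following sketch evaluates each piece separately for one case; the values $n = 8$, $p = 1/4$ and the helper name `pmf` are assumptions for illustration only.

```python
from fractions import Fraction
from math import comb

# Illustrative parameters (an assumption for this example only).
n, p = 8, Fraction(1, 4)
q = 1 - p
m = n - 1


def pmf(trials, k):
    """Binomial pmf with the fixed p above and a variable number of trials."""
    return comb(trials, k) * p**k * q ** (trials - k)


# E(X^2) computed directly from the definition.
second_moment = sum(k * k * pmf(n, k) for k in range(n + 1))

# The inner sum of the derivation: the mean of S ~ Bin(m, p), expected to be m * p.
mean_of_s = sum(s * pmf(m, s) for s in range(m + 1))

# E(X^2) rebuilt from the decomposition np * (E(S) + 1).
decomposed = n * p * (mean_of_s + 1)

variance = second_moment - (n * p) ** 2

print("E(X^2) direct:", second_moment, " decomposed:", decomposed)
print("(np)^2 + npq:", (n * p) ** 2 + n * p * q)
print("variance:", variance, " npq:", n * p * q)
```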