
Beta Distribution

Definition 1

[Animated pdfs of Beta(0.5, β), Beta(1, β), and Beta(2, β): pdf0.gif, pdf1.gif, pdf2.gif]

For $\alpha , \beta > 0$, the continuous probability distribution $\text{Beta}(\alpha,\beta)$, called the beta distribution, has the following probability density function:
$$ f(x) = {{ 1 } \over { B(\alpha,\beta) }} x^{\alpha - 1} (1-x)^{\beta - 1} \qquad , x \in [0,1] $$
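
As a quick sanity check of this density (a minimal sketch, not part of the original text; the parameters α = 2, β = 5 are arbitrary), one can compare it with the Beta pdf from Distributions.jl, using SpecialFunctions.jl for $B(\alpha,\beta)$:

using Distributions, SpecialFunctions

α, β = 2.0, 5.0                                  # arbitrary example parameters
f(x) = x^(α - 1) * (1 - x)^(β - 1) / beta(α, β)  # the pdf as written above

for x in (0.1, 0.5, 0.9)
    @assert isapprox(f(x), pdf(Beta(α, β), x); rtol = 1e-12)
end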


Basic Properties

Moment Generating Function

  • [1]: $$m(t) = 1 + \sum_{k=1}^{\infty} \left( \prod_{r=0}^{k-1} {{ \alpha + r } \over { \alpha + \beta + r }} \right) {{ t^{k} } \over { k! }} \qquad , t \in \mathbb{R}$$
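
A hedged numerical check of this formula (illustrative only; the values of α, β, t and the truncation order K are arbitrary) compares a truncated series with direct numerical integration of $e^{tx} f(x)$ via QuadGK.jl:

using Distributions, QuadGK

α, β, t = 2.0, 3.0, 0.7          # arbitrary example values
K = 60                           # truncation order of the series

series   = 1 + sum(prod((α + r) / (α + β + r) for r in 0:k-1) * t^k / factorial(big(k)) for k in 1:K)
integral = quadgk(x -> exp(t * x) * pdf(Beta(α, β), x), 0, 1)[1]
println(Float64(series), " ≈ ", integral)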

Mean and Variance

  • [2]: If $X \sim \text{Beta}(\alpha,\beta)$, then
$$ \begin{align*} E(X) =& {\alpha \over {\alpha + \beta} } \\ \operatorname{Var} (X) =& { { \alpha \beta } \over {(\alpha + \beta + 1) { ( \alpha + \beta ) }^2 } } \end{align*} $$
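
These formulas can be checked directly against Distributions.jl (a minimal sketch; the parameters are arbitrary):

using Distributions

α, β = 2.0, 5.0
d = Beta(α, β)
@assert mean(d) ≈ α / (α + β)
@assert var(d) ≈ α * β / ((α + β + 1) * (α + β)^2)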

Sufficient Statistics

  • [3]: Suppose a random sample $\mathbf{X} := \left( X_{1} , \cdots , X_{n} \right) \sim \text{Beta} \left( \alpha, \beta \right)$ following a beta distribution is given.

The sufficient statistic $T$ for $\left( \alpha, \beta \right)$ is as follows:
$$ T = \left( \prod_{i} X_{i}, \prod_{i} \left( 1 - X_{i} \right) \right) $$
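
For illustration (a sketch using a simulated sample, not part of the statement above), $T$ can be computed from data as follows; in practice one works with logarithms of the two products to avoid underflow.

using Distributions, Random

Random.seed!(0)
X = rand(Beta(2.0, 5.0), 100)               # simulated sample of size n = 100
T = (prod(X), prod(1 .- X))                 # T = (∏ Xᵢ, ∏ (1 − Xᵢ))
logT = (sum(log.(X)), sum(log1p.(-X)))      # log-scale version to avoid underflow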

Theorems

Derivation from Gamma Distribution

  • [a]: If two random variables $X_{1},X_{2}$ are independent with $X_{1} \sim \Gamma ( \alpha_{1} , 1)$ and $X_{2} \sim \Gamma ( \alpha_{2} , 1)$, then
$$ {{ X_{1} } \over { X_{1} + X_{2} }} \sim \text{Beta} \left( \alpha_{1} , \alpha_{2} \right) $$
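
A Monte Carlo sketch of this fact (illustrative; α₁ = 2, α₂ = 3 and the sample size are arbitrary) compares the ratio of simulated gamma variables with the moments of Beta(α₁, α₂):

using Distributions, Random, Statistics

Random.seed!(0)
α₁, α₂, n = 2.0, 3.0, 10^6
X₁ = rand(Gamma(α₁, 1), n)                  # Γ(α₁, 1), scale parameter 1
X₂ = rand(Gamma(α₂, 1), n)
R = X₁ ./ (X₁ .+ X₂)

println(mean(R), " ≈ ", mean(Beta(α₁, α₂)))
println(var(R), " ≈ ", var(Beta(α₁, α₂)))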

Derivation from F-distribution

  • [b]: For a random variable $X \sim F \left( r_{1}, r_{2} \right)$ following an F-distribution with degrees of freedom $r_{1} , r_{2}$, the random variable $Y$ defined below follows a beta distribution.
$$ Y := {{ \left( r_{1} / r_{2} \right) X } \over { 1 + \left( r_{1} / r_{2} \right) X }} \sim \text{Beta} \left( {{ r_{1} } \over { 2 }} , {{ r_{2} } \over { 2 }} \right) $$
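
Similarly, a Monte Carlo sketch (illustrative; the degrees of freedom are arbitrary) transforms F-distributed samples and compares empirical quantiles with those of Beta(r₁/2, r₂/2):

using Distributions, Random, Statistics

Random.seed!(0)
r₁, r₂, n = 5.0, 7.0, 10^6
X = rand(FDist(r₁, r₂), n)
Y = (r₁ / r₂) .* X ./ (1 .+ (r₁ / r₂) .* X)

for p in (0.25, 0.5, 0.75)
    println(quantile(Y, p), " ≈ ", quantile(Beta(r₁ / 2, r₂ / 2), p))
end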

Description

Just as the gamma distribution comes from the gamma function, the beta distribution is named after the beta function. The beta function has the following relationship with the gamma function, allowing it to be expressed in terms of gamma functions:
$$ B(p,q) = {{\Gamma (p) \Gamma (q)} \over {\Gamma (p+q) }} $$
In fact, a beta distribution can be derived from gamma distributions.
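
This identity can be verified numerically with SpecialFunctions.jl (a minimal sketch; p and q are arbitrary):

using SpecialFunctions

p, q = 2.5, 4.0
@assert beta(p, q) ≈ gamma(p) * gamma(q) / gamma(p + q)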

Just as the beta function can be seen as a generalization of the binomial coefficient, a careful look at the beta distribution’s probability density function reveals its resemblance to the probability mass function of a binomial distribution, $P(k) = { _n {C} _k }{ p ^ k }{ (1-p) ^ { n - k } }$. Although the correspondence does not match the definition of the beta distribution exactly, if one regards $\alpha$ as the number of successes and $\beta$ as the number of failures, the resemblance shows up in
$$ n = \alpha + \beta \\ \displaystyle p = {{\alpha } \over {\alpha + \beta}} \\ \displaystyle q = {{\beta } \over {\alpha + \beta}} $$
In fact, in Bayesian analysis, it is used as the conjugate prior distribution of the binomial distribution.
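
As a minimal conjugacy sketch (an assumed example, not from the original text): with a Beta(α, β) prior on the success probability and $k$ successes observed in $n$ binomial trials, the posterior is Beta(α + k, β + n − k).

using Distributions

α, β = 2.0, 2.0                      # prior pseudo-counts of successes and failures
n, k = 10, 7                         # observed trials and successes
posterior = Beta(α + k, β + n - k)   # conjugate update
println(mean(posterior))             # (α + k) / (α + β + n) = 9/14 ≈ 0.643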

Proof

[1]

Though the equations are complex, there is no logical difficulty.

Exponential function series expansion:
$$ e ^ x = \sum _{ n=0 }^{ \infty }{ \frac { { x } ^{ n } }{ n! } } $$

Euler integral of the first kind:
$$ B(p,q)=\int_0^1 t^{p-1}(1-t)^{q-1}dt $$

$$ \begin{align*} m(t) =& \int_{0}^{1} e^{tx} {{ 1 } \over { B(\alpha,\beta) }} x^{\alpha - 1} (1-x)^{\beta - 1} dx \\ =& {{ 1 } \over { B(\alpha,\beta) }} \int_{0}^{1} \left( \sum_{k=0}^{\infty} {{ (tx)^{k} } \over { k! }} \right) x^{\alpha - 1} (1-x)^{\beta - 1} dx \\ =& {{ 1 } \over { B(\alpha,\beta) }} \sum_{k=0}^{\infty} {{ t^{k} } \over { k! }} \int_{0}^{1} x^{\alpha + k - 1} (1-x)^{\beta - 1} dx \\ =& {{ 1 } \over { B(\alpha,\beta) }} \sum_{k=0}^{\infty} {{ t^{k} } \over { k! }} B \left( \alpha + k , \beta \right) \\ =& \sum_{k=0}^{\infty} {{ t^{k} } \over { k! }} {{ B \left( \alpha + k , \beta \right) } \over { B(\alpha,\beta) }} \\ =& {{ t^{0} } \over { 0! }} {{ B \left( \alpha + 0 , \beta \right) } \over { B(\alpha,\beta) }} + \sum_{k=1}^{\infty} {{ t^{k} } \over { k! }} {{ B \left( \alpha + k , \beta \right) } \over { B(\alpha,\beta) }} \end{align*} $$

Relationship between the beta function and the gamma function:
$$ B(p,q) = {{\Gamma (p) \Gamma (q)} \over {\Gamma (p+q) }} $$

Expanding the beta functions into gamma functions and repeatedly applying the recurrence $\Gamma (z+1) = z \Gamma (z)$, so that $\Gamma ( \alpha + k ) = \Gamma ( \alpha ) \prod_{r=0}^{k-1} ( \alpha + r )$, results in:

$$ \begin{align*} m(t) =& 1 + \sum_{k=1}^{\infty} {{ t^{k} } \over { k! }} {{ B \left( \alpha + k , \beta \right) } \over { B(\alpha,\beta) }} \\ =& 1 + \sum_{k=1}^{\infty} {{ t^{k} } \over { k! }} {{ \Gamma ( \alpha + k ) \Gamma ( \beta ) } \over { \Gamma \left( \alpha + \beta + k \right) }} {{ \Gamma ( \alpha + \beta ) } \over { \Gamma \left( \alpha \right) \Gamma \left( \beta \right) }} \\ =& 1 + \sum_{k=1}^{\infty} {{ t^{k} } \over { k! }} {{ \Gamma ( \alpha + k ) } \over { \Gamma \left( \alpha + \beta + k \right) }} {{ \Gamma ( \alpha + \beta ) } \over { \Gamma \left( \alpha \right) }} \\ =& 1 + \sum_{k=1}^{\infty} {{ t^{k} } \over { k! }} {{ \Gamma ( \alpha + k ) } \over { \Gamma \left( \alpha \right) }} {{ \Gamma ( \alpha + \beta ) } \over { \Gamma \left( \alpha + \beta + k \right) }} \\ =& 1 + \sum_{k=1}^{\infty} {{ t^{k} } \over { k! }} {{ \Gamma ( \alpha ) \prod_{r=0}^{k-1} ( \alpha + r) } \over { \Gamma \left( \alpha \right) }} {{ \Gamma ( \alpha + \beta ) } \over { \Gamma \left( \alpha + \beta \right) \prod_{r=0}^{k-1} ( \alpha + \beta + r) }} \\ =& 1 + \sum_{k=1}^{\infty} {{ t^{k} } \over { k! }} \prod_{r=0}^{k-1} {{ \alpha + r } \over { \alpha + \beta + r }} \end{align*} $$

[2]

Derive directly.

[3]

Though the $(1 - x)$ term may make one uncomfortable, just derive directly.

[a]

Derive directly using the probability density function.

[b]

Derive directly using the probability density function.

Code

Below is the Julia code that displays the probability density function of the beta distribution as animated GIFs.

@time using LaTeXStrings
@time using Distributions
@time using Plots

cd(@__DIR__)

x = 0:0.01:1
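# β sweeps from 0.1 up to 10.0 and back down so that the resulting GIFs loop smoothly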
B = collect(0.1:0.1:10.0); append!(B, reverse(B))

animation = @animate for β ∈ B
    plot(x, pdf.(Beta(0.5, β), x),
     color = :black,
     label = "α = 0.5, β = $(rpad(β, 3, '0'))", size = (400,300))
    xlims!(0,1); ylims!(0,5); title!(L"\mathrm{pdf\,of\,Beta} (0.5, \beta)")
end
gif(animation, "pdf0.gif")

animation = @animate for β ∈ B
    plot(x, pdf.(Beta(1, β), x),
     color = :black,
     label = "α = 1, β = $(rpad(β, 3, '0'))", size = (400,300))
    xlims!(0,1); ylims!(0,5); title!(L"\mathrm{pdf\,of\,Beta} (1, \beta)")
end
gif(animation, "pdf1.gif")

animation = @animate for β ∈ B
    plot(x, pdf.(Beta(2, β), x),
     color = :black,
     label = "α = 2, β = $(rpad(β, 3, '0'))", size = (400,300))
    xlims!(0,1); ylims!(0,5); title!(L"\mathrm{pdf\,of\,Beta} (2, \beta)")
end
gif(animation, "pdf2.gif")

  1. Hogg et al. (2013). Introduction to Mathematical Statistics (7th Edition): p. 165. ↩︎