Beta Distribution
Definition

For $\alpha, \beta > 0$, the continuous probability distribution $\text{Beta}(\alpha, \beta)$ with the following probability density function is called the beta distribution:
$$
f(x) = \frac{1}{B(\alpha, \beta)} x^{\alpha - 1} (1 - x)^{\beta - 1}, \qquad x \in [0, 1]
$$
Basic Properties
Moment Generating Function
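- [1]:
$$
m(t) = 1 + \sum_{k=1}^{\infty} \left( \prod_{r=0}^{k-1} \frac{\alpha + r}{\alpha + \beta + r} \right) \frac{t^{k}}{k!}, \qquad t \in \mathbb{R}
$$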
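Mean and Variance
- [2]: If $X \sim \text{Beta}(\alpha, \beta)$, then
$$
\begin{aligned}
E(X) &= \frac{\alpha}{\alpha + \beta}
\\ \text{Var}(X) &= \frac{\alpha \beta}{(\alpha + \beta + 1)(\alpha + \beta)^{2}}
\end{aligned}
$$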
Sufficient Statistic
- [3]: Suppose a random sample $\mathbf{X} := (X_{1}, \cdots, X_{n})$ from $\text{Beta}(\alpha, \beta)$ is given. Then a sufficient statistic $T$ for $(\alpha, \beta)$ is
$$
T = \left( \prod_{i} X_{i}, \prod_{i} (1 - X_{i}) \right)
$$
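Properties [1] and [2] can be sanity-checked numerically. The following is a minimal sketch rather than part of the derivation: `mgf_series` and `expect` are hypothetical helper names, the truncation level `K`, the grid size `N`, and the parameter values are arbitrary choices, and the expectation is approximated with a plain midpoint rule.

```julia
using Distributions

# Truncated version of the series in [1]; K terms are kept (assumed to be enough here).
function mgf_series(α, β, t; K = 60)
    total = 1.0
    term = 1.0   # holds ∏_{r=0}^{k-1} (α+r)/(α+β+r) · t^k / k! for the current k
    for k in 1:K
        term *= (α + k - 1) / (α + β + k - 1) * t / k
        total += term
    end
    return total
end

# Midpoint-rule approximation of E[g(X)] = ∫₀¹ g(x) f(x) dx for a distribution d on [0, 1].
function expect(g, d; N = 100_000)
    h = 1 / N
    total = 0.0
    for i in 1:N
        x = (i - 0.5) * h
        total += g(x) * pdf(d, x)
    end
    return total * h
end

α, β, t = 2.0, 3.0, 1.5   # illustrative values only
d = Beta(α, β)

println("mgf:  series = ", mgf_series(α, β, t), ", integral = ", expect(x -> exp(t * x), d))
println("mean: library = ", mean(d), ", formula = ", α / (α + β))
println("var:  library = ", var(d), ", formula = ", α * β / ((α + β + 1) * (α + β)^2))
```

The two numbers on each printed line should agree up to the approximation error of the series truncation and the midpoint rule.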
Theorems
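- [a]: If two random variables $X_{1}, X_{2}$ are independent with $X_{1} \sim \Gamma(\alpha_{1}, 1)$ and $X_{2} \sim \Gamma(\alpha_{2}, 1)$, then
$$
\frac{X_{1}}{X_{1} + X_{2}} \sim \text{Beta}(\alpha_{1}, \alpha_{2})
$$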
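- [b]: If a random variable $X \sim F(r_{1}, r_{2})$ follows an F-distribution with degrees of freedom $r_{1}, r_{2}$, then $Y$ defined below follows a beta distribution.
$$
Y := \frac{(r_{1}/r_{2}) X}{1 + (r_{1}/r_{2}) X} \sim \text{Beta}\left( \frac{r_{1}}{2}, \frac{r_{2}}{2} \right)
$$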
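Both theorems can be illustrated by simulation. The sketch below is only a plausibility check under stated assumptions: the seed, the sample size `n`, and the parameter values are arbitrary, and it compares just the first two sample moments with those of the target beta distribution instead of running a formal goodness-of-fit test.

```julia
using Distributions, Random, Statistics

Random.seed!(1)   # arbitrary seed, only for reproducibility
n = 200_000       # sample size (arbitrary)

# Theorem [a]: X1 / (X1 + X2) for independent Gamma(α1, 1) and Gamma(α2, 1)
α1, α2 = 2.0, 5.0
x1 = rand(Gamma(α1, 1), n)
x2 = rand(Gamma(α2, 1), n)
ya = x1 ./ (x1 .+ x2)

# Theorem [b]: transform of an F(r1, r2) sample
r1, r2 = 4, 10
x = rand(FDist(r1, r2), n)
yb = (r1 / r2) .* x ./ (1 .+ (r1 / r2) .* x)

# With these values both targets coincide: Beta(α1, α2) = Beta(r1/2, r2/2) = Beta(2, 5)
target = Beta(α1, α2)
println("target: mean = ", mean(target), ", var = ", var(target))
println("[a]:    mean = ", mean(ya), ", var = ", var(ya))
println("[b]:    mean = ", mean(yb), ", var = ", var(yb))
```

The parameters were chosen so that $r_{1}/2 = \alpha_{1}$ and $r_{2}/2 = \alpha_{2}$, which lets one target distribution serve both checks.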
Description
Just as the gamma distribution comes from the gamma function, the beta distribution is named after the beta function. The beta function has the following relationship with the gamma function, allowing it to be expressed via gamma functions.
$$
B(p, q) = \frac{\Gamma(p) \Gamma(q)}{\Gamma(p + q)}
$$
In fact, a beta distribution can be derived from gamma distributions, as Theorem [a] shows.
Just as the beta function can be seen as a generalization of binomial coefficients, a close look at the probability density function of the beta distribution reveals its resemblance to the probability mass function of the binomial distribution, $P(k) = {}_{n}C_{k} \, p^{k} (1 - p)^{n - k}$. Although the correspondence is not exact, regarding $\alpha$ as the number of successes and $\beta$ as the number of failures makes the resemblance visible:
$$
n = \alpha + \beta, \qquad p = \frac{\alpha}{\alpha + \beta}, \qquad q = \frac{\beta}{\alpha + \beta}
$$
In fact, in Bayesian analysis, it is used as the conjugate prior distribution of a binomial distribution.
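To make the conjugacy concrete, here is a short worked equation. It assumes a setting not spelled out above: $k$ successes are observed in $n$ binomial trials with success probability $p$, and the prior is $p \sim \text{Beta}(\alpha, \beta)$. Up to a constant that does not depend on $p$, the posterior density is

$$
\pi(p \mid k) \propto p^{k} (1 - p)^{n - k} \cdot p^{\alpha - 1} (1 - p)^{\beta - 1} = p^{\alpha + k - 1} (1 - p)^{\beta + n - k - 1}
$$

so the posterior is again a beta distribution, $\text{Beta}(\alpha + k, \beta + n - k)$. This matches the reading of $\alpha$ and $\beta$ as counts of successes and failures.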
Proof
[1]
The equations are lengthy, but the argument itself is straightforward.
Exponential function series expansion:
$$
e^{x} = \sum_{n=0}^{\infty} \frac{x^{n}}{n!}
$$
Euler integral:
$$
B(p, q) = \int_{0}^{1} t^{p - 1} (1 - t)^{q - 1} dt
$$
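$$
\begin{aligned}
m(t) &= \int_{0}^{1} e^{tx} \frac{1}{B(\alpha, \beta)} x^{\alpha - 1} (1 - x)^{\beta - 1} dx
\\ &= \frac{1}{B(\alpha, \beta)} \int_{0}^{1} \left( \sum_{k=0}^{\infty} \frac{(tx)^{k}}{k!} \right) x^{\alpha - 1} (1 - x)^{\beta - 1} dx
\\ &= \frac{1}{B(\alpha, \beta)} \sum_{k=0}^{\infty} \frac{t^{k}}{k!} \int_{0}^{1} x^{\alpha + k - 1} (1 - x)^{\beta - 1} dx
\\ &= \frac{1}{B(\alpha, \beta)} \sum_{k=0}^{\infty} \frac{t^{k}}{k!} B(\alpha + k, \beta)
\\ &= \sum_{k=0}^{\infty} \frac{t^{k}}{k!} \frac{B(\alpha + k, \beta)}{B(\alpha, \beta)}
\\ &= \frac{t^{0}}{0!} \frac{B(\alpha + 0, \beta)}{B(\alpha, \beta)} + \sum_{k=1}^{\infty} \frac{t^{k}}{k!} \frac{B(\alpha + k, \beta)}{B(\alpha, \beta)}
\end{aligned}
$$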
Relationship between the beta function and the gamma function:
$$
B(p, q) = \frac{\Gamma(p) \Gamma(q)}{\Gamma(p + q)}
$$
Expanding the Beta function into Gamma functions results in:
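$$
\begin{aligned}
m(t) &= 1 + \sum_{k=1}^{\infty} \frac{t^{k}}{k!} \frac{B(\alpha + k, \beta)}{B(\alpha, \beta)}
\\ &= 1 + \sum_{k=1}^{\infty} \frac{t^{k}}{k!} \frac{\Gamma(\alpha + k) \Gamma(\beta)}{\Gamma(\alpha + \beta + k)} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)}
\\ &= 1 + \sum_{k=1}^{\infty} \frac{t^{k}}{k!} \frac{\Gamma(\alpha + k)}{\Gamma(\alpha + \beta + k)} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)}
\\ &= 1 + \sum_{k=1}^{\infty} \frac{t^{k}}{k!} \frac{\Gamma(\alpha + k)}{\Gamma(\alpha)} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha + \beta + k)}
\\ &= 1 + \sum_{k=1}^{\infty} \frac{t^{k}}{k!} \frac{\Gamma(\alpha) \prod_{r=0}^{k-1} (\alpha + r)}{\Gamma(\alpha)} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha + \beta) \prod_{r=0}^{k-1} (\alpha + \beta + r)}
\\ &= 1 + \sum_{k=1}^{\infty} \frac{t^{k}}{k!} \prod_{r=0}^{k-1} \frac{\alpha + r}{\alpha + \beta + r}
\end{aligned}
$$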
■
[2]
Derive directly.
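One way to carry out the direct computation, sketched here as a worked equation, is to use the Euler integral together with $\Gamma(p + 1) = p \Gamma(p)$, which gives $B(\alpha + 1, \beta) / B(\alpha, \beta) = \alpha / (\alpha + \beta)$:

$$
\begin{aligned}
E(X) &= \frac{1}{B(\alpha, \beta)} \int_{0}^{1} x^{\alpha} (1 - x)^{\beta - 1} dx = \frac{B(\alpha + 1, \beta)}{B(\alpha, \beta)} = \frac{\alpha}{\alpha + \beta}
\\ E(X^{2}) &= \frac{B(\alpha + 2, \beta)}{B(\alpha, \beta)} = \frac{\alpha (\alpha + 1)}{(\alpha + \beta)(\alpha + \beta + 1)}
\\ \text{Var}(X) &= E(X^{2}) - E(X)^{2} = \frac{\alpha (\alpha + 1)(\alpha + \beta) - \alpha^{2} (\alpha + \beta + 1)}{(\alpha + \beta)^{2} (\alpha + \beta + 1)} = \frac{\alpha \beta}{(\alpha + \beta + 1)(\alpha + \beta)^{2}}
\end{aligned}
$$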
■
[3]
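The factor $(1 - x)$ may look awkward, but the result follows directly from the factorization theorem. The joint probability density function of the sample is
$$
\prod_{i=1}^{n} f(x_{i}) = \frac{1}{B(\alpha, \beta)^{n}} \left( \prod_{i} x_{i} \right)^{\alpha - 1} \left( \prod_{i} (1 - x_{i}) \right)^{\beta - 1}
$$
which depends on the data only through $\prod_{i} x_{i}$ and $\prod_{i} (1 - x_{i})$.
■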
[a]
Derive directly using the probability density function.
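A sketch of that computation, with the auxiliary variables $Y_{1} := X_{1} + X_{2}$ and $Y_{2} := X_{1} / (X_{1} + X_{2})$ introduced here for illustration: the inverse transformation is $x_{1} = y_{1} y_{2}$, $x_{2} = y_{1} (1 - y_{2})$ with Jacobian of absolute value $y_{1}$, so for $y_{1} > 0$ and $0 < y_{2} < 1$ the joint density is

$$
\begin{aligned}
f_{Y_{1}, Y_{2}}(y_{1}, y_{2}) &= \frac{1}{\Gamma(\alpha_{1}) \Gamma(\alpha_{2})} (y_{1} y_{2})^{\alpha_{1} - 1} \left( y_{1} (1 - y_{2}) \right)^{\alpha_{2} - 1} e^{-y_{1}} \cdot y_{1}
\\ &= \frac{y_{1}^{\alpha_{1} + \alpha_{2} - 1} e^{-y_{1}}}{\Gamma(\alpha_{1} + \alpha_{2})} \cdot \frac{\Gamma(\alpha_{1} + \alpha_{2})}{\Gamma(\alpha_{1}) \Gamma(\alpha_{2})} y_{2}^{\alpha_{1} - 1} (1 - y_{2})^{\alpha_{2} - 1}
\end{aligned}
$$

The first factor is the density of $\Gamma(\alpha_{1} + \alpha_{2}, 1)$ and the second is the density of $\text{Beta}(\alpha_{1}, \alpha_{2})$, so integrating out $y_{1}$ leaves $Y_{2} \sim \text{Beta}(\alpha_{1}, \alpha_{2})$.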
■
[b]
Derive directly using the probability density function.
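As an alternative to working with the density, and named plainly as a different route from the one stated above, the result can also be reduced to Theorem [a]. Write $X = (U / r_{1}) / (V / r_{2})$ with independent $U \sim \chi^{2}(r_{1})$ and $V \sim \chi^{2}(r_{2})$ (the symbols $U$ and $V$ are introduced here for illustration). Then

$$
\frac{r_{1}}{r_{2}} X = \frac{U}{V} \implies Y = \frac{U / V}{1 + U / V} = \frac{U / 2}{U / 2 + V / 2}
$$

Since $U / 2 \sim \Gamma(r_{1} / 2, 1)$ and $V / 2 \sim \Gamma(r_{2} / 2, 1)$ are independent, Theorem [a] gives $Y \sim \text{Beta}(r_{1} / 2, r_{2} / 2)$.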
■
Code
Below is Julia code that animates the probability density function of the beta distribution and saves the result as GIF files.
```julia
using LaTeXStrings
using Distributions
using Plots

cd(@__DIR__)  # save the GIFs next to this script

x = 0:0.01:1
# β sweeps up from 0.1 to 10.0 and back down again, so each animation loops smoothly
B = collect(0.1:0.1:10.0); append!(B, reverse(B))

# α = 0.5
animation = @animate for β ∈ B
    plot(x, pdf.(Beta(0.5, β), x),
        color = :black,
        label = "α = 0.5, β = $(rpad(β, 3, '0'))", size = (400,300))
    xlims!(0,1); ylims!(0,5); title!(L"\mathrm{pdf\,of\,Beta} (0.5, \beta)")
end
gif(animation, "pdf0.gif")

# α = 1
animation = @animate for β ∈ B
    plot(x, pdf.(Beta(1, β), x),
        color = :black,
        label = "α = 1, β = $(rpad(β, 3, '0'))", size = (400,300))
    xlims!(0,1); ylims!(0,5); title!(L"\mathrm{pdf\,of\,Beta} (1, \beta)")
end
gif(animation, "pdf1.gif")

# α = 2
animation = @animate for β ∈ B
    plot(x, pdf.(Beta(2, β), x),
        color = :black,
        label = "α = 2, β = $(rpad(β, 3, '0'))", size = (400,300))
    xlims!(0,1); ylims!(0,5); title!(L"\mathrm{pdf\,of\,Beta} (2, \beta)")
end
gif(animation, "pdf2.gif")
```