Proof of the Finite Form of Jensen's Inequality
📂LemmasProof of the Finite Form of Jensen's Inequality
Theorem
In I⊂R, for the convex functions f:I→R and k=1∑nλk=1,λk>0
f(λ1x1+λ2x2+⋯+λnxn)f(k=1∑nλkxk)≤λ1f(x1)+λ2f(x2)+⋯+λnf(xn)≤k=1∑nλkf(xk)
If f is a concave function, then the opposite inequality holds.
f(λ1x1+λ2x2+⋯+λnxn)f(k=1∑nλkxk)≥λ1f(x1)+λ2f(x2)+⋯+λnf(xn)≥k=1∑nλkf(xk)
Description
Jensen’s Inequality is widely used across various fields as a representative application of convex functions. Its finite form is a generalization concerning the number of points and weights from the original definition of convexity.
Proof
The proof method is the same, so we will only prove for convex functions.
Strategy: Apply mathematical induction by inserting terms inside and outside the function according to the definition of convex functions.
Definition of convex functions: For any two elements x1,x2 in the interval I⊂R and functions f:I→R and 0≤t≤1, if f(tx1+(1−t)x2)≤tf(x1)+(1−t)f(x2), then f is called a convex function at I.
When n=1, f is convex, therefore
f(λx)≤λf(x)
When n=2, f is convex, therefore
f(λ1x1+λ2x2)≤λ1f(x1)+λ2f(x2)
Assuming this holds for n=N≥3,
f(λ1x1+λ2x2+⋯+λNxN)≤λ1f(x1)+λ2f(x2)+⋯+λNf(xN)
then proving it also holds for n=N+1 completes the proof.
Defining μi:=1−λN+1λi in k=1∑N+1λk=1 gives k=1∑Nμk=1, therefore
f(μ1x1+μ2x2+⋯+μNxN)≤μ1f(x1)+μ2f(x2)+⋯+μNf(xN)
Following the definition of μi,
f(1−λN+1λ1x1+1−λN+1λ2x2+⋯+1−λN+1λNxN)≤1−λN+1λ1f(x1)+1−λN+1λ2f(x2)+⋯+1−λN+1λNf(xN)
grouping them as 1−λN+11,
f(1−λN+11(λ1x1+λ2x2+⋯+λNxN))≤1−λN+11(λ1f(x1)+λ2f(x2)+⋯+λNf(xN))
Multiplying both sides by 1−λN+1,
(1−λN+1)f(1−λN+11(λ1x1+λ2x2+⋯+λNxN))≤λ1f(x1)+λ2f(x2)+⋯+λNf(xN)
Adding λN+1f(xN+1) to both sides,
(1−λN+1)f(1−λN+11(λ1x1+λ2x2+⋯+λNxN))+λN+1f(xN+1)≤λ1f(x1)+λ2f(x2)+⋯+λNf(xN)+λN+1f(xN+1)
Since (1−λN+1)+λN+1=1 and f is a convex function,
f(1−λN+11−λN+1(λ1x1+λ2x2+⋯+λNxN)+λN+1xN+1)≤(1−λN+1)f(1−λN+11(λ1x1+λ2x2+⋯+λNxN))+λN+1f(xN+1)
Continuing the inequality,
f(λ1x1+λ2x2+⋯+λNxN+λN+1xN+1)≤(1−λN+1)f(1−λN+11(λ1x1+λ2x2+⋯+λNxN))+λN+1f(xN+1)≤λ1f(x1)+λ2f(x2)+⋯+λNf(xN)+λN+1f(xN+1)
In conclusion,
f(λ1x1+λ2x2+⋯+λN+1xN+1)≤λ1f(x1)+λ2f(x2)+⋯+λN+1f(xN+1)
■
See Also