logo

Proof of the Finite Form of Jensen's Inequality 📂Lemmas

Proof of the Finite Form of Jensen's Inequality

Theorem

In $I \subset \mathbb{R}$, for the convex functions $f : I \to \mathbb{R}$ and $\displaystyle \sum_{k=1}^{n} \lambda_{k} = 1, \lambda_{k}>0$

$$ \begin{align*} f( \lambda_{1} x_{1} + \lambda_{2} x_{2} + \cdots + \lambda_{n} x_{n} ) & \le \lambda_{1} f( x_{1}) + \lambda_{2} f( x_{2}) + \cdots + \lambda_{n} f( x_{n} ) \\ f\left( \sum\limits_{k=1}^{n}\lambda_{k}x_{k} \right) &\le \sum\limits_{k=1}^{n} \lambda_{k} f(x_{k}) \end{align*} $$

If $f$ is a concave function, then the opposite inequality holds.

$$ \begin{align*} f( \lambda_{1} x_{1} + \lambda_{2} x_{2} + \cdots + \lambda_{n} x_{n} ) & \ge \lambda_{1} f( x_{1}) + \lambda_{2} f( x_{2}) + \cdots + \lambda_{n} f( x_{n} ) \\ f\left( \sum\limits_{k=1}^{n}\lambda_{k}x_{k} \right) &\ge \sum\limits_{k=1}^{n} \lambda_{k} f(x_{k}) \end{align*} $$

Description

Jensen’s Inequality is widely used across various fields as a representative application of convex functions. Its finite form is a generalization concerning the number of points and weights from the original definition of convexity.

Proof

The proof method is the same, so we will only prove for convex functions.

Strategy: Apply mathematical induction by inserting terms inside and outside the function according to the definition of convex functions.

Definition of convex functions: For any two elements $x_{1} , x_{2}$ in the interval $I \subset \mathbb{R}$ and functions $f : I \to \mathbb{R}$ and $0 \le t \le 1$, if $f( t x_{1} + (1-t) x_{2}) \le t f(x_{1}) + (1-t) f(x_{2})$, then $f$ is called a convex function at $I$.


When $n=1$, $f$ is convex, therefore

$$ f( \lambda x ) \le \lambda f(x) $$

When $n=2$, $f$ is convex, therefore

$$ f( \lambda_{1} x_{1} + \lambda_{2} x_{2} ) \le \lambda_{1} f( x_{1} ) + \lambda_{2} f( x_{2}) $$

Assuming this holds for $n = N \ge 3$,

$$ f( \lambda_{1} x_{1} + \lambda_{2} x_{2} + \cdots + \lambda_{N} x_{N} ) \le \lambda_{1} f( x_{1}) + \lambda_{2} f( x_{2}) + \cdots + \lambda_{N} f( x_{N} ) $$

then proving it also holds for $n = N + 1$ completes the proof.

Defining $\displaystyle \mu_{i} := {{\lambda_{i}} \over {1 - \lambda_{N+1}}}$ in $\displaystyle \sum_{k=1}^{N+1} \lambda_{k} = 1$ gives $\displaystyle \sum_{k=1}^{N} \mu_{k} = 1$, therefore

$$ f( \mu_{1} x_{1} + \mu_{2} x_{2} + \cdots + \mu_{N} x_{N} ) \le \mu_{1} f( x_{1}) + \mu_{2} f( x_{2}) + \cdots + \mu_{N} f( x_{N} ) $$

Following the definition of $\mu_{i}$,

$$ f( {{\lambda_{1}} \over {1 - \lambda_{N+1}}} x_{1} + {{\lambda_{2}} \over {1 - \lambda_{N+1}}} x_{2} + \cdots + {{\lambda_{N}} \over {1 - \lambda_{N+1}}} x_{N} ) \\ \le {{\lambda_{1}} \over {1 - \lambda_{N+1}}} f( x_{1}) + {{\lambda_{2}} \over {1 - \lambda_{N+1}}} f( x_{2}) + \cdots + {{\lambda_{N}} \over {1 - \lambda_{N+1}}} f( x_{N} ) $$

grouping them as $\displaystyle {{1} \over {1 - \lambda_{N+1}}}$,

$$ f \left( {{1} \over {1 - \lambda_{N+1}}} \left( \lambda_{1} x_{1} + \lambda_{2} x_{2} + \cdots + \lambda_{N} x_{N} \right) \right) \\ \le {{1} \over {1 - \lambda_{N+1}}} \left( \lambda_{1} f( x_{1}) + \lambda_{2} f( x_{2}) + \cdots + \lambda_{N} f( x_{N} ) \right) $$

Multiplying both sides by $1 - \lambda_{N+1}$,

$$ (1 - \lambda_{N+1}) f \left( {{1} \over {1 - \lambda_{N+1}}} \left( \lambda_{1} x_{1} + \lambda_{2} x_{2} + \cdots + \lambda_{N} x_{N} \right) \right) \\ \le \lambda_{1} f( x_{1}) + \lambda_{2} f( x_{2}) + \cdots + \lambda_{N} f( x_{N} ) $$

Adding $\lambda_{N+1} f( x_{N+1} )$ to both sides,

$$ (1 - \lambda_{N+1}) f \left( {{1} \over {1 - \lambda_{N+1}}} \left( \lambda_{1} x_{1} + \lambda_{2} x_{2} + \cdots + \lambda_{N} x_{N} \right) \right) + \lambda_{N+1} f (x_{N+1}) \\ \le \lambda_{1} f( x_{1}) + \lambda_{2} f( x_{2}) + \cdots + \lambda_{N} f( x_{N} ) + \lambda_{N+1} f( x_{N+1} ) $$

Since $( 1 - \lambda_{N+1} ) + \lambda_{N+1} = 1$ and $f$ is a convex function,

$$ f \left( {{1 - \lambda_{N+1}} \over {1 - \lambda_{N+1}}} \left( \lambda_{1} x_{1} + \lambda_{2} x_{2} + \cdots + \lambda_{N} x_{N} \right) + \lambda_{N+1} x_{N+1} \right) \\ \le (1 - \lambda_{N+1}) f \left( {{1} \over {1 - \lambda_{N+1}}} \left( \lambda_{1} x_{1} + \lambda_{2} x_{2} + \cdots + \lambda_{N} x_{N} \right) \right) + \lambda_{N+1} f (x_{N+1}) $$

Continuing the inequality,

$$ f \left( \lambda_{1} x_{1} + \lambda_{2} x_{2} + \cdots + \lambda_{N} x_{N} + \lambda_{N+1} x_{N+1} \right) \\ \le (1 - \lambda_{N+1}) f \left( {{1} \over {1 - \lambda_{N+1}}} \left( \lambda_{1} x_{1} + \lambda_{2} x_{2} + \cdots + \lambda_{N} x_{N} \right) \right) + \lambda_{N+1} f (x_{N+1}) \\ \le \lambda_{1} f( x_{1}) + \lambda_{2} f( x_{2}) + \cdots + \lambda_{N} f( x_{N} ) + \lambda_{N+1} f( x_{N+1} ) $$

In conclusion,

$$ f( \lambda_{1} x_{1} + \lambda_{2} x_{2} + \cdots + \lambda_{N+1} x_{N+1} ) \le \lambda_{1} f( x_{1}) + \lambda_{2} f( x_{2}) + \cdots + \lambda_{N+1} f( x_{N+1} ) $$

See Also