logo

Definition of a General Convex Function 📂Functions

Definition of a General Convex Function

Definitions

A function f:VRf : V \to \mathbb{R} defined in a vector space VV is called a convex function if it satisfies the following for all x,yV\mathbf{x}, \mathbf{y} \in V and all t[0,1]t \in [0,1]. f((1t)x+ty)(1t)f(x)+tf(y) f \left( (1-t) \mathbf{x} + t \mathbf{y} \right) \le (1-t) f \left( \mathbf{x} \right) + t f \left( \mathbf{y} \right)

Explanation

In fact, once you study mathematics beyond the undergraduate level, you stop using terms like concave or convex, except the term convex is used almost exclusively. The convex function introduced above is a generalization of a convex function defined in Euclidean space X=RX = \mathbb{R}.

Example

The set of roots forming a parabola f(x,y):=x2+y2f(x,y) := x^{2} + y^{2} is a convex function, and the point where its value is minimized is (x,y)=(0,0)(x,y) = (0,0).

Theorem

Optimal Solution of a Convex Function in a Convex Set

If f:RnRf : \mathbb{R}^{n} \to \mathbb{R} is a differentiable convex function, and the convex set CRnC \subset \mathbb{R}^{n} is a convex set, then the following two are equivalent:

  • xC\mathbf{x}^{\ast} \in C is the optimal solution that minimizes ff in CC.
  • For all xC\mathbf{x} \in C f(x)(xx)0 \nabla f \left( \mathbf{x}^{\ast} \right) \left( \mathbf{x} - \mathbf{x}^{\ast} \right) \ge 0

Proof 1

(    )(\implies)

x(t):=x+t(xx)C,t[0,1] \mathbf{x}(t) := \mathbf{x}^{\ast} + t \left( \mathbf{x} - \mathbf{x}^{\ast} \right) \in C \qquad , t \in [0,1] Since CC is assumed to be a convex set, any convex combination x(t)\mathbf{x}(t) of xC\mathbf{x} \in C and x\mathbf{x}^{\ast} can be represented as above. Meanwhile,

  • Since x=arg minxCf\mathbf{x}^{\ast} = \argmin_{\mathbf{x} \in C} f, for all xC\mathbf{x} \in C f(x(t))f(x)0f \left( \mathbf{x}(t) \right) - f \left( \mathbf{x}^{\ast} \right) \ge 0, and
  • The partial derivative at t=0t = 0 of f(x(t))f \left( \mathbf{x}(t) \right) according to the chain rule is f(x)(xx)\nabla f \left( \mathbf{x}^{\ast} \right) \left( \mathbf{x} - \mathbf{x}^{\ast} \right),

f(x)(xx)=[tf(x(t))]t=0=limt0f(x(t))f(x)t0 \begin{align*} & \nabla f \left( \mathbf{x}^{\ast} \right) \left( \mathbf{x} - \mathbf{x}^{\ast} \right) \\ =& \left[ {{ \partial } \over { \partial t }} f \left( \mathbf{x} (t) \right) \right]_{t = 0} \\ =& \lim_{t \to 0} {{ f \left( \mathbf{x} (t) \right) - f \left( \mathbf{x}^{\ast} \right) } \over { t }} \\ \ge & 0 \end{align*} which results in f(x)(xx)0\nabla f \left( \mathbf{x}^{\ast} \right) \left( \mathbf{x} - \mathbf{x}^{\ast} \right) \ge 0.


(    )(\impliedby)

f((1t)z+ty)(1t)f(z)+tf(y) f \left( (1-t) \mathbf{z} + t \mathbf{y} \right) \le (1-t) f \left( \mathbf{z} \right) + t f \left( \mathbf{y} \right) Differentiating both sides with respect to tt yields (z+y)f((1t)z+ty)f(z)+f(y) \left( - \mathbf{z} + \mathbf{y} \right) \nabla f \left( (1-t) \mathbf{z} + t \mathbf{y} \right) \le - f \left( \mathbf{z} \right) + f \left( \mathbf{y} \right) thus, when t=0t = 0, the following is obtained: f(z)+f(z)(yz)f(y) f \left( \mathbf{z} \right) + \nabla f \left( \mathbf{z} \right) \left( \mathbf{y} - \mathbf{z} \right) \le f \left( \mathbf{y} \right) From the assumption, since f(x)(xx)0\nabla f \left( \mathbf{x}^{\ast} \right) \left( \mathbf{x} - \mathbf{x}^{\ast} \right) \ge 0, the following is obtained for y=x\mathbf{y} = \mathbf{x} and z=x\mathbf{z} = \mathbf{x}^{\ast}: f(x)+0f(x)+f(x)(xx)f(x) \begin{align*} & f \left( \mathbf{x}^{\ast} \right) + 0 \\ \le & f \left( \mathbf{x}^{\ast} \right) + \nabla f \left( \mathbf{x}^{\ast} \right) \left( \mathbf{x} - \mathbf{x}^{\ast} \right) \\ \le & f \left( \mathbf{x} \right) \end{align*} This means f(x)f(x)f \left( \mathbf{x}^{\ast} \right) \le f \left( \mathbf{x} \right), which holds for all xC\mathbf{x} \in C, so x\mathbf{x}^{\ast} becomes the minimal solution of ff in CC.


  1. Matousek. (2007). Understanding and Using Linear Programming: p186. ↩︎