logo

Subgradient 📂Optimization

Subgradient

Definition 1 2

For a function f:RnRf : \mathbb{R}^{n} \to \mathbb{R}, the gRn\mathbf{g} \in \mathbb{R}^{n} that satisfies the following is called a subgradient of ff at point x\mathbf{x}.

f(y)f(x)+gT(yx)yRn f(\mathbf{y}) \ge f(\mathbf{x}) + \mathbf{g}^{T}(\mathbf{y} - \mathbf{x}) \qquad \forall \mathbf{y} \in \mathbb{R}^{n}

Explanation

If the convex function ff is differentiable at x\mathbf{x}, then g=\mathbf{g} = f(x)\nabla f(\mathbf{x}) is unique. Conversely, if f(x)={g}\partial f(\mathbf{x}) = \left\{ \mathbf{g} \right\}, then ff is differentiable at x\mathbf{x} and f(x)=g\nabla f (\mathbf{x}) = \mathbf{g}.

The set of subgradients of ff at x\mathbf{x} is called the subdifferential and is denoted as follows:

f(x):={g:g is subgradient of f at x} \partial f(\mathbf{x}) := \left\{ \mathbf{g} : \mathbf{g} \text{ is subgradient of } f \text{ at } \mathbf{x} \right\}

If f(x)\partial f(\mathbf{x}) \ne \varnothing, then ff is said to be subdifferentiable at x\mathbf{x}.

Properties

  • Scaling: For a>0a \gt 0, (af)=af\partial (af) = a \partial f
  • Addition: (f+g)=f+g\partial(f + g) = \partial f + \partial g
  • Affine composition: If g(x)=f(Ax+b)g(\mathbf{x}) = f(A \mathbf{x} + \mathbf{b}), g(x)=ATf(Ax+b) \partial g(\mathbf{x}) = A^{T}\partial f(A \mathbf{x} + \mathbf{b})
  • Subgradient optimality condition: For any function ff, f(x)=minxf(x)    0f(x) f(\mathbf{x}_{\ast}) = \min\limits_{\mathbf{x}} f(\mathbf{x}) \iff 0 \in \partial f(\mathbf{x}_{\ast})

See Also


  1. Rockafellar, R. Tyrrell. Convex analysis. Vol. 11. Princeton university press, 1997. p214~215 ↩︎

  2. https://www.stat.cmu.edu/~ryantibs/convexopt-F16/lectures/subgrad.pdf ↩︎