Subgradient
Definition 1 2
For a function $f : \mathbb{R}^{n} \to \mathbb{R}$, the $\mathbf{g} \in \mathbb{R}^{n}$ that satisfies the following is called a subgradient of $f$ at point $\mathbf{x}$.
$$ f(\mathbf{y}) \ge f(\mathbf{x}) + \mathbf{g}^{T}(\mathbf{y} - \mathbf{x}) \qquad \forall \mathbf{y} \in \mathbb{R}^{n} $$
Explanation
If the convex function $f$ is differentiable at $\mathbf{x}$, then $\mathbf{g} =$ $\nabla f(\mathbf{x})$ is unique. Conversely, if $\partial f(\mathbf{x}) = \left\{ \mathbf{g} \right\}$, then $f$ is differentiable at $\mathbf{x}$ and $\nabla f (\mathbf{x}) = \mathbf{g}$.
The set of subgradients of $f$ at $\mathbf{x}$ is called the subdifferential and is denoted as follows:
$$ \partial f(\mathbf{x}) := \left\{ \mathbf{g} : \mathbf{g} \text{ is subgradient of } f \text{ at } \mathbf{x} \right\} $$
If $\partial f(\mathbf{x}) \ne \varnothing$, then $f$ is said to be subdifferentiable at $\mathbf{x}$.
Properties
- Scaling: For $a \gt 0$, $\partial (af) = a \partial f$
- Addition: $\partial(f + g) = \partial f + \partial g$
- Affine composition: If $g(\mathbf{x}) = f(A \mathbf{x} + \mathbf{b})$, $$ \partial g(\mathbf{x}) = A^{T}\partial f(A \mathbf{x} + \mathbf{b}) $$
- Subgradient optimality condition: For any function $f$, $$ f(\mathbf{x}_{\ast}) = \min\limits_{\mathbf{x}} f(\mathbf{x}) \iff 0 \in \partial f(\mathbf{x}_{\ast}) $$
See Also
Rockafellar, R. Tyrrell. Convex analysis. Vol. 11. Princeton university press, 1997. p214~215 ↩︎
https://www.stat.cmu.edu/~ryantibs/convexopt-F16/lectures/subgrad.pdf ↩︎