Proximal Operator
📂OptimizationProximal Operator
Definition
Let ∥⋅∥X be the norm of the vector space X. The proximal operator proxλf:X→2X for a function f:X→R is defined as follows, for λ>0,
proxλf(x):=vargmin{λf(v)+21∥v−x∥X2:v∈X}=vargmin{f(v)+2λ1∥v−x∥X2:v∈X}
argmin represents the optimal solution.
Simplified Definition
If the above definition is difficult, it can be understood as follows. The proximal operator proxλf:Rn→Rn for a function f:Rn→R is defined as:
proxλf(x):=vargmin{λf(v)+21∥v−x∥22:v∈Rn}
Explanation
According to definition (1), proxλf(x) means compromising between minimizing the function value of f and getting closer to x. In other words, it can be said that it is a point that makes the value of f the smallest in the vicinity of x. It can be seen as adding a kind of regularization to the problem of optimizing f. Depending on the value of λ, it is determined which term is given more value in reduction. Point proxλf(x) is called the proximal point for f with respect to x.
Proximal means ‘close’ or ’nearby’, and proxλf(x) seems to be named proximal because it finds the optimal solution of f among the points close to x.
Functions f that are easy to calculate with proxλf(⋅) are called prox friendly.
In the case of X=Rn, if f is a convex function, then proxλf(x) is unique.
Algorithms that solve optimization problems using the proximal operator are called proximal algorithms. These include:
Properties
- Separable sum: If the function f is separable as f(u,v)=ϕ(u)+ψ(v), then the following holds:
proxλf(x,y)=(proxλϕ(x),proxλψ(y))
If it is completely separable for more than one variable as f(x)=∑infi(xi), then,
(proxλf(x))i=proxλfi(xi)
- Postcomposition: For f(x)=αϕ(x)+b and α>0, the following holds:
proxλf(x)=proxαλϕ(x)
- Precomposition: For f(x)=ϕ(αx+b) and α=0, the following holds:
proxλf(x)=α1(proxα2λϕ(αx+b)−b)
If f(x)=ϕ(Qx) and Q is orthogonal, then the following holds:
proxλf(x)=QTproxλϕ(Qx)
- Affine addition: For f(x)=ϕ(x)+aTx+b, the following holds:
proxλf(x)=proxλϕ(x−λa)
- Regularization: For f(x)=ϕ(x)+2ρ∥x−a∥22, the following holds:
proxλf(x)=proxλ~ϕ((λ~/λ)x+(ρλ~)a)
Here, λ~=λ/(1+λρ).