Second Order Necessary/Sufficient Conditions for the Extreme Values of Multivariable Functions
📂OptimizationSecond Order Necessary/Sufficient Conditions for the Extreme Values of Multivariable Functions
Theorem
Let’s say the function f:Rn→R is given. ∇f, ∇2f are the gradient and Hessian of f, respectively.
Second-order necessary conditions
If x∗ is a local optimizer and ∇2f exists and is continuous in the neighborhood of x∗,
∇f(x∗)=0
and ∇2f(x∗) is positive semidefinite.
Note that 0 is not the number zero, but the zero vector.
Second-order sufficient conditions
In the neighborhood of x∗, if ∇2f is continuous, ∇f(x∗)=0, and ∇f(x∗) is positive definite, then x∗ is a strict local optimizer.
Explanation
The necessary condition talks about the properties of f when x∗ is an extremum. The sufficient condition discusses whether x∗ is an extremum based on certain properties of f.
Proof
Second-order necessary conditions
It is true that ∇f(x∗) holds by the first-order necessary conditions.
The rest of the proof is by contradiction. Assume that ∇2f(x∗) is not positive semidefinite. Then, choose some vector p that satisfies pt∇2f(x∗)p<0. Here, pt is the transpose matrix. Thus, since ∇2f is continuous near x∗, for some s>0, the following holds.
pt∇2f(x∗+ξp)p<0,∀ξ∈[0,s]
Also, by the Taylor expansion formula for multivariable functions,
f(x∗+ξp)=f(x∗)+ξpt∇f(x∗)+21ξ2pt∇2f(x∗+ξˉp)p,for some ξˉ∈(0,ξ)
Then, since ∇f(x∗)=0, by (1) and (2), we get the following.
f(x∗+ξp)<f(x∗),∀ξ∈[0,s]
This contradicts the fact that x∗ is a local minimizer. Therefore, the assumption is wrong, and ∇2f(x∗) is a positive semidefinite matrix.
■
Second-order sufficient conditions
Since the Hessian ∇2f at x∗ is continuous and positive definite, an open ball Br={x:∥x−x∗∥<r} can be selected so that ∇2f(x) is positive definite for all x∈B.
pt∇2f(x)p>0,∀p∈Rn
Now, for some vector p that is not 0 and satisfies ∥p∥<r, by the Taylor expansion formula for multivariable functions, we get the following.
f(x∗+p)=f(x∗)+pt∇f(x∗)+21pt∇2f(x+ξp)p,ξ∈(0,1)=f(x∗)+21pt∇2f(x+ξp)p
Here, since x∗+ξp∈B,
f(x∗+p)>f(x∗)
Therefore, x∗ is a local minimizer.
■
See also