[Optimization] KKT conditions

First-Order Conditions


This section discusses the first-order (KKT) conditions in detail.

Sequential Feasible Direction

Definition Let $x'$ be a feasible point and $\{x^{(k)}\}$ a feasible sequence satisfying $x^{(k)} \to x'$ and $x^{(k)} \neq x'$ for all $k$. Write
$$x^{(k)} - x' = \delta_k p^{(k)},$$
where $\delta_k > 0$, $\delta_k \to 0$, and each $p^{(k)}$ is a vector of fixed length. Any accumulation point $p$ of $\{p^{(k)}\}$ is called a sequential feasible direction of the feasible region $\Omega$ at $x'$; the set of all sequential feasible directions at $x'$ is denoted $\mathcal{F}'$.

By definition, each sequential feasible direction is determined by some feasible sequence.

For example, at the feasible point $x' = 0$ of $\Omega_1 = \{x \in \mathbb{R}^2 : x_2 \geq x_1^2\}$, the sequential feasible directions are exactly the nonzero vectors satisfying $p_2 \geq 0$.

For $\Omega_2 = \{x \in \mathbb{R}^2 : x_2 = x_1^2\}$, a sequential feasible direction must satisfy
$$p_1 \neq 0, \quad p_2 = 0.$$
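The two examples above can be checked numerically. The following is a sketch (the sequences and the helper function are illustrative, not from the text): build a feasible sequence converging to $x' = 0$ and normalize the differences $x^{(k)} - x'$ to approximate the limit direction $p$.

```python
import numpy as np

def limit_direction(x_prime, seq):
    """Approximate a sequential feasible direction: normalize the last
    difference x^(k) - x' of a feasible sequence converging to x'."""
    d = seq[-1] - x_prime
    return d / np.linalg.norm(d)

x0 = np.zeros(2)
ks = range(1, 2001)

# Omega_1 = {x : x_2 >= x_1^2}: both (0, 1/k) and (1/k, 1/k^2) are feasible
# sequences; their limit directions (0, 1) and (1, 0) both have p_2 >= 0.
p_up = limit_direction(x0, [np.array([0.0, 1.0 / k]) for k in ks])
p_flat = limit_direction(x0, [np.array([1.0 / k, 1.0 / k**2]) for k in ks])

# Omega_2 = {x : x_2 = x_1^2}: feasible sequences must stay on the parabola,
# so the limit direction satisfies p_1 != 0, p_2 = 0.
p_curve = limit_direction(x0, [np.array([1.0 / k, 1.0 / k**2]) for k in ks])

print(p_up, p_flat, p_curve)
```

Note that the direction $(1,0)$ with $p_2 = 0$ is attainable for $\Omega_1$ because the parabolic sequence approaches the boundary tangentially.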

First-Order Necessary Condition

Let $D'$ denote the set of descent directions of $f$ at the feasible point $x'$:
$$D' = \{p \in \mathbb{R}^n \mid p^T g' < 0\},$$
where $g' = \nabla f(x')$.

Lemma Let $x^*$ be a local minimum point of the constrained problem. Then no sequential feasible direction at $x^*$ is a descent direction; that is,
$$\mathcal{F}^* \cap D^* = \varnothing.$$

In other words, the lemma states that if $x^*$ is a local minimum point, then the objective function has a non-negative directional derivative at $x^*$ along every sequential feasible direction.

Linearized Feasible Directions

However, the set $\mathcal{F}^*$ is usually not easy to compute, so consider another set of feasible directions that is. The first-order Taylor approximation of the constraint function $c_i$ at $x'$ is
$$c_i(x' + s) \approx c_i(x') + \nabla c_i(x')^T s.$$
A linearized feasible direction at $x'$ is a nonzero vector $p$ satisfying
$$\begin{aligned} p^T a_i' &= 0, \quad i \in \mathcal{E}, \\ p^T a_i' &\leq 0, \quad i \in \mathcal{I} \cap \mathcal{A}', \end{aligned}$$
where $a_i' = \nabla c_i(x')$ and $\mathcal{A}'$ is the active set at $x'$. The set of all linearized feasible directions is denoted $F'$.
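As a small numerical sketch of this definition (using the earlier set $\Omega_1$, with its constraint written in the document's convention as $c(x) = x_1^2 - x_2 \leq 0$; the helper names are illustrative):

```python
import numpy as np

# Omega_1 written in the form c(x) <= 0: c(x) = x_1^2 - x_2.
def grad_c(x):
    return np.array([2.0 * x[0], -1.0])

def is_linearized_feasible(p, x, tol=1e-12):
    """Check p^T a' <= 0 for the single inequality constraint, active at x,
    with a' = grad c(x). A nonzero p is assumed."""
    return bool(p @ grad_c(x) <= tol)

x0 = np.zeros(2)  # the feasible point x' = 0, where c is active
print(is_linearized_feasible(np.array([1.0, 0.0]), x0))   # p_2 = 0  -> True
print(is_linearized_feasible(np.array([0.0, 1.0]), x0))   # p_2 = 1  -> True
print(is_linearized_feasible(np.array([0.0, -1.0]), x0))  # p_2 = -1 -> False
```

At $x' = 0$ this reproduces the set $\{p : p_2 \geq 0\}$, which here coincides with the sequential feasible directions.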

Clearly, it would be convenient if $\mathcal{F}'$ and $F'$ were the same set.

Lemma $\mathcal{F}' \subseteq F'$.

Proof If $p \in \mathcal{F}'$, then there exists a feasible sequence $\{x^{(k)}\}$ satisfying
$$x^{(k)} - x' = \delta_k p^{(k)},$$
where $\delta_k \to 0$ and $p^{(k)} \to p$. Expanding the constraint $c_i$ in a Taylor series at $x'$,
$$c_i(x^{(k)}) = c_i(x') + \delta_k (p^{(k)})^T a_i' + o(\delta_k).$$
For $i \in \mathcal{E}$ we have $c_i(x^{(k)}) = c_i(x') = 0$; for active $i \in \mathcal{I}$ we have $c_i(x^{(k)}) \leq c_i(x') = 0$. Dividing by $\delta_k$ gives
$$\frac{c_i(x^{(k)})}{\delta_k} = \frac{c_i(x')}{\delta_k} + (p^{(k)})^T a_i' + \frac{o(\delta_k)}{\delta_k}.$$
Letting $k \to \infty$ yields $p^T a_i' = 0$ for $i \in \mathcal{E}$ and $p^T a_i' \leq 0$ for active $i \in \mathcal{I}$, so $p \in F'$ and the lemma is proved.

Unfortunately, $F' \subseteq \mathcal{F}'$ does not necessarily hold.

Example Define the set
$$\Omega = \{x \in \mathbb{R}^2 : x_2 \leq x_1^3,\ x_2 \geq 0\}$$
and consider the feasible point $x' = (0,0)^T$. The linearized feasible direction $p = (-1,0)^T$ belongs to $F'$, yet clearly no feasible sequence converges to $x'$ along $p$, i.e. $p \notin \mathcal{F}'$.
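This gap can be confirmed numerically. The sketch below (constraints written as $c_1(x) = x_2 - x_1^3 \leq 0$ and $c_2(x) = -x_2 \leq 0$, a sign convention assumed to match the text) checks that $p = (-1,0)^T$ passes the linearized test while no feasible point lies to the left of the origin.

```python
import numpy as np

# Gradients of the active constraints c1(x) = x_2 - x_1^3 <= 0 and
# c2(x) = -x_2 <= 0 at x' = (0, 0).
a1 = np.array([0.0, 1.0])
a2 = np.array([0.0, -1.0])
p = np.array([-1.0, 0.0])

# p satisfies both linearized conditions, so p lies in F'.
in_F = bool(p @ a1 <= 0) and bool(p @ a2 <= 0)

# Yet feasibility demands 0 <= x_2 <= x_1^3, which forces x_1 >= 0:
# a grid scan finds no feasible point with x_1 < 0, so no feasible
# sequence can approach the origin along p.
def feasible(x1, x2):
    return x2 <= x1**3 and x2 >= 0

left_feasible = any(feasible(x1, x2)
                    for x1 in np.linspace(-1.0, -1e-9, 100)
                    for x2 in np.linspace(0.0, 1.0, 100))
print(in_F, left_feasible)
```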

(Figure: the counterexample region $\Omega$.)

Constraint Qualifications

A constraint qualification (CQ) is an assumption guaranteeing that $F' = \mathcal{F}'$. It should be noted that it is rare for a constraint qualification to fail.

Lemma At a feasible point $x'$, if either of the conditions

(1) LCQ: $c_i(x),\ i \in \mathcal{A}'$, are linear functions, or

(2) LICQ: $a_i',\ i \in \mathcal{A}'$, are linearly independent

holds, then $F' = \mathcal{F}'$.

Farkas Lemma

Farkas Lemma Given $n$-dimensional vectors $a_1, a_2, \cdots, a_m$ and $g$, the set
$$S = \{p \in \mathbb{R}^n : p^T g < 0,\ p^T a_i \leq 0,\ i = 1, 2, \cdots, m\}$$
is empty if and only if there exist $\lambda_i \geq 0,\ i = 1, 2, \cdots, m$, such that
$$-g = \sum_{i=1}^m a_i \lambda_i.$$
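A quick way to test the alternative numerically is to solve the feasibility linear program for $\lambda$. The data below ($a_1 = e_1$, $a_2 = e_2$ in $\mathbb{R}^2$) are illustrative, and `scipy.optimize.linprog` is used only as a convenient feasibility solver.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative data: a_1 = e_1, a_2 = e_2 as columns of A.
A = np.array([[1.0, 0.0],
              [0.0, 1.0]])

def farkas_multipliers_exist(g):
    """Solve the feasibility LP: is there lambda >= 0 with A @ lambda = -g?"""
    res = linprog(np.zeros(A.shape[1]), A_eq=A, b_eq=-g,
                  bounds=[(0, None)] * A.shape[1])
    return bool(res.success)

# -g = (1, 1) = a_1 + a_2: multipliers exist, so S is empty.
print(farkas_multipliers_exist(np.array([-1.0, -1.0])))  # True
# -g = (-1, 0) is not a nonnegative combination of e_1, e_2, so S is
# nonempty; e.g. p = (-1, -1) has p^T g < 0 and p^T a_i <= 0.
print(farkas_multipliers_exist(np.array([1.0, 0.0])))    # False
```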

Given vectors $a_1, a_2, \cdots, a_m$ in $\mathbb{R}^n$, let
$$C = \left\{ v \in \mathbb{R}^n : v = \sum_{i=1}^m a_i \lambda_i,\ \lambda_i \geq 0 \right\}.$$
Then $C$ is a polyhedral cone and a closed convex set.

If a vector $a \notin C$, then there exists a hyperplane separating $C$ from $a$; that is, there exists a nonzero vector $p$ such that
$$p^T a > 0, \quad p^T v \leq 0,\ \forall v \in C.$$

To connect the necessary condition of the lemma with Lagrange multipliers, the Farkas lemma must be extended to the case with equality constraints.

Corollary Given $g^*$, $a_i^*,\ i \in \mathcal{E}$, and $a_i^*,\ i \in \mathcal{I}^*$ in $\mathbb{R}^n$, the set
$$S = \{p \in \mathbb{R}^n : p^T g^* < 0,\ p^T a_i^* = 0,\ i \in \mathcal{E},\ p^T a_i^* \leq 0,\ i \in \mathcal{I}^*\}$$
is empty if and only if there exist $\lambda_i^*,\ i \in \mathcal{E}$, and $\lambda_i^* \geq 0,\ i \in \mathcal{I}^*$, such that
$$-g^* = \sum_{i \in \mathcal{E}} \lambda_i^* a_i^* + \sum_{i \in \mathcal{I}^*} \lambda_i^* a_i^*.$$

The KKT conditions can now be proved from the above results.

Regularity assumption 1: $F^* \cap D^* = \mathcal{F}^* \cap D^*$.

If $x^*$ is a local minimum point and regularity assumption 1 holds at $x^*$, then
$$F^* \cap D^* = \varnothing.$$
By the Farkas lemma, there exist $\lambda_i^*,\ i \in \mathcal{A}^*$, with $\lambda_i^* \geq 0$ for $i \in \mathcal{I}^*$, such that
$$g^* + \sum_{i \in \mathcal{E}} \lambda_i^* a_i^* + \sum_{i \in \mathcal{I}^*} \lambda_i^* a_i^* = 0.$$
For $i \in \mathcal{I} \backslash \mathcal{I}^*$ we have $c_i(x^*) < 0$, and we set $\lambda_i^* = 0$.

KKT conditions

Theorem (first-order necessary conditions) If $x^*$ is a local minimum point and the regularity assumption
$$F^* \cap D^* = \mathcal{F}^* \cap D^*$$
holds at $x^*$, then there exists a Lagrange multiplier vector $\lambda^*$ such that $x^*, \lambda^*$ satisfy
$$\begin{aligned} \nabla_x \mathcal{L}(x^*, \lambda^*) &= 0, \\ c_i(x^*) &= 0, \quad i \in \mathcal{E}, \\ c_i(x^*) &\leq 0, \quad i \in \mathcal{I}, \\ \lambda_i^* &\geq 0, \quad i \in \mathcal{I}, \\ \lambda_i^* c_i(x^*) &= 0, \quad i \in \mathcal{I}. \end{aligned}$$
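The five conditions can be packaged into a small checker. This is a sketch; the sample problem $\min x_1^2 + x_2^2$ s.t. $1 - x_1 - x_2 \leq 0$ and its multiplier $\lambda^* = 1$ are illustrative, not from the text.

```python
import numpy as np

def check_kkt(grad_f, cons, grads_c, x, lam, ineq, tol=1e-8):
    """Verify the KKT system at (x, lam): stationarity, primal and dual
    feasibility, and complementarity. cons[i](x) <= 0 for i in ineq,
    cons[i](x) == 0 otherwise; grads_c[i] returns grad c_i(x)."""
    station = grad_f(x) + sum(l * gc(x) for l, gc in zip(lam, grads_c))
    ok = bool(np.linalg.norm(station) <= tol)        # stationarity
    for i, c in enumerate(cons):
        v = c(x)
        if i in ineq:
            ok = ok and v <= tol                     # primal feasibility
            ok = ok and lam[i] >= -tol               # dual feasibility
            ok = ok and abs(lam[i] * v) <= tol       # complementarity
        else:
            ok = ok and abs(v) <= tol
    return ok

# Illustrative problem: min x_1^2 + x_2^2  s.t.  1 - x_1 - x_2 <= 0,
# with solution x* = (1/2, 1/2) and multiplier lambda* = 1.
grad_f = lambda x: 2.0 * x
cons = [lambda x: 1.0 - x[0] - x[1]]
grads_c = [lambda x: np.array([-1.0, -1.0])]
result = check_kkt(grad_f, cons, grads_c, np.array([0.5, 0.5]), [1.0], {0})
print(result)  # True
```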

The regularity assumption is a further relaxation of the conditions imposed on the vectors $a_i'\ (i \in \mathcal{A}')$.

Theorem Let $x^*$ be a local minimum point of the constrained problem, and suppose LCQ or LICQ holds at $x^*$. Then $x^*$ satisfies the KKT conditions.

Example Consider the problems
$$\begin{aligned} \min~~ & x_2 \\ \mathrm{s.t.}~~ & x_2 \leq x_1^3,\ x_2 \geq 0 \end{aligned} \tag{1}$$

$$\begin{aligned} \min~~ & x_1 \\ \mathrm{s.t.}~~ & x_2 \leq x_1^3,\ x_2 \geq 0 \end{aligned} \tag{2}$$
At the solution $x^* = (0,0)^T$, it is easy to verify that problem (1) satisfies the regularity assumption while problem (2) does not.
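This can be sketched numerically: with the constraints written as $c_1(x) = x_2 - x_1^3 \leq 0$ and $c_2(x) = -x_2 \leq 0$, both active at $x^*$ with gradients $a_1^* = (0,1)^T$ and $a_2^* = (0,-1)^T$, the KKT conditions require $-g^*$ to be a nonnegative combination of these gradients. The `linprog` feasibility check below is an illustrative device, not part of the original text.

```python
import numpy as np
from scipy.optimize import linprog

# Columns are the gradients a_1*, a_2* of the active constraints
# c1(x) = x_2 - x_1^3 and c2(x) = -x_2 at x* = (0, 0).
A = np.array([[0.0, 0.0],
              [1.0, -1.0]])

def has_kkt_multipliers(g):
    """Is there lambda >= 0 with A @ lambda = -g?"""
    res = linprog(np.zeros(2), A_eq=A, b_eq=-g, bounds=[(0, None)] * 2)
    return bool(res.success)

print(has_kkt_multipliers(np.array([0.0, 1.0])))  # problem (1): True
print(has_kkt_multipliers(np.array([1.0, 0.0])))  # problem (2): False
```

For problem (2) the origin is still the minimizer, but because the regularity assumption fails there, no KKT multipliers exist.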

