[Optimization Method Study Notes] Chapter 3: Constrained Optimization Method

1. Constrained optimization problem

1.1 General form of constrained optimization problem

The general form of the constrained optimization problem is

$$\begin{matrix} \min & f(\boldsymbol{x}) & \\ \mathrm{s.t.} & h_i(\boldsymbol{x}) = 0, & i = 1, 2, \cdots, l \\ & h_j(\boldsymbol{x}) \le 0, & j = l+1, l+2, \cdots, m \end{matrix}$$

where $f(\boldsymbol{x})$ is the objective function, $h_i(\boldsymbol{x}) = 0$ are the equality constraints, and $h_j(\boldsymbol{x}) \le 0$ are the inequality constraints.
The set $\varOmega = \left\lbrace \boldsymbol{x} \mid h_i(\boldsymbol{x}) = 0,\ h_j(\boldsymbol{x}) \le 0,\ i = 1, 2, \cdots, l,\ j = l+1, l+2, \cdots, m \right\rbrace$ is called the feasible region.

1.2 Feasible direction and feasible descent direction

Let $\boldsymbol{d}$ be a nonzero vector and $\boldsymbol{x} \in \varOmega$. If there exists $k > 0$ such that $\boldsymbol{x} + \alpha \boldsymbol{d} \in \varOmega$ for all $\alpha \in (0, k)$, then $\boldsymbol{d}$ is called a feasible direction at $\boldsymbol{x}$. If in addition $f(\boldsymbol{x} + \alpha \boldsymbol{d}) < f(\boldsymbol{x})$ for all such $\alpha$, then $\boldsymbol{d}$ is called a feasible descent direction (improving feasible direction) at $\boldsymbol{x}$.

1.3 Active index set

For a point $\boldsymbol{x} \in \varOmega$, the set $A(\boldsymbol{x}) = \left\lbrace i \mid h_i(\boldsymbol{x}) = 0 \right\rbrace$ is called the active index set at $\boldsymbol{x}$. Intuitively, the active index set collects the indices of all equality constraints together with the indices of all inequality constraints that hold with equality at $\boldsymbol{x}$.
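As a quick numerical illustration, here is a minimal sketch (the helper name and the representation of constraints as Python callables are my own conventions, not from the original notes) of computing the active index set; indices are 0-based in code:

```python
def active_index_set(x, constraints, n_eq, tol=1e-9):
    """Return A(x): indices of all equality constraints plus those
    inequality constraints h_j(x) <= 0 that are tight at x."""
    return [i for i, h in enumerate(constraints)
            if i < n_eq or abs(h(x)) <= tol]

# Constraints of Example 2 below (all inequalities, so n_eq = 0):
cons = [lambda x: x[0] + x[1] - 4,
        lambda x: -(x[1] + 7),
        lambda x: (x[0] - 3)**2 - x[1] - 1]
print(active_index_set((1, 3), cons, n_eq=0))   # -> [0, 2], i.e. {1, 3} 1-based
```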

2. KKT conditions

Suppose that at a point $\boldsymbol{x}$ of the general constrained optimization problem the gradients $\nabla h_i(\boldsymbol{x})$, $i \in A(\boldsymbol{x})$, are linearly independent. Define the Lagrangian function of the problem

$$L(\boldsymbol{x}, \boldsymbol{\lambda}) = f(\boldsymbol{x}) + \sum_{i=1}^{m} \lambda_i h_i(\boldsymbol{x})$$

Then the KKT conditions of the problem at $\boldsymbol{x}$ are

$$\begin{cases} \nabla_{\boldsymbol{x}} L(\boldsymbol{x}, \boldsymbol{\lambda}) = \boldsymbol{0} \\ h_i(\boldsymbol{x}) = 0, & i = 1, 2, \cdots, l \\ h_j(\boldsymbol{x}) \le 0, & j = l+1, l+2, \cdots, m \\ \lambda_j h_j(\boldsymbol{x}) = 0, & j = l+1, l+2, \cdots, m \\ \lambda_j \ge 0, & j = l+1, l+2, \cdots, m \end{cases}$$

If a point $\boldsymbol{x}$ satisfies the KKT conditions, it is called a KKT point, and $(\boldsymbol{x}, \boldsymbol{\lambda})$ is called a KKT pair.

[Example 1] Find all KKT points of the problem

$$\begin{matrix} \min & x_1 x_2 \\ \mathrm{s.t.} & x_1^2 + x_2^2 = 1 \end{matrix}$$

[Solution] Construct the Lagrangian function $L(x_1, x_2, \lambda) = x_1 x_2 + \lambda x_1^2 + \lambda x_2^2 - \lambda$. The KKT conditions are

$$\begin{cases} x_2 + 2\lambda x_1 = 0 \\ x_1 + 2\lambda x_2 = 0 \\ x_1^2 + x_2^2 = 1 \end{cases}$$

Solving gives $x_1 = x_2 = \pm \dfrac{\sqrt2}{2}, \lambda = -\dfrac{1}{2}$ or $x_1 = -x_2 = \pm \dfrac{\sqrt2}{2}, \lambda = \dfrac{1}{2}$. So the KKT points are $\left( \dfrac{\sqrt2}{2}, \dfrac{\sqrt2}{2} \right)^{\rm T}$, $\left( -\dfrac{\sqrt2}{2}, \dfrac{\sqrt2}{2} \right)^{\rm T}$, $\left( \dfrac{\sqrt2}{2}, -\dfrac{\sqrt2}{2} \right)^{\rm T}$, $\left( -\dfrac{\sqrt2}{2}, -\dfrac{\sqrt2}{2} \right)^{\rm T}$.
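This KKT system can also be checked mechanically; a minimal sketch using SymPy (the use of SymPy here is my own choice, not part of the original notes) recovers the four KKT points:

```python
import sympy as sp

x1, x2, lam = sp.symbols("x1 x2 lam", real=True)
L = x1 * x2 + lam * (x1**2 + x2**2 - 1)          # Lagrangian of Example 1
kkt = [sp.diff(L, x1), sp.diff(L, x2), x1**2 + x2**2 - 1]
for sol in sp.solve(kkt, [x1, x2, lam]):
    print(sol)   # four points (±√2/2, ±√2/2) with λ = ∓1/2
```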

[Example 2] Determine whether the point $\boldsymbol{x}_0 = (1, 3)^{\rm T}$ is a KKT point of the problem

$$\begin{matrix} \min & 4x_1 - 3x_2 \\ \mathrm{s.t.} & x_1 + x_2 \le 4 \\ & x_2 + 7 \ge 0 \\ & (x_1 - 3)^2 \le 1 + x_2 \end{matrix}$$

[Solution] The active index set at $\boldsymbol{x}_0$ is $A(\boldsymbol{x}_0) = \lbrace 1, 3 \rbrace$ (constraints 1 and 3 hold with equality, while $x_2 + 7 = 10 > 0$), so $\lambda_2 = 0$. Construct the Lagrangian function

$$L(x_1, x_2, \lambda_1, \lambda_3) = 4x_1 - 3x_2 + \lambda_1(x_1 + x_2 - 4) + \lambda_3\left[(x_1-3)^2 - x_2 - 1\right]$$

The KKT conditions require

$$\begin{cases} 4 + \lambda_1 + 2\lambda_3 x_1 - 6\lambda_3 = 0 \\ -3 + \lambda_1 - \lambda_3 = 0 \\ \lambda_1 \ge 0, \quad \lambda_3 \ge 0 \end{cases}$$

Substituting $x_1 = 1$ and $x_2 = 3$ gives the system

$$\begin{cases} \lambda_1 - 4\lambda_3 = -4 \\ \lambda_1 - \lambda_3 = 3 \\ \lambda_1 \ge 0, \quad \lambda_3 \ge 0 \end{cases}$$

which has the solution $\lambda_1 = \dfrac{16}{3} \ge 0$, $\lambda_3 = \dfrac{7}{3} \ge 0$, so $\boldsymbol{x}_0$ is a KKT point.
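The multiplier computation above reduces to a 2-by-2 linear system; a small NumPy check (variable names are mine) looks like this:

```python
import numpy as np

# Stationarity at x0 = (1, 3): grad f + λ1 * grad h1 + λ3 * grad h3 = 0
grad_f  = np.array([4.0, -3.0])
grad_h1 = np.array([1.0, 1.0])               # gradient of x1 + x2 - 4
grad_h3 = np.array([2.0 * (1 - 3), -1.0])    # gradient of (x1-3)^2 - x2 - 1 at x1 = 1
lams = np.linalg.solve(np.column_stack([grad_h1, grad_h3]), -grad_f)
print(lams)   # [16/3, 7/3]: both nonnegative, so x0 is a KKT point
```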

3. Quadratic programming

3.1 General form of quadratic programming

A constrained optimization problem whose objective function is quadratic and whose constraints are linear is called a quadratic program. The general form of quadratic programming is

$$\begin{matrix} \min & \dfrac{1}{2}\boldsymbol{x}^{\rm T}\boldsymbol{G}\boldsymbol{x} + \boldsymbol{c}^{\rm T} \boldsymbol{x} & \\ \mathrm{s.t.} & \boldsymbol{a}_i^{\rm T} \boldsymbol{x} = b_i, & i = 1, 2, \cdots, l \\ & \boldsymbol{a}_j^{\rm T} \boldsymbol{x} \le b_j, & j = l+1, l+2, \cdots, m \end{matrix}$$

3.2 Equality constrained quadratic programming

If the quadratic programming problem contains no inequality constraints, it reduces to

$$\begin{matrix} \min & \dfrac{1}{2}\boldsymbol{x}^{\rm T}\boldsymbol{G}\boldsymbol{x} + \boldsymbol{c}^{\rm T} \boldsymbol{x} \\ \mathrm{s.t.} & \boldsymbol{A} \boldsymbol{x} = \boldsymbol{b} \end{matrix}$$

If the matrix $\boldsymbol{G}$ is positive semidefinite and all rows of $\boldsymbol{A}$ are linearly independent, then the KKT points of the problem coincide with its optimal solutions, so solving the linear system

$$\begin{bmatrix} \boldsymbol{G} & \boldsymbol{A}^{\rm T} \\ \boldsymbol{A} & \boldsymbol{O} \end{bmatrix} \begin{bmatrix} \boldsymbol{x} \\ \boldsymbol{\lambda} \end{bmatrix} = \begin{bmatrix} -\boldsymbol{c} \\ \boldsymbol{b} \end{bmatrix}$$

yields the optimal solution of the problem.
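This KKT linear system is straightforward to solve numerically; below is a minimal NumPy sketch (a helper of my own, not from the original notes):

```python
import numpy as np

def solve_eq_qp(G, c, A, b):
    """Solve min 1/2 x^T G x + c^T x  s.t.  A x = b
    via the KKT linear system [[G, A^T], [A, 0]] [x; λ] = [-c; b]."""
    n, m = G.shape[0], A.shape[0]
    K = np.block([[G, A.T], [A, np.zeros((m, m))]])
    z = np.linalg.solve(K, np.concatenate([-c, b]))
    return z[:n], z[n:]   # primal solution x, multipliers λ
```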

[Example 3] Solve the quadratic programming problem

$$\begin{matrix} \min & x_1^2 + x_2^2 + x_3^2 - x_1x_2 - x_2x_3 + 2x_1 - x_2 \\ \mathrm{s.t.} & 3x_1 - x_2 - x_3 = 0 \\ & 2x_1 - x_2 - x_3 = 0 \end{matrix}$$

[Solution] Write the problem in matrix form:

$$\begin{matrix} \min & \dfrac{1}{2} [x_1, x_2, x_3] \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + [2, -1, 0] \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \\ \\ \mathrm{s.t.} & \begin{bmatrix} 3 & -1 & -1 \\ 2 & -1 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{matrix}$$

The matrix $\begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$ is positive definite, so we solve the linear system

$$\begin{bmatrix} 2 & -1 & 0 & 3 & 2 \\ -1 & 2 & -1 & -1 & -1 \\ 0 & -1 & 2 & -1 & -1 \\ 3 & -1 & -1 & 0 & 0 \\ 2 & -1 & -1 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \lambda_1 \\ \lambda_2 \end{bmatrix} = \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

which gives $[x_1, x_2, x_3, \lambda_1, \lambda_2] = \left[0, \dfrac{1}{6}, -\dfrac{1}{6}, -\dfrac{5}{6}, \dfrac{1}{3}\right]$. So the optimal solution is $[x_1, x_2, x_3] = \left[ 0, \dfrac{1}{6}, -\dfrac{1}{6} \right]$ and the optimal value is $-\dfrac{1}{12}$. (Note that subtracting the two constraints forces $x_1 = 0$ and $x_3 = -x_2$, which confirms this solution.)
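Plugging the data of Example 3 into the `solve_eq_qp` sketch above reproduces this result:

```python
G = np.array([[2., -1., 0.], [-1., 2., -1.], [0., -1., 2.]])
c = np.array([2., -1., 0.])
A = np.array([[3., -1., -1.], [2., -1., -1.]])
b = np.zeros(2)
x, lam = solve_eq_qp(G, c, A, b)
print(x, lam)   # x ≈ [0, 1/6, -1/6], λ ≈ [-5/6, 1/3]
```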

3.3 Active set method

For the general form of the quadratic programming problem, if $\boldsymbol{G}$ is a positive definite matrix, the following algorithm yields the optimal solution:

  1. Give an initial feasible point $\boldsymbol{x}$ of the problem
  2. Initialize the active set $I \gets A(\boldsymbol{x})$
  3. $\bold{while} \; \mathrm{True} \; \bold{do}$
  4. $\qquad$ Solve the following equality-constrained quadratic programming subproblem to obtain $\boldsymbol{d}$ and $\boldsymbol{\lambda}$: $$\begin{matrix} \underset{\boldsymbol{d}}{\min} & \dfrac{1}{2} \boldsymbol{d}^{\rm T} \boldsymbol{G} \boldsymbol{d} + (\boldsymbol{G} \boldsymbol{x} + \boldsymbol{c})^{\rm T} \boldsymbol{d} \\ \mathrm{s.t.} & \boldsymbol{a}_i^{\rm T} \boldsymbol{d} = 0, \ i \in I \end{matrix}$$
  5. $\qquad \bold{if} \; \boldsymbol{d} = \boldsymbol{0} \; \bold{do}$
  6. $\qquad \qquad \bold{if} \; \boldsymbol{\lambda} \ge \boldsymbol{0} \; \bold{do}$
  7. $\qquad \qquad \qquad \bold{return} \; \boldsymbol{x}$
  8. $\qquad \qquad \bold{else}$
  9. $\qquad \qquad \qquad I \gets I \setminus \lbrace \arg\min_i \lambda_i \rbrace$
  10. $\qquad \qquad \bold{end}$
  11. $\qquad \bold{else}$
  12. $\qquad \qquad \alpha \gets \underset{i \notin I}{\min} \left\lbrace \dfrac{b_i - \boldsymbol{a}_i^{\rm T} \boldsymbol{x}}{\boldsymbol{a}_i^{\rm T} \boldsymbol{d}} \;\middle|\; \boldsymbol{a}_i^{\rm T} \boldsymbol{d} > 0 \right\rbrace$
  13. $\qquad \qquad \bold{if} \; \alpha < 1 \; \bold{do}$
  14. $\qquad \qquad \qquad i \gets \underset{i \notin I}{\arg\min} \left\lbrace \dfrac{b_i - \boldsymbol{a}_i^{\rm T} \boldsymbol{x}}{\boldsymbol{a}_i^{\rm T} \boldsymbol{d}} \;\middle|\; \boldsymbol{a}_i^{\rm T} \boldsymbol{d} > 0 \right\rbrace$
  15. $\qquad \qquad \qquad I \gets I \cup \lbrace i \rbrace$
  16. $\qquad \qquad \bold{else}$
  17. $\qquad \qquad \qquad \alpha \gets 1$
  18. $\qquad \qquad \bold{end}$
  19. $\qquad \qquad \boldsymbol{x} \gets \boldsymbol{x} + \alpha \boldsymbol{d}$
  20. $\qquad \bold{end}$
  21. $\bold{end}$

The above algorithm is called the active set method; a Python sketch of it follows.
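This is a minimal implementation of the algorithm above, reusing the `solve_eq_qp` helper from Section 3.2. It assumes all constraints are inequalities $\boldsymbol{a}_i^{\rm T}\boldsymbol{x} \le b_i$ and $\boldsymbol{G}$ is positive definite, uses 0-based indices, and has no safeguards against degeneracy or cycling:

```python
import numpy as np

def active_set_qp(G, c, A, b, x, tol=1e-10, max_iter=100):
    """Active set method for min 1/2 x^T G x + c^T x  s.t.  A x <= b.
    x must be a feasible starting point; returns a KKT point."""
    m = A.shape[0]
    I = {i for i in range(m) if abs(A[i] @ x - b[i]) <= tol}   # active set
    for _ in range(max_iter):
        W = sorted(I)
        g = G @ x + c
        if W:   # equality-constrained subproblem in the direction d
            d, lam = solve_eq_qp(G, g, A[W], np.zeros(len(W)))
        else:
            d, lam = np.linalg.solve(G, -g), np.empty(0)
        if np.allclose(d, 0.0, atol=tol):
            if lam.size == 0 or lam.min() >= -tol:
                return x                               # λ >= 0: optimal
            I.remove(W[int(np.argmin(lam))])           # drop most negative multiplier
        else:
            alpha, blocking = 1.0, None                # step to nearest blocking constraint
            for i in set(range(m)) - I:
                aid = A[i] @ d
                if aid > tol:
                    step = (b[i] - A[i] @ x) / aid
                    if step < alpha:
                        alpha, blocking = step, i
            x = x + alpha * d
            if blocking is not None:
                I.add(blocking)
    return x
```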

[Example 4] Solve the quadratic programming problem

$$\begin{matrix} \min & (x_1 - 1)^2 + (x_2 - 2)^2 \\ \mathrm{s.t.} & x_1 + x_2 \le 1 \\ & x_1, x_2 \ge 0 \end{matrix}$$

[Solution] Dropping the constant term, the objective is $x_1^2 + x_2^2 - 2x_1 - 4x_2$; writing the nonnegativity constraints as $-x_1 \le 0$ and $-x_2 \le 0$, the problem data are

$$\boldsymbol{G} = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}, \quad \boldsymbol{c} = \begin{bmatrix} -2 \\ -4 \end{bmatrix}, \quad \boldsymbol{a}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \boldsymbol{a}_2 = \begin{bmatrix} -1 \\ 0 \end{bmatrix}, \quad \boldsymbol{a}_3 = \begin{bmatrix} 0 \\ -1 \end{bmatrix}, \quad \boldsymbol{b} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$

Clearly $\boldsymbol{x}_0 = [0, 0]^{\rm T}$ is feasible; initialize $I_0 = A(\boldsymbol{x}_0) = \lbrace 2, 3 \rbrace$.

Iteration 1:
Solve the subproblem

$$\begin{matrix} \min & d_1^2 + d_2^2 - 2d_1 - 4d_2 \\ \mathrm{s.t.} & -d_1 = 0 \\ & -d_2 = 0 \end{matrix}$$

which gives $\boldsymbol{d} = [0, 0]^{\rm T}$ and $\boldsymbol{\lambda} = [0, -2, -4]^{\rm T}$. Since $\boldsymbol{\lambda} \not\ge \boldsymbol{0}$ and $\arg\min_i \lambda_i = 3$, update $I_1 = \lbrace 2 \rbrace$.

Iteration 2:
Solve the subproblem

$$\begin{matrix} \min & d_1^2 + d_2^2 - 2d_1 - 4d_2 \\ \mathrm{s.t.} & -d_1 = 0 \end{matrix}$$

which gives $\boldsymbol{d} = [0, 2]^{\rm T} \ne \boldsymbol{0}$. Compute

$$\alpha = \underset{i \in \lbrace 1, 3 \rbrace}{\min} \left\lbrace \dfrac{b_i - \boldsymbol{a}_i^{\rm T} \boldsymbol{x}_0}{\boldsymbol{a}_i^{\rm T} \boldsymbol{d}} \;\middle|\; \boldsymbol{a}_i^{\rm T} \boldsymbol{d} > 0 \right\rbrace = \dfrac{1}{2} < 1$$

attained at $i = 1$, so update $I_2 = \lbrace 1, 2 \rbrace$ and $\boldsymbol{x}_2 = \boldsymbol{x}_0 + \alpha\boldsymbol{d} = [0, 1]^{\rm T}$.

Iteration 3:
Solve the subproblem at $\boldsymbol{x}_2$ (the linear term is now $(\boldsymbol{G}\boldsymbol{x}_2 + \boldsymbol{c})^{\rm T}\boldsymbol{d}$):

$$\begin{matrix} \min & d_1^2 + d_2^2 - 2d_1 - 2d_2 \\ \mathrm{s.t.} & d_1 + d_2 = 0 \\ & -d_1 = 0 \end{matrix}$$

which gives $\boldsymbol{d} = [0, 0]^{\rm T}$ and $\boldsymbol{\lambda} = [2, 0, 0]^{\rm T} \ge \boldsymbol{0}$, so the iteration terminates.

So the optimal solution is $[0, 1]^{\rm T}$, and the optimal value of the original objective is $(0-1)^2 + (1-2)^2 = 2$.
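Running the `active_set_qp` sketch on Example 4's data reproduces the iterates above (indices are 0-based in code):

```python
G = np.array([[2., 0.], [0., 2.]])
c = np.array([-2., -4.])
A = np.array([[1., 1.], [-1., 0.], [0., -1.]])
b = np.array([1., 0., 0.])
print(active_set_qp(G, c, A, b, x=np.array([0., 0.])))   # -> [0. 1.]
```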

4. Penalty function method and barrier function method

4.1 Penalty function method

The penalty function method converts a constrained problem into an unconstrained optimization problem by adding a penalty term, so that the original problem can be solved with unconstrained optimization methods. For the general form of the constrained optimization problem

$$\begin{matrix} \min & f(\boldsymbol{x}) & \\ \mathrm{s.t.} & h_i(\boldsymbol{x}) = 0, & i = 1, 2, \cdots, l \\ & h_j(\boldsymbol{x}) \le 0, & j = l+1, l+2, \cdots, m \end{matrix}$$

the steps of the penalty function method are:

  1. Choose an initial penalty coefficient $\rho_1 > 0$, a precision $\varepsilon > 0$, and an initial point $\boldsymbol{x}_0$; set the iteration counter to $k = 1$
  2. At the $k$-th iteration, solve the unconstrained optimization problem $$\min P(\boldsymbol{x}, \rho_k) = f(\boldsymbol{x}) + \rho_k \left[ \sum_{i=1}^{l}h_i^2(\boldsymbol{x}) + \sum_{j=l+1}^{m} \left(\max \left\lbrace 0, h_j(\boldsymbol{x}) \right\rbrace\right)^2 \right]$$ and denote its optimal solution by $\boldsymbol{x}_k$
  3. If the penalty term satisfies $$\rho_k \left[ \sum_{i=1}^{l}h_i^2(\boldsymbol{x}_k) + \sum_{j=l+1}^{m} \left(\max \left\lbrace 0, h_j(\boldsymbol{x}_k) \right\rbrace\right)^2 \right] \le \varepsilon$$ the iteration ends and the optimal solution is $\boldsymbol{x}_k$; otherwise choose $\rho_{k+1} > \rho_k$ and continue iterating

Generally, when solving by hand, it suffices to carry out one iteration symbolically and let $\rho \to +\infty$ to obtain the solution.
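A minimal numerical sketch of these steps, using `scipy.optimize.minimize` as the inner unconstrained solver (that choice, and the loose tolerances, are assumptions of this note):

```python
import numpy as np
from scipy.optimize import minimize

def penalty_method(f, eqs, ineqs, x0, rho=1.0, eps=1e-6, growth=10.0, max_iter=30):
    """Quadratic penalty method: eqs are callables h_i with h_i(x) = 0,
    ineqs are callables h_j with h_j(x) <= 0."""
    x = np.asarray(x0, dtype=float)
    penalty = lambda x: (sum(h(x)**2 for h in eqs)
                         + sum(max(0.0, h(x))**2 for h in ineqs))
    for _ in range(max_iter):
        x = minimize(lambda x: f(x) + rho * penalty(x), x).x
        if rho * penalty(x) <= eps:     # penalty term small enough: stop
            return x
        rho *= growth                   # otherwise increase rho and repeat
    return x
```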

[Example 5] Use the penalty function method to solve

$$\begin{matrix} \min & x_1^2 + x_2^2 \\ \mathrm{s.t.} & x_1 - 1 \ge 0 \\ & x_1 + x_2 = 3 \end{matrix}$$

[Solution] Let

$$\begin{aligned} P(x_1, x_2, \rho) & = x_1^2 + x_2^2 + \rho \left( x_1 + x_2 - 3 \right)^2 + \rho \left( \max \lbrace 0, 1 - x_1 \rbrace \right)^2 \\ & = \begin{cases} x_1^2 + x_2^2 + \rho \left( x_1 + x_2 - 3 \right)^2, & x_1 > 1 \\ x_1^2 + x_2^2 + \rho \left( x_1 + x_2 - 3 \right)^2 + \rho(1 - x_1)^2, & x_1 \le 1 \end{cases} \end{aligned}$$

Then

$$\dfrac{\partial P}{\partial x_1} = \begin{cases} 2x_1 + 2\rho ( x_1 + x_2 - 3), & x_1 > 1 \\ 2x_1 + 2\rho ( x_1 + x_2 - 3) - 2\rho(1 - x_1), & x_1 \le 1 \end{cases}, \qquad \dfrac{\partial P}{\partial x_2} = 2x_2 + 2\rho ( x_1 + x_2 - 3)$$

Setting $\dfrac{\partial P}{\partial x_1} = \dfrac{\partial P}{\partial x_2} = 0$ gives

$$\boldsymbol{x} = \begin{cases} \left( \dfrac{3\rho}{2\rho + 1}, \dfrac{3\rho}{2\rho + 1} \right)^{\rm T}, & \rho > 1 \\ \dfrac{\rho}{\rho^2 + 3\rho + 1} \left( \rho + 4, 2\rho + 3 \right)^{\rm T}, & 0 < \rho \le 1 \end{cases}$$

So the optimal solution of the original problem is

$$\lim_{\rho \to +\infty} \boldsymbol{x} = \lim_{\rho \to +\infty} \left( \dfrac{3\rho}{2\rho + 1}, \dfrac{3\rho}{2\rho + 1} \right)^{\rm T} = \left( \dfrac{3}{2}, \dfrac{3}{2} \right)^{\rm T}$$

and the optimal value is $\dfrac{9}{2}$.
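Applied to Example 5 (with $x_1 - 1 \ge 0$ rewritten as $1 - x_1 \le 0$), the `penalty_method` sketch converges to $(3/2, 3/2)$:

```python
x = penalty_method(f=lambda x: x[0]**2 + x[1]**2,
                   eqs=[lambda x: x[0] + x[1] - 3],
                   ineqs=[lambda x: 1 - x[0]],
                   x0=[0.0, 0.0])
print(x)   # ≈ [1.5, 1.5]
```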

4.2 Barrier function method

For constrained optimization problems with only inequality constraints,

$$\begin{matrix} \min & f(\boldsymbol{x}) & \\ \mathrm{s.t.} & h_i(\boldsymbol{x}) \le 0, & i = 1, 2, \cdots, m \end{matrix}$$

the barrier function method constructs a barrier function $b(\boldsymbol{x})$ to convert the constrained problem into an unconstrained one. Two common constructions are

$$b_1(\boldsymbol{x}) = -\sum_{i=1}^{m}\dfrac{1}{h_i(\boldsymbol{x})}, \qquad b_2(\boldsymbol{x}) = -\sum_{i=1}^{m}\ln \left[-h_i(\boldsymbol{x}) \right]$$

where $b_1(\boldsymbol{x})$ is the inverse barrier function and $b_2(\boldsymbol{x})$ is the logarithmic barrier function. The steps of the barrier function method are:

  1. Choose an initial barrier coefficient $r_1 > 0$, a precision $\varepsilon > 0$, and a strictly feasible initial point $\boldsymbol{x}_0$; set the iteration counter to $k = 1$
  2. At the $k$-th iteration, solve the unconstrained optimization problem $\min B(\boldsymbol{x}, r_k) = f(\boldsymbol{x}) + r_k b(\boldsymbol{x})$ and denote its optimal solution by $\boldsymbol{x}_k$
  3. If the barrier term satisfies $r_k b(\boldsymbol{x}_k) \le \varepsilon$, the iteration ends and the optimal solution is $\boldsymbol{x}_k$; otherwise choose $r_{k+1} \in (0, r_k)$ and continue iterating

Generally, when solving by hand, it suffices to carry out one iteration symbolically and let $r \to 0^+$ to obtain the optimal solution.
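A rough sketch of the logarithmic barrier method, again with `scipy.optimize.minimize` as the inner solver (my assumption). The starting point must be strictly feasible, and the barrier returns $+\infty$ outside the interior so the derivative-free inner solver stays inside; the absolute value in the stopping test guards against the log barrier going negative:

```python
import numpy as np
from scipy.optimize import minimize

def log_barrier_method(f, ineqs, x0, r=1.0, eps=1e-6, shrink=0.5, max_iter=40):
    """Log-barrier method for min f(x) s.t. h_i(x) <= 0; x0 strictly feasible."""
    x = np.asarray(x0, dtype=float)
    def barrier(x):
        vals = [h(x) for h in ineqs]
        if any(v >= 0 for v in vals):          # left the interior
            return np.inf
        return -sum(np.log(-v) for v in vals)
    for _ in range(max_iter):
        x = minimize(lambda x: f(x) + r * barrier(x), x,
                     method="Nelder-Mead").x   # derivative-free: tolerates inf
        if abs(r * barrier(x)) <= eps:         # barrier term negligible: stop
            return x
        r *= shrink                            # otherwise shrink r and repeat
    return x
```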

[Example 6] Use the barrier function method to solve

$$\begin{matrix} \min & x_1^2 + x_2^2 \\ \mathrm{s.t.} & x_1 - x_2 + 1 \le 0 \end{matrix}$$

[Solution] Let $B(x_1, x_2, r) = x_1^2 + x_2^2 - r\ln(x_2 - x_1 - 1)$. Then

$$\dfrac{\partial B}{\partial x_1} = 2x_1 + \dfrac{r}{x_2 - x_1 - 1}, \qquad \dfrac{\partial B}{\partial x_2} = 2x_2 - \dfrac{r}{x_2 - x_1 - 1}$$

Setting $\dfrac{\partial B}{\partial x_1} = \dfrac{\partial B}{\partial x_2} = 0$ forces $x_2 = -x_1$ and $4x_1^2 + 2x_1 - r = 0$, whose root inside the feasible region is

$$\boldsymbol{x} = \left( -\dfrac{1+\sqrt{1 + 4r}}{4}, \dfrac{1+\sqrt{1 + 4r}}{4} \right)^{\rm T}$$

So the optimal solution of the original problem is

$$\lim_{r \to 0^+} \boldsymbol{x} = \lim_{r \to 0^+} \left( -\dfrac{1+\sqrt{1 + 4r}}{4}, \dfrac{1+\sqrt{1 + 4r}}{4} \right)^{\rm T} = \left( -\dfrac{1}{2}, \dfrac{1}{2} \right)^{\rm T}$$

and the optimal value is $\dfrac{1}{2}$.
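Applied to Example 6 with the strictly feasible start $(0, 2)$, the `log_barrier_method` sketch approaches $(-1/2, 1/2)$:

```python
x = log_barrier_method(f=lambda x: x[0]**2 + x[1]**2,
                       ineqs=[lambda x: x[0] - x[1] + 1],
                       x0=[0.0, 2.0])
print(x)   # ≈ [-0.5, 0.5]
```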

4.3 Mixed penalty function method

The mixed penalty function method uses the penalty function and the barrier function together; its objective function is

$$F(\boldsymbol{x}, r) = f(\boldsymbol{x}) + r b(\boldsymbol{x}) + \dfrac{p(\boldsymbol{x})}{r}$$

where $b(\boldsymbol{x})$ is a barrier function and $p(\boldsymbol{x})$ is a penalty function.

5. Augmented Lagrangian function method

For the general form of the constrained optimization problem

$$\begin{matrix} \min & f(\boldsymbol{x}) & \\ \mathrm{s.t.} & h_i(\boldsymbol{x}) = 0, & i = 1, 2, \cdots, l \\ & h_j(\boldsymbol{x}) \le 0, & j = l+1, l+2, \cdots, m \end{matrix}$$

define its augmented Lagrangian function

$$L_\sigma(\boldsymbol{x}, \boldsymbol{\lambda}) = f(\boldsymbol{x}) + \sum_{i=1}^{l}\lambda_i h_i(\boldsymbol{x}) + \dfrac{\sigma}{2}\sum_{i=1}^{l}h_i^2(\boldsymbol{x}) + \dfrac{1}{2\sigma} \sum_{j=l+1}^{m} \left\lbrace \left[ \max \left\lbrace 0, \lambda_j + \sigma h_j(\boldsymbol{x}) \right\rbrace \right]^2 - \lambda_j^2 \right\rbrace$$

At the $k$-th iteration, solve $\nabla_{\boldsymbol{x}}L_\sigma(\boldsymbol{x}, \boldsymbol{\lambda}_k) = \boldsymbol{0}$ for $\boldsymbol{x}_k$, then update the multipliers by

$$(\boldsymbol{\lambda}_{k+1})_i = \begin{cases}(\boldsymbol{\lambda}_k)_i + \sigma h_i(\boldsymbol{x}_k), & 1 \le i \le l \\ \max \lbrace 0, (\boldsymbol{\lambda}_k)_i + \sigma h_i(\boldsymbol{x}_k) \rbrace, & l < i \le m \end{cases}$$

Generally, when solving by hand, one can iterate once symbolically, take the limit of $\boldsymbol{\lambda}_k$, and substitute it back to obtain the optimal solution.

[Example 7] Use the augmented Lagrangian function method to solve

$$\begin{matrix} \min & 3x_1^2 + x_2^2 \\ \mathrm{s.t.} & x_1 + x_2 = 1 \end{matrix}$$

[Solution] Let

$$L_\sigma(x_1, x_2, \lambda) = 3x_1^2 + x_2^2 + \lambda(x_1 + x_2 - 1) + \dfrac{\sigma}{2}(x_1 + x_2 - 1)^2$$

Then

$$\nabla_{\boldsymbol{x}}L_\sigma(x_1, x_2, \lambda) = [6x_1 + \lambda + \sigma(x_1 + x_2 - 1), \ 2x_2 + \lambda + \sigma(x_1 + x_2 - 1)]^{\rm T}$$

Setting $\nabla_{\boldsymbol{x}}L_\sigma(x_1, x_2, \lambda_k) = \boldsymbol{0}$ gives

$$\boldsymbol{x}_k = \left[ \dfrac{\sigma - \lambda_k}{4\sigma + 6}, \dfrac{3\sigma - 3\lambda_k}{4\sigma + 6} \right]^{\rm T}$$

The multiplier update is then

$$\lambda_{k+1} = \lambda_k + \sigma \left( \dfrac{\sigma - \lambda_k}{4\sigma + 6} + \dfrac{3\sigma - 3\lambda_k}{4\sigma + 6} - 1 \right) = \dfrac{3(\lambda_k - \sigma)}{2\sigma + 3}$$

When $\lambda_1 > -\dfrac{3}{2}$, mathematical induction shows $\lambda_k > -\dfrac{3}{2}$ for all $k$, and

$$\lambda_{k+1} - \lambda_k = \dfrac{3(\lambda_k - \sigma)}{2\sigma + 3} - \lambda_k = -\dfrac{\sigma}{2\sigma + 3}(3 + 2\lambda_k) < 0$$

so the sequence $\lbrace \lambda_k \rbrace$ is monotonically decreasing and bounded below; hence its limit exists, say $\gamma$. Taking the limit on both sides of the recursion gives $\gamma = \dfrac{3(\gamma - \sigma)}{2\sigma + 3}$, whose solution is $\gamma = -\dfrac{3}{2}$. Substituting back,

$$\lim_{k \to +\infty} \boldsymbol{x}_k = \left[ \dfrac{\sigma - \gamma}{4\sigma + 6}, \dfrac{3\sigma - 3\gamma}{4\sigma + 6} \right]^{\rm T} = \left[ \dfrac{1}{4}, \dfrac{3}{4} \right]^{\rm T}$$

So the optimal solution is $\left[ \dfrac{1}{4}, \dfrac{3}{4} \right]^{\rm T}$ and the optimal value is $\dfrac{3}{4}$.
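A minimal numerical sketch of the augmented Lagrangian iteration for a single equality constraint (inner minimization again via `scipy.optimize.minimize`; function and parameter names are mine), verified on Example 7:

```python
import numpy as np
from scipy.optimize import minimize

def augmented_lagrangian(f, h, x0, lam=0.0, sigma=10.0, tol=1e-6, max_iter=50):
    """ALM for min f(x) s.t. h(x) = 0 (a single equality constraint)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        L = lambda x: f(x) + lam * h(x) + 0.5 * sigma * h(x)**2
        x = minimize(L, x).x          # minimize L_sigma(x, lam_k) over x
        if abs(h(x)) <= tol:          # loose tol: inner solver runs at default precision
            return x, lam
        lam += sigma * h(x)           # multiplier update
    return x, lam

x, lam = augmented_lagrangian(f=lambda x: 3*x[0]**2 + x[1]**2,
                              h=lambda x: x[0] + x[1] - 1,
                              x0=[0.0, 0.0])
print(x, lam)   # x ≈ [0.25, 0.75], λ → -3/2
```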
