[Optimization Methods Study Notes] Chapter 3: Constrained Optimization Methods

1. Constrained Optimization Problems

1.1 General Form of a Constrained Optimization Problem

The general form of a constrained optimization problem is
$$\begin{matrix} \min & f(\boldsymbol{x}) \\ \text{s.t.} & h_i(\boldsymbol{x}) = 0 & i = 1, 2, \cdots, l \\ & h_j(\boldsymbol{x}) \le 0 & j = l+1, l+2, \cdots, m \end{matrix}$$
Here $f(\boldsymbol{x})$ is the objective function, $h_i(\boldsymbol{x}) = 0$ are the equality constraints, and $h_j(\boldsymbol{x}) \le 0$ are the inequality constraints.
The set $\varOmega = \lbrace \boldsymbol{x} \mid h_i(\boldsymbol{x}) = 0, \; h_j(\boldsymbol{x}) \le 0, \; i = 1, 2, \cdots, l, \; j = l+1, l+2, \cdots, m \rbrace$ is called the feasible region.

1.2 Feasible Directions and Feasible Descent Directions

Let $\boldsymbol{d}$ be a nonzero vector and $\boldsymbol{x} \in \varOmega$. If there exists $k > 0$ such that $\boldsymbol{x} + \alpha \boldsymbol{d} \in \varOmega$ for all $\alpha \in (0, k)$, then $\boldsymbol{d}$ is called a feasible direction at $\boldsymbol{x}$. If, in addition, $f(\boldsymbol{x} + \alpha \boldsymbol{d}) < f(\boldsymbol{x})$ for every such $\alpha$, then $\boldsymbol{d}$ is called a feasible descent direction (or improving feasible direction) at $\boldsymbol{x}$.

1.3 Active Index Set

For a point $\boldsymbol{x} \in \varOmega$, the set $A(\boldsymbol{x}) = \lbrace i \mid h_i(\boldsymbol{x}) = 0 \rbrace$ is called the active index set at $\boldsymbol{x}$. Intuitively, it consists of the indices of all equality constraints together with the indices of those inequality constraints that hold with equality at $\boldsymbol{x}$.
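The definition can be turned into a small check. The sketch below is only illustrative: the helper name `active_set`, the tolerance `tol`, and the three sample constraints are assumptions made for the example, and floating-point equality is replaced by a tolerance test.

```python
def active_set(constraints, num_eq, x, tol=1e-9):
    """Return the 1-based indices i with h_i(x) = 0 (within tol) at a feasible x.

    constraints: list of callables h_i, equality constraints first.
    num_eq: the number l of equality constraints.
    """
    active = []
    for i, h in enumerate(constraints, start=1):
        value = h(x)
        # Equalities must vanish, inequalities must be non-positive.
        infeasible = abs(value) > tol if i <= num_eq else value > tol
        if infeasible:
            raise ValueError(f"x violates constraint {i}")
        if abs(value) <= tol:
            active.append(i)
    return active


# Illustrative inequality constraints, each written in the form h_j(x) <= 0.
constraints = [
    lambda x: x[0] + x[1] - 4,             # x1 + x2 <= 4
    lambda x: -x[1] - 7,                   # x2 + 7 >= 0
    lambda x: (x[0] - 3) ** 2 - x[1] - 1,  # (x1 - 3)^2 <= 1 + x2
]
indices = active_set(constraints, num_eq=0, x=(1.0, 3.0))
print(indices)  # [1, 3]
```

At the point $(1, 3)^{\rm T}$ the first and third constraints hold with equality, so only their indices are reported.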

2. KKT Conditions

Suppose that, at a point $\boldsymbol{x}$ of the general constrained optimization problem, the gradients $\nabla h_i(\boldsymbol{x})$, $i \in A(\boldsymbol{x})$, are linearly independent. The Lagrangian function of the problem is
$$L(\boldsymbol{x}, \boldsymbol{\lambda}) = f(\boldsymbol{x}) + \sum_{i=1}^{m} \lambda_i h_i(\boldsymbol{x})$$
and the KKT conditions at $\boldsymbol{x}$ are
$$\begin{cases} \nabla_{\boldsymbol{x}} L(\boldsymbol{x}, \boldsymbol{\lambda}) = \boldsymbol{0} \\ h_i(\boldsymbol{x}) = 0, & i = 1, 2, \cdots, l \\ h_j(\boldsymbol{x}) \le 0, & j = l+1, l+2, \cdots, m \\ \lambda_j h_j(\boldsymbol{x}) = 0, & j = l+1, l+2, \cdots, m \\ \lambda_j \ge 0, & j = l+1, l+2, \cdots, m \end{cases}$$
If $\boldsymbol{x}$ satisfies the KKT conditions for some $\boldsymbol{\lambda}$, then $\boldsymbol{x}$ is called a KKT point and the corresponding $(\boldsymbol{x}, \boldsymbol{\lambda})$ is called a KKT pair.

[Example 1] Find all KKT points of the problem
$$\begin{matrix} \min & x_1x_2 \\ \text{s.t.} & x_1^2 + x_2^2 = 1 \end{matrix}$$
[Solution] Form the Lagrangian function
$$L(x_1, x_2, \lambda) = x_1x_2 + \lambda x_1^2 + \lambda x_2^2 - \lambda$$
The KKT conditions are
$$\begin{cases} x_2 + 2\lambda x_1 = 0 \\ x_1 + 2\lambda x_2 = 0 \\ x_1^2 + x_2^2 = 1 \end{cases}$$
Solving gives $x_1 = x_2 = \pm \dfrac{\sqrt2}{2}, \lambda = -\dfrac{1}{2}$ or $x_1 = -x_2 = \pm \dfrac{\sqrt2}{2}, \lambda = \dfrac{1}{2}$, so the KKT points are $\left( \dfrac{\sqrt2}{2}, \dfrac{\sqrt2}{2} \right)^{\rm T}$, $\left( -\dfrac{\sqrt2}{2}, \dfrac{\sqrt2}{2} \right)^{\rm T}$, $\left( \dfrac{\sqrt2}{2}, -\dfrac{\sqrt2}{2} \right)^{\rm T}$, $\left( -\dfrac{\sqrt2}{2}, -\dfrac{\sqrt2}{2} \right)^{\rm T}$.
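As a quick numerical sanity check of Example 1, the following sketch plugs the four candidate pairs into the stationarity and feasibility equations of the Lagrangian $x_1x_2 + \lambda(x_1^2 + x_2^2 - 1)$. The function name `kkt_residual` is an illustrative choice, not part of the original notes.

```python
import math


def kkt_residual(x1, x2, lam):
    """Largest absolute residual of the KKT system of min x1*x2 s.t. x1^2 + x2^2 = 1."""
    stationarity_1 = x2 + 2 * lam * x1      # dL/dx1
    stationarity_2 = x1 + 2 * lam * x2      # dL/dx2
    feasibility = x1 ** 2 + x2 ** 2 - 1     # equality constraint
    return max(abs(stationarity_1), abs(stationarity_2), abs(feasibility))


s = math.sqrt(2) / 2
candidates = [(s, s, -0.5), (-s, -s, -0.5), (s, -s, 0.5), (-s, s, 0.5)]
residuals = [kkt_residual(*point) for point in candidates]
objective_values = [x1 * x2 for x1, x2, _ in candidates]
print(residuals)
print(objective_values)
```

All four residuals vanish up to rounding. Comparing the objective values shows that only the two points with $x_1 = -x_2$ attain the minimum value $-\dfrac{1}{2}$; the other two KKT points are constrained maximizers.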

[Example 2] Determine whether the point $\boldsymbol{x}_0 = (1, 3)^{\rm T}$ is a KKT point of the problem
$$\begin{matrix} \min & 4x_1 - 3x_2 \\ \text{s.t.} & x_1 + x_2 \le 4 \\ & x_2 + 7 \ge 0 \\ & (x_1 - 3)^2 \le 1 + x_2 \end{matrix}$$
[Solution] The active index set at $\boldsymbol{x}_0$ is $A(\boldsymbol{x}_0) = \lbrace 1, 3 \rbrace$, so complementary slackness gives $\lambda_2 = 0$. Form the Lagrangian function
$$L(x_1, x_2, \lambda_1, \lambda_3) = 4x_1 - 3x_2 + \lambda_1(x_1 + x_2 - 4) + \lambda_3\left[(x_1-3)^2 - x_2 - 1\right]$$
The KKT conditions reduce to
$$\begin{cases} 4 + \lambda_1 + 2\lambda_3x_1 - 6\lambda_3 = 0 \\ -3 + \lambda_1 - \lambda_3 = 0 \\ \lambda_1 \ge 0, \lambda_3 \ge 0 \end{cases}$$
Substituting $x_1 = 1$, $x_2 = 3$ gives
$$\begin{cases} \lambda_1 - 4\lambda_3 = -4 \\ \lambda_1 - \lambda_3 = 3 \\ \lambda_1 \ge 0, \lambda_3 \ge 0 \end{cases}$$
This system has the solution $\lambda_1 = \dfrac{16}{3} \ge 0$, $\lambda_3 = \dfrac{7}{3} \ge 0$, so $\boldsymbol{x}_0$ is a KKT point.
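The multiplier computation in Example 2 is a 2-by-2 linear solve, which can be reproduced exactly with rational arithmetic. This is a minimal sketch; the variable names are illustrative and the system is solved by Cramer's rule.

```python
from fractions import Fraction

x1, x2 = Fraction(1), Fraction(3)
grad_f = (Fraction(4), Fraction(-3))     # gradient of 4*x1 - 3*x2
grad_h1 = (Fraction(1), Fraction(1))     # gradient of x1 + x2 - 4
grad_h3 = (2 * (x1 - 3), Fraction(-1))   # gradient of (x1 - 3)^2 - x2 - 1 at x0

# Stationarity: lam1 * grad_h1 + lam3 * grad_h3 = -grad_f (lam2 = 0, inactive).
a11, a12 = grad_h1[0], grad_h3[0]
a21, a22 = grad_h1[1], grad_h3[1]
r1, r2 = -grad_f[0], -grad_f[1]
det = a11 * a22 - a12 * a21
lam1 = (r1 * a22 - a12 * r2) / det
lam3 = (a11 * r2 - a21 * r1) / det

is_kkt_point = lam1 >= 0 and lam3 >= 0
print(lam1, lam3, is_kkt_point)  # 16/3 7/3 True
```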

3. Quadratic Programming

3.1 General Form of a Quadratic Program

A constrained optimization problem whose objective function is quadratic and whose constraints are linear is called a quadratic program. Its general form is
$$\begin{matrix} \min & \dfrac{1}{2}\boldsymbol{x}^{\rm T}\boldsymbol{G}\boldsymbol{x} + \boldsymbol{c}^{\rm T} \boldsymbol{x}\\ \text{s.t.} & \boldsymbol{a}_i^{\rm T} \boldsymbol{x} = b_i & i = 1, 2, \cdots, l \\ & \boldsymbol{a}_j^{\rm T} \boldsymbol{x} \le b_j & j = l+1, l+2, \cdots, m \end{matrix}$$

3.2 Equality-Constrained Quadratic Programs

If a quadratic program has no inequality constraints, it reduces to
$$\begin{matrix} \min & \dfrac{1}{2}\boldsymbol{x}^{\rm T}\boldsymbol{G}\boldsymbol{x} + \boldsymbol{c}^{\rm T} \boldsymbol{x}\\ \text{s.t.} & \boldsymbol{A} \boldsymbol{x} = \boldsymbol{b} \end{matrix}$$
If $\boldsymbol{G}$ is positive semidefinite and the rows of $\boldsymbol{A}$ are linearly independent, then the KKT points of the problem are exactly its optimal solutions. It therefore suffices to solve the linear system
$$\begin{bmatrix} \boldsymbol{G} & \boldsymbol{A}^{\rm T} \\ \boldsymbol{A} & \boldsymbol{O} \end{bmatrix} \begin{bmatrix} \boldsymbol{x} \\ \boldsymbol{\lambda} \end{bmatrix} = \begin{bmatrix} -\boldsymbol{c} \\ \boldsymbol{b} \end{bmatrix}$$
to obtain an optimal solution. The coefficient matrix is nonsingular when, for example, $\boldsymbol{G}$ is positive definite, in which case the optimal solution is unique.

[Example 3] Solve the quadratic program
$$\begin{matrix} \min & x_1^2 + x_2^2 + x_3^2 - x_1x_2 - x_2x_3 + 2x_1 - x_2\\ \text{s.t.} & 3x_1 - x_2 - x_3 = 0 \\ & 2x_1 - x_2 - x_3 = 0 \end{matrix}$$
[Solution] Write the problem in matrix form:
$$\begin{matrix} \min & \dfrac{1}{2} [x_1, x_2, x_3] \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + [2, -1, 0] \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \\ \\ \text{s.t.} & \begin{bmatrix} 3 & -1 & -1 \\ 2 & -1 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{matrix}$$
The matrix $\begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$ is positive definite. Solving the linear system
$$\begin{bmatrix} 2 & -1 & 0 & 3 & 2 \\ -1 & 2 & -1 & -1 & -1 \\ 0 & -1 & 2 & -1 & -1 \\ 3 & -1 & -1 & 0 & 0 \\ 2 & -1 & -1 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \lambda_1 \\ \lambda_2 \end{bmatrix} = \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$
gives $[x_1, x_2, x_3, \lambda_1, \lambda_2] = \left[ 0, \dfrac{1}{6}, -\dfrac{1}{6}, -\dfrac{5}{6}, \dfrac{1}{3} \right]$. (Subtracting the two constraints gives $x_1 = 0$, hence $x_3 = -x_2$.) The optimal solution is therefore $[x_1, x_2, x_3] = \left[ 0, \dfrac{1}{6}, -\dfrac{1}{6} \right]$, with optimal value $-\dfrac{1}{12}$.
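The KKT system of Example 3 can be solved exactly with a few lines of code. The sketch below uses only the standard library; `solve_linear` is a hypothetical helper (plain Gauss-Jordan elimination over `Fraction`) written for this illustration, and it assumes the KKT matrix is nonsingular.

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Solve a square, nonsingular linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Pick any row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


G = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
c = [2, -1, 0]
A = [[3, -1, -1], [2, -1, -1]]
b = [0, 0]

n, m = len(G), len(A)
# Assemble the KKT matrix [[G, A^T], [A, 0]] and the right-hand side [-c, b].
kkt = [G[i] + [A[k][i] for k in range(m)] for i in range(n)]
kkt += [A[k] + [0] * m for k in range(m)]
solution = solve_linear(kkt, [-v for v in c] + b)
x, lam = solution[:n], solution[n:]


def objective(x):
    """Evaluate 0.5 * x^T G x + c^T x."""
    quad = sum(x[i] * G[i][j] * x[j] for i in range(n) for j in range(n)) / 2
    return quad + sum(ci * xi for ci, xi in zip(c, x))


print([str(v) for v in x])    # ['0', '1/6', '-1/6']
print([str(v) for v in lam])  # ['-5/6', '1/3']
print(objective(x))           # -1/12
```

Exact arithmetic is used here only so that the fractions in the worked solution can be compared directly; for larger problems a floating-point linear algebra library would be the usual choice.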

3.3 Active-Set Method

For a quadratic program in general form, if $\boldsymbol{G}$ is positive definite, the following algorithm finds the optimal solution:

  1. Choose an initial feasible point $\boldsymbol{x}$
  2. Initialize the index set $I \gets A(\boldsymbol{x})$
  3. $\mathbf{while} \; \text{True} \; \mathbf{do}$
  4. $\qquad$ Solve the following equality-constrained quadratic program for $\boldsymbol{d}$ and $\boldsymbol{\lambda}$ (with $\lambda_i = 0$ for $i \notin I$): $$\begin{matrix} \underset{\boldsymbol{d}}{\min} & \dfrac{1}{2} \boldsymbol{d}^{\rm T} \boldsymbol{G} \boldsymbol{d} + (\boldsymbol{G} \boldsymbol{x} + \boldsymbol{c})^{\rm T} \boldsymbol{d} \\ \text{s.t.} & \boldsymbol{a}_i^{\rm T} \boldsymbol{d} = 0, \; i \in I \end{matrix}$$
  5. $\qquad \mathbf{if} \; \boldsymbol{d} = \boldsymbol{0} \; \mathbf{do}$
  6. $\qquad \qquad \mathbf{if} \; \lambda_i \ge 0$ for every inequality index $i \in I$ ($i > l$) $\mathbf{do}$
  7. $\qquad \qquad \qquad \mathbf{return} \; \boldsymbol{x}$
  8. $\qquad \qquad \mathbf{else}$
  9. $\qquad \qquad \qquad I \gets I \setminus \left\lbrace \underset{i \in I, \, i > l}{\arg\min} \; \lambda_i \right\rbrace$
  10. $\qquad \qquad \mathbf{end}$
  11. $\qquad \mathbf{else}$
  12. $\qquad \qquad \alpha \gets \underset{i \notin I}{\min} \left\lbrace \dfrac{b_i - \boldsymbol{a}_i^{\rm T} \boldsymbol{x}}{\boldsymbol{a}_i^{\rm T} \boldsymbol{d}} \;\middle|\; \boldsymbol{a}_i^{\rm T} \boldsymbol{d} > 0 \right\rbrace$
  13. $\qquad \qquad \mathbf{if} \; \alpha < 1 \; \mathbf{do}$
  14. $\qquad \qquad \qquad i \gets \underset{i \notin I}{\arg\min} \left\lbrace \dfrac{b_i - \boldsymbol{a}_i^{\rm T} \boldsymbol{x}}{\boldsymbol{a}_i^{\rm T} \boldsymbol{d}} \;\middle|\; \boldsymbol{a}_i^{\rm T} \boldsymbol{d} > 0 \right\rbrace$
  15. $\qquad \qquad \qquad I \gets I \cup \lbrace i \rbrace$
  16. $\qquad \qquad \mathbf{else}$
  17. $\qquad \qquad \qquad \alpha \gets 1$
  18. $\qquad \qquad \mathbf{end}$
  19. $\qquad \qquad \boldsymbol{x} \gets \boldsymbol{x} + \alpha \boldsymbol{d}$
  20. $\qquad \mathbf{end}$
  21. $\mathbf{end}$

This algorithm is called the active-set method.

[Example 4] Solve the quadratic program
$$\begin{matrix} \min & (x_1 - 1)^2 + (x_2 - 2)^2 \\ \text{s.t.} & x_1 + x_2 \le 1 \\ & x_1, x_2 \ge 0 \end{matrix}$$
[Solution] The coefficients of this problem are
$$\boldsymbol{G} = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}, \boldsymbol{c} = \begin{bmatrix} -2 \\ -4 \end{bmatrix}, \boldsymbol{a}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \boldsymbol{a}_2 = \begin{bmatrix} -1 \\ 0 \end{bmatrix}, \boldsymbol{a}_3 = \begin{bmatrix} 0 \\ -1 \end{bmatrix}, \boldsymbol{b} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$
Clearly $\boldsymbol{x}_0 = [0, 0]^{\rm T}$ is feasible. Initialize $I_0 = A(0, 0) = \lbrace 2, 3 \rbrace$.

Iteration 1:
Solve the quadratic program
$$\begin{matrix} \min & d_1^2 + d_2^2 - 2d_1 - 4d_2 \\ \text{s.t.} & -d_1 = 0 \\ & -d_2 = 0 \end{matrix}$$
to get $\boldsymbol{d} = [0, 0]^{\rm T}$ and $\boldsymbol{\lambda} = [0, -2, -4]^{\rm T}$, which is not nonnegative. Since $\arg\min \lambda_i = 3$, update $I_1 = \lbrace 2 \rbrace$; the point is unchanged, $\boldsymbol{x}_1 = \boldsymbol{x}_0 = [0, 0]^{\rm T}$.

Iteration 2:
Solve the quadratic program
$$\begin{matrix} \min & d_1^2 + d_2^2 - 2d_1 - 4d_2 \\ \text{s.t.} & -d_1 = 0 \end{matrix}$$
to get $\boldsymbol{d} = [0, 2]^{\rm T} \ne \boldsymbol{0}$. With the current point $[0, 0]^{\rm T}$, compute
$$\alpha = \underset{i \in \lbrace 1, 3 \rbrace}{\min} \left\lbrace \dfrac{b_i - \boldsymbol{a}_i^{\rm T} \boldsymbol{x}}{\boldsymbol{a}_i^{\rm T} \boldsymbol{d}} \;\middle|\; \boldsymbol{a}_i^{\rm T} \boldsymbol{d} > 0 \right\rbrace = \dfrac{1}{2} < 1$$
attained at $i = 1$ (only $\boldsymbol{a}_1^{\rm T}\boldsymbol{d} = 2$ is positive). Update $I_2 = \lbrace 1, 2 \rbrace$ and $\boldsymbol{x}_2 = [0, 1]^{\rm T}$.

Iteration 3:
At $\boldsymbol{x}_2 = [0, 1]^{\rm T}$ the gradient is $\boldsymbol{G}\boldsymbol{x}_2 + \boldsymbol{c} = [-2, -2]^{\rm T}$, so solve the quadratic program
$$\begin{matrix} \min & d_1^2 + d_2^2 - 2d_1 - 2d_2 \\ \text{s.t.} & d_1 + d_2 = 0 \\ & -d_1 = 0 \end{matrix}$$
to get $\boldsymbol{d} = [0, 0]^{\rm T}$ and $\boldsymbol{\lambda} = [2, 0, 0]^{\rm T} \ge \boldsymbol{0}$. The iteration terminates.

The optimal solution is therefore $[0, 1]^{\rm T}$, with optimal value $(0 - 1)^2 + (1 - 2)^2 = 2$.
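The iterations of Example 4 can be reproduced with a compact implementation of the active-set method. The following is a sketch under stated assumptions rather than a production solver: it handles inequality constraints only ($\boldsymbol{A}\boldsymbol{x} \le \boldsymbol{b}$), uses 0-based constraint indices, works in exact rational arithmetic, and assumes the gradients of the constraints in the working set stay linearly independent. The names `solve_linear`, `dot` and `active_set_qp` are illustrative.

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Solve a square, nonsingular linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def active_set_qp(G, c, A, b, x, max_iter=50):
    """Active-set sketch for min 0.5 x^T G x + c^T x  s.t.  A x <= b."""
    n = len(c)
    x = [Fraction(v) for v in x]
    work = [i for i in range(len(b)) if dot(A[i], x) == b[i]]   # I <- A(x)
    for _ in range(max_iter):
        grad = [dot(G[i], x) + c[i] for i in range(n)]           # G x + c
        m = len(work)
        # KKT system of the subproblem: [[G, A_I^T], [A_I, 0]] [d; lam] = [-grad; 0].
        kkt = [list(G[i]) + [A[k][i] for k in work] for i in range(n)]
        kkt += [list(A[k]) + [0] * m for k in work]
        sol = solve_linear(kkt, [-g for g in grad] + [0] * m)
        d, lam = sol[:n], sol[n:]
        if all(di == 0 for di in d):
            if all(li >= 0 for li in lam):
                return x, dict(zip(work, lam))
            work.pop(lam.index(min(lam)))        # drop the most negative multiplier
        else:
            alpha, blocking = Fraction(1), None
            for i in range(len(b)):
                slope = dot(A[i], d)
                if i not in work and slope > 0:
                    step = (b[i] - dot(A[i], x)) / slope
                    if step < alpha:
                        alpha, blocking = step, i
            x = [xi + alpha * di for xi, di in zip(x, d)]
            if blocking is not None:
                work.append(blocking)            # the blocking constraint becomes active
    raise RuntimeError("no convergence within max_iter")


G = [[2, 0], [0, 2]]
c = [-2, -4]
A = [[1, 1], [-1, 0], [0, -1]]   # rows a_1, a_2, a_3 (0-based indices 0, 1, 2)
b = [1, 0, 0]
x_opt, multipliers = active_set_qp(G, c, A, b, x=[0, 0])
f_opt = (x_opt[0] - 1) ** 2 + (x_opt[1] - 2) ** 2
print([str(v) for v in x_opt], f_opt)  # ['0', '1'] 2
```

The run visits the same working sets as the hand computation: $\lbrace 2, 3 \rbrace \to \lbrace 2 \rbrace \to \lbrace 1, 2 \rbrace$ in the 1-based numbering of the example.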

4. Penalty and Barrier Function Methods

4.1 Penalty Function Method

The penalty function method adds a penalty term to the objective, turning a constrained problem into an unconstrained one that can be solved with unconstrained optimization methods. For the general constrained optimization problem
$$\begin{matrix} \min & f(\boldsymbol{x}) \\ \text{s.t.} & h_i(\boldsymbol{x}) = 0 & i = 1, 2, \cdots, l \\ & h_j(\boldsymbol{x}) \le 0 & j = l+1, l+2, \cdots, m \end{matrix}$$
the method proceeds as follows:

  1. Choose $\rho_1 > 0$, a tolerance $\varepsilon > 0$, and an initial point $\boldsymbol{x}_0$; set the iteration counter $k = 1$
  2. In iteration $k$, solve the unconstrained problem $$\min P(\boldsymbol{x}, \rho_k) = f(\boldsymbol{x}) + \rho_k \left[ \sum_{i=1}^{l}h_i^2(\boldsymbol{x}) + \sum_{j=l+1}^{m} \left(\max \lbrace 0, h_j(\boldsymbol{x}) \rbrace\right)^2 \right]$$ and denote its solution by $\boldsymbol{x}_k$
  3. If the penalty term satisfies $$\rho_k \left[ \sum_{i=1}^{l}h_i^2(\boldsymbol{x}_k) + \sum_{j=l+1}^{m} \left(\max \lbrace 0, h_j(\boldsymbol{x}_k) \rbrace\right)^2 \right] \le \varepsilon$$ stop and return $\boldsymbol{x}_k$ as the optimal solution; otherwise choose $\rho_{k+1} > \rho_k$ and continue iterating

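When solving exercises by hand, one usually performs a single iteration with $\rho$ kept symbolic and then lets $\rho \to +\infty$ to obtain the optimal solution.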

[Example 5] Use the penalty function method to solve
$$\begin{matrix} \min & x_1^2 + x_2^2 \\ \text{s.t.} & x_1 - 1 \ge 0 \\ & x_1 + x_2 = 3 \end{matrix}$$
[Solution] Let
$$\begin{aligned} P(x_1, x_2, \rho) & = x_1^2 + x_2^2 + \rho \left( x_1 + x_2 - 3 \right)^2 + \rho \left( \max \lbrace 0, 1 - x_1 \rbrace \right)^2 \\ & = \begin{cases} x_1^2 + x_2^2 + \rho \left( x_1 + x_2 - 3 \right)^2, & x_1 > 1 \\ x_1^2 + x_2^2 + \rho \left( x_1 + x_2 - 3 \right)^2 + \rho(1 - x_1)^2, & x_1 \le 1 \end{cases} \end{aligned}$$
Then
$$\dfrac{\partial P}{\partial x_1} = \begin{cases} 2x_1 + 2\rho ( x_1 + x_2 - 3), & x_1 > 1 \\ 2x_1 + 2\rho ( x_1 + x_2 - 3) - 2\rho(1 - x_1), & x_1 \le 1 \end{cases}$$
$$\dfrac{\partial P}{\partial x_2} = 2x_2 + 2\rho ( x_1 + x_2 - 3)$$
Setting $\dfrac{\partial P}{\partial x_1} = \dfrac{\partial P}{\partial x_2} = 0$ gives
$$\boldsymbol{x} = \begin{cases} \left( \dfrac{3\rho}{2\rho + 1}, \dfrac{3\rho}{2\rho + 1} \right)^{\rm T}, & \rho > 1 \\ \dfrac{\rho}{\rho^2 + 3\rho + 1} \left( \rho + 4, 2\rho + 3 \right)^{\rm T}, & 0 < \rho \le 1 \end{cases}$$
so the optimal solution of the original problem is
$$\underset{\rho \to +\infty}{\lim} \boldsymbol{x} = \underset{\rho \to +\infty}{\lim} \left( \dfrac{3\rho}{2\rho + 1}, \dfrac{3\rho}{2\rho + 1} \right)^{\rm T} = \left( \dfrac{3}{2}, \dfrac{3}{2} \right)^{\rm T}$$
with optimal value $\dfrac{9}{2}$.
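The limiting argument of Example 5 can also be carried out numerically by following the three steps of the penalty function method. The sketch below is one possible realization and makes several assumptions that are not in the notes: the inner problem is minimized with Newton steps on the piecewise-quadratic $P$ (using a generalized Hessian at the kink $x_1 = 1$), $\rho$ is multiplied by 10 each outer iteration, and the names `minimize_penalty` and `penalty_method` are illustrative.

```python
def minimize_penalty(rho, x1, x2, newton_steps=20):
    """Minimize P(x, rho) of Example 5 with Newton steps on the piecewise-quadratic P."""
    for _ in range(newton_steps):
        s = x1 + x2 - 3.0                 # equality residual
        viol = max(0.0, 1.0 - x1)         # violation of x1 - 1 >= 0
        g1 = 2.0 * x1 + 2.0 * rho * s - 2.0 * rho * viol
        g2 = 2.0 * x2 + 2.0 * rho * s
        # Generalized Hessian: the inequality term contributes only where x1 < 1.
        h11 = 2.0 + 2.0 * rho + (2.0 * rho if x1 < 1.0 else 0.0)
        h12 = 2.0 * rho
        h22 = 2.0 + 2.0 * rho
        det = h11 * h22 - h12 * h12
        # Newton update x <- x - H^{-1} g, with the 2x2 inverse written out.
        x1, x2 = (x1 - (h22 * g1 - h12 * g2) / det,
                  x2 - (h11 * g2 - h12 * g1) / det)
    return x1, x2


def penalty_method(x1=0.0, x2=0.0, rho=1.0, eps=1e-6, growth=10.0, max_outer=30):
    """Outer loop: increase rho until the penalty term drops below eps."""
    for _ in range(max_outer):
        x1, x2 = minimize_penalty(rho, x1, x2)
        penalty = rho * ((x1 + x2 - 3.0) ** 2 + max(0.0, 1.0 - x1) ** 2)
        if penalty <= eps:
            return (x1, x2), rho
        rho *= growth
    raise RuntimeError("penalty term did not fall below eps")


(x1_opt, x2_opt), rho_final = penalty_method()
print(x1_opt, x2_opt, rho_final)
```

The iterates approach $\left( \dfrac{3}{2}, \dfrac{3}{2} \right)^{\rm T}$ from outside the feasible region, in line with the closed form $\dfrac{3\rho}{2\rho + 1} < \dfrac{3}{2}$.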

4.2 Barrier Function Method

For a constrained optimization problem with inequality constraints only,
$$\begin{matrix} \min & f(\boldsymbol{x}) \\ \text{s.t.} & h_i(\boldsymbol{x}) \le 0 & i = 1, 2, \cdots, m \end{matrix}$$
the barrier function method constructs a barrier function $b(\boldsymbol{x})$ to turn it into an unconstrained problem. Two common choices of $b(\boldsymbol{x})$ are
$$b_1(\boldsymbol{x}) = -\sum_{i=1}^{m}\dfrac{1}{h_i(\boldsymbol{x})}, \qquad b_2(\boldsymbol{x}) = -\sum_{i=1}^{m}\ln \left[-h_i(\boldsymbol{x}) \right]$$
where $b_1(\boldsymbol{x})$ is called the inverse barrier function and $b_2(\boldsymbol{x})$ the logarithmic barrier function. The method proceeds as follows:

  1. Choose $r_1 > 0$, a tolerance $\varepsilon > 0$, and an initial point $\boldsymbol{x}_0$ that is strictly feasible ($h_i(\boldsymbol{x}_0) < 0$ for all $i$); set the iteration counter $k = 1$
  2. In iteration $k$, solve the unconstrained problem $$\min B(\boldsymbol{x}, r_k) = f(\boldsymbol{x}) + r_kb(\boldsymbol{x})$$ and denote its solution by $\boldsymbol{x}_k$
  3. If the barrier term satisfies $$\left| r_kb(\boldsymbol{x}_k) \right| \le \varepsilon$$ stop and return $\boldsymbol{x}_k$ as the optimal solution; otherwise choose $r_{k+1} \in (0, r_k)$ and continue iterating

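When solving exercises by hand, one usually performs a single iteration with $r$ kept symbolic and then lets $r \to 0^+$ to obtain the optimal solution.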

[Example 6] Use the barrier function method to solve
$$\begin{matrix} \min & x_1^2 + x_2^2\\ \text{s.t.} & x_1 - x_2 + 1 \le 0 \end{matrix}$$
[Solution] Let
$$B(x_1, x_2, r) = x_1^2 + x_2^2 - r\ln(x_2 - x_1 - 1)$$
Then
$$\dfrac{\partial B}{\partial x_1} = 2x_1 + \dfrac{r}{x_2 - x_1 - 1}, \qquad \dfrac{\partial B}{\partial x_2} = 2x_2 - \dfrac{r}{x_2 - x_1 - 1}$$
Setting $\dfrac{\partial B}{\partial x_1} = \dfrac{\partial B}{\partial x_2} = 0$ gives $x_1 = -x_2$ and $4x_2^2 - 2x_2 - r = 0$. Taking the root with $x_2 - x_1 - 1 > 0$,
$$\boldsymbol{x} = \left( -\dfrac{1+\sqrt{1 + 4r}}{4}, \dfrac{1+\sqrt{1 + 4r}}{4} \right)^{\rm T}$$
so the optimal solution of the original problem is
$$\underset{r \to 0^+}{\lim} \boldsymbol{x} = \underset{r \to 0^+}{\lim} \left( -\dfrac{1+\sqrt{1 + 4r}}{4}, \dfrac{1+\sqrt{1 + 4r}}{4} \right)^{\rm T} = \left( -\dfrac{1}{2}, \dfrac{1}{2} \right)^{\rm T}$$
with optimal value $\dfrac{1}{2}$.
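For comparison with the closed form, Example 6 can be run as an actual barrier iteration. The sketch below rests on assumptions that are not in the notes: the inner minimization uses Newton steps that are halved until the slack $x_2 - x_1 - 1$ stays safely positive, $r$ shrinks by a factor of 10 per outer iteration, the stopping test uses the absolute value of the barrier term, and the starting point $(-1, 1)^{\rm T}$ is chosen because it is strictly feasible. The function names are illustrative.

```python
import math


def minimize_barrier(r, x1, x2, tol=1e-8, max_newton=100):
    """Minimize B(x, r) = x1^2 + x2^2 - r*ln(x2 - x1 - 1) by damped Newton steps."""
    for _ in range(max_newton):
        t = x2 - x1 - 1.0                  # slack of the constraint, must stay positive
        g1 = 2.0 * x1 + r / t
        g2 = 2.0 * x2 - r / t
        if math.hypot(g1, g2) <= tol:
            break
        k = r / t ** 2                     # Hessian = 2*I + k * [[1, -1], [-1, 1]]
        h11, h12, h22 = 2.0 + k, -k, 2.0 + k
        det = h11 * h22 - h12 * h12
        p1 = -(h22 * g1 - h12 * g2) / det  # Newton direction p = -H^{-1} g
        p2 = -(h11 * g2 - h12 * g1) / det
        alpha = 1.0
        # Halve the step until the new point keeps at least 10% of the current slack.
        while (x2 + alpha * p2) - (x1 + alpha * p1) - 1.0 <= 0.1 * t:
            alpha *= 0.5
        x1, x2 = x1 + alpha * p1, x2 + alpha * p2
    return x1, x2


def barrier_method(x1=-1.0, x2=1.0, r=1.0, eps=1e-4, shrink=0.1, max_outer=30):
    """Outer loop: decrease r until the barrier term is small in absolute value."""
    for _ in range(max_outer):
        x1, x2 = minimize_barrier(r, x1, x2)
        barrier_term = -r * math.log(x2 - x1 - 1.0)
        if abs(barrier_term) <= eps:
            return (x1, x2), r
        r *= shrink
    raise RuntimeError("barrier term did not fall below eps")


(x1_opt, x2_opt), r_final = barrier_method()
print(x1_opt, x2_opt, r_final)
```

Every iterate stays strictly inside the feasible region, and the final point is close to $\left( -\dfrac{1}{2}, \dfrac{1}{2} \right)^{\rm T}$.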

4.3 Mixed Penalty Function Method

The mixed penalty function method combines a penalty function with a barrier function. Its objective is
$$F(\boldsymbol{x}, r) = f(\boldsymbol{x}) + rb(\boldsymbol{x}) + \dfrac{p(\boldsymbol{x})}{r}$$
where $b(\boldsymbol{x})$ is the barrier function and $p(\boldsymbol{x})$ is the penalty function.

5. Augmented Lagrangian Method

For the general constrained optimization problem
$$\begin{matrix} \min & f(\boldsymbol{x}) \\ \text{s.t.} & h_i(\boldsymbol{x}) = 0 & i = 1, 2, \cdots, l \\ & h_j(\boldsymbol{x}) \le 0 & j = l+1, l+2, \cdots, m \end{matrix}$$
define the augmented Lagrangian function
$$L_\sigma(\boldsymbol{x}, \boldsymbol{\lambda}) = f(\boldsymbol{x}) + \sum_{i=1}^{l}\lambda_i h_i(\boldsymbol{x}) + \dfrac{\sigma}{2}\sum_{i=1}^{l}h_i^2(\boldsymbol{x}) + \dfrac{1}{2\sigma} \sum_{j=l+1}^{m} \left\lbrace \left[ \max \lbrace 0, \lambda_j + \sigma h_j(\boldsymbol{x}) \rbrace \right]^2 - \lambda_j^2 \right\rbrace$$
In iteration $k$, solve $\nabla_{\boldsymbol{x}}L_\sigma(\boldsymbol{x}, \boldsymbol{\lambda}_k) = \boldsymbol{0}$ for $\boldsymbol{x}_k$, then update the Lagrange multipliers by
$$(\boldsymbol{\lambda}_{k+1})_i = \begin{cases} (\boldsymbol{\lambda}_k)_i + \sigma h_i(\boldsymbol{x}_k), & 1 \le i \le l \\ \max \lbrace 0, (\boldsymbol{\lambda}_k)_i + \sigma h_i(\boldsymbol{x}_k) \rbrace, & l < i \le m \end{cases}$$
When solving exercises by hand, one usually performs a single iteration symbolically, computes the limit of $\boldsymbol{\lambda}_k$, and substitutes that limit to obtain the optimal solution.

[Example 7] Use the augmented Lagrangian method to solve
$$\begin{matrix} \min & 3x_1^2 + x_2^2 \\ \text{s.t.} & x_1 + x_2 = 1 \end{matrix}$$
[Solution] Let
$$L_\sigma(x_1, x_2, \lambda) = 3x_1^2 + x_2^2 + \lambda(x_1 + x_2 - 1) + \dfrac{\sigma}{2}(x_1 + x_2 - 1)^2$$
Then
$$\nabla_{\boldsymbol{x}}L_\sigma(x_1, x_2, \lambda) = [6x_1 + \lambda + \sigma(x_1 + x_2 - 1), \; 2x_2 + \lambda + \sigma(x_1 + x_2 - 1)]^{\rm T}$$
Setting $\nabla_{\boldsymbol{x}}L_\sigma(x_1, x_2, \lambda_k) = \boldsymbol{0}$ gives
$$\boldsymbol{x}_k = \left[ \dfrac{\sigma - \lambda_k}{4\sigma + 6}, \dfrac{3\sigma - 3\lambda_k}{4\sigma + 6} \right]^{\rm T}$$
and hence
$$\lambda_{k+1} = \lambda_k + \sigma \left( \dfrac{\sigma - \lambda_k}{4\sigma + 6} + \dfrac{3\sigma - 3\lambda_k}{4\sigma + 6} - 1 \right) = \dfrac{3(\lambda_k - \sigma)}{2\sigma + 3}$$
When $\lambda_1 > -\dfrac{3}{2}$, induction shows $\lambda_k > -\dfrac{3}{2}$ for all $k$, so
$$\lambda_{k+1} - \lambda_k = \dfrac{3(\lambda_k - \sigma)}{2\sigma + 3} - \lambda_k = -\dfrac{\sigma}{2\sigma + 3}(3 + 2\lambda_k) < 0$$
The sequence $\lbrace \lambda_k \rbrace$ is therefore decreasing and bounded below, so it has a limit $\gamma$. (More generally, $\lambda_{k+1} + \dfrac{3}{2} = \dfrac{3}{2\sigma + 3}\left(\lambda_k + \dfrac{3}{2}\right)$, so the sequence converges for any $\lambda_1$ when $\sigma > 0$.) Taking limits on both sides of the recurrence gives
$$\gamma = \dfrac{3(\gamma - \sigma)}{2\sigma + 3}$$
whose solution is $\gamma = -\dfrac{3\sigma}{2\sigma} = -\dfrac{3}{2}$. The optimal solution of the original problem is therefore
$$\underset{k \to +\infty}{\lim} \boldsymbol{x}_k= \left[ \dfrac{\sigma - \gamma}{4\sigma + 6}, \dfrac{3\sigma - 3\gamma}{4\sigma + 6} \right]^{\rm T} = \left[ \dfrac{1}{4}, \dfrac{3}{4} \right]^{\rm T}$$
with optimal value $\dfrac{3}{4}$.
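The multiplier recurrence of Example 7 is easy to iterate numerically with a fixed $\sigma$. The sketch below is a minimal illustration; the function name `augmented_lagrangian`, the starting value $\lambda_1 = 0$, the choice $\sigma = 1$, and the stopping tolerance are all assumptions made for the example.

```python
def augmented_lagrangian(lam=0.0, sigma=1.0, tol=1e-12, max_iter=200):
    """Multiplier iteration for min 3*x1^2 + x2^2  s.t.  x1 + x2 = 1."""
    for _ in range(max_iter):
        # Stationarity of L_sigma in x is the 2x2 linear system
        #   (6 + sigma) * x1 + sigma * x2       = sigma - lam
        #   sigma * x1       + (2 + sigma) * x2 = sigma - lam
        a11, a12, a22 = 6.0 + sigma, sigma, 2.0 + sigma
        rhs = sigma - lam
        det = a11 * a22 - a12 * a12
        x1 = (a22 - a12) * rhs / det      # Cramer's rule with equal right-hand sides
        x2 = (a11 - a12) * rhs / det
        h = x1 + x2 - 1.0                 # constraint residual
        if abs(h) <= tol:
            return (x1, x2), lam
        lam += sigma * h                  # multiplier update
    raise RuntimeError("multiplier iteration did not converge")


(x1_opt, x2_opt), lam_opt = augmented_lagrangian()
print(x1_opt, x2_opt, lam_opt)
```

Unlike the penalty function method, $\sigma$ does not need to grow: with $\sigma = 1$ the error in $\lambda_k$ shrinks by the factor $\dfrac{3}{2\sigma + 3} = 0.6$ per iteration, and the iterates approach $\left[ \dfrac{1}{4}, \dfrac{3}{4} \right]^{\rm T}$ with $\lambda \to -\dfrac{3}{2}$.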

Reposted from blog.csdn.net/qq_56131580/article/details/129704563