Augmented Lagrangian Method (ALM)

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/itnerd/article/details/86012869


1. Equality constraints

Consider the problem:
$$\min_x \; f(x) \quad \text{s.t.} \quad c_i(x) = 0, \quad i=1,\cdots,m.$$
Define the augmented Lagrangian:
$$L_t(x,\lambda) = f(x) - \sum_i \lambda_i c_i(x) + \frac{t}{2}\sum_i \big(c_i(x)\big)^2$$
Each iteration of the algorithm consists of two steps:

  • Fix $\lambda$ and update $x$: $$x^+ = \mathop{\arg\min}_x L_t(x;\lambda),$$ which implies $$\nabla_x L_t(x^+;\lambda) = \nabla f(x^+) - \sum_i \big(\lambda_i - t c_i(x^+)\big)\nabla c_i(x^+) = 0.$$
  • Update $\lambda$: $$\lambda_i^+ = \lambda_i - t c_i(x^+).$$ With this choice the stationarity condition above reads $\nabla f(x^+) - \sum_i \lambda_i^+ \nabla c_i(x^+) = 0$, so $(x^+,\lambda^+)$ is a stationary point of the ordinary Lagrangian.
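The two steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy problem $\min x_1^2 + x_2^2$ s.t. $x_1 + x_2 - 1 = 0$, the function name `alm_equality`, the penalty parameter `t = 10`, and the use of fixed-step gradient descent for the inner minimization are all choices made here, not part of the original article.

```python
def alm_equality(t=10.0, outer_iters=30, inner_iters=500):
    """ALM sketch for: min x1^2 + x2^2  s.t.  c(x) = x1 + x2 - 1 = 0."""
    x = [0.0, 0.0]
    lam = 0.0
    # The Hessian of L_t is 2*I + t*[[1, 1], [1, 1]], whose largest
    # eigenvalue is 2 + 2*t, so this step size is safe for gradient descent.
    step = 1.0 / (2.0 + 2.0 * t)
    for _ in range(outer_iters):
        # x-update: approximately minimize L_t(x; lam) with lam fixed.
        for _ in range(inner_iters):
            c = x[0] + x[1] - 1.0
            # grad L_t = grad f - (lam - t*c) * grad c, with grad c = (1, 1)
            g = [2.0 * x[0] - (lam - t * c), 2.0 * x[1] - (lam - t * c)]
            x = [x[0] - step * g[0], x[1] - step * g[1]]
        # multiplier update: lam+ = lam - t * c(x+)
        lam = lam - t * (x[0] + x[1] - 1.0)
    return x, lam


if __name__ == "__main__":
    print(alm_equality())
```

For this toy problem the KKT conditions give $x = (0.5, 0.5)$ and $\lambda = 1$ (from $2x_i = \lambda$), which is what the iterates should approach.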

2. Inequality constraints

Consider the problem:
$$\min_x \; f(x) \quad \text{s.t.} \quad c_i(x) \geq 0, \quad i=1,\cdots,m.$$
Introducing slack variables $\nu_i$ gives the equivalent form:
$$\min_{x,\nu} \; f(x) \quad \text{s.t.} \quad c_i(x) - \nu_i = 0, \quad \nu_i \geq 0, \quad i=1,\cdots,m.$$
Define the augmented Lagrangian, which keeps the bound constraints on $\nu$:
$$L_t(x,\nu,\lambda) = f(x) - \sum_i \lambda_i \big(c_i(x)-\nu_i\big) + \frac{t}{2}\sum_i \big(c_i(x)-\nu_i\big)^2 \quad \text{s.t.} \quad \nu_i \geq 0, \quad i=1,\cdots,m.$$
Each iteration of the algorithm consists of two steps:

  • Fix $\lambda$ and update $x, \nu$: $$\begin{aligned}(x^+,\nu^+) &= \arg\min_{x,\nu} \; L_t(x,\nu;\lambda) \\ &= \arg\min_{x,\nu} \; f(x) + \sum_i \bigg\{ -\lambda_i \big(c_i(x)-\nu_i\big) + \frac{t}{2}\big(c_i(x)-\nu_i\big)^2 \bigg\} \\ &\quad \text{s.t.} \quad \nu_i \geq 0, \quad i=1,\cdots,m.\end{aligned} \tag{1}$$
  • Update $\lambda$: $$\lambda_i^+ = \lambda_i - t\big(c_i(x^+)-\nu_i^+\big)$$
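As a minimal sketch of these two steps, assume a hypothetical one-dimensional toy problem $\min (x+1)^2$ s.t. $x \geq 0$, so that $c(x) = x$. The joint minimization over $(x, \nu)$ in (1) is approximated here by alternating exact updates of $\nu$ and $x$; that inner scheme, the name `alm_with_slack`, and the value `t = 10` are assumptions made for illustration, not something the article prescribes.

```python
def alm_with_slack(t=10.0, outer_iters=30, inner_iters=200):
    """ALM sketch with an explicit slack variable for:
    min (x + 1)^2  s.t.  x >= 0   (rewritten as x - nu = 0, nu >= 0)."""
    x, nu, lam = 0.0, 0.0, 0.0
    for _ in range(outer_iters):
        # Inner problem (1): alternate exact minimization over nu and x.
        for _ in range(inner_iters):
            # With x fixed, the term is a parabola in nu; clip at nu >= 0.
            nu = max(x - lam / t, 0.0)
            # With nu fixed, solve 2*(x + 1) - lam + t*(x - nu) = 0 for x.
            x = (lam - 2.0 + t * nu) / (2.0 + t)
        # Multiplier update: lam+ = lam - t * (c(x+) - nu+)
        lam = lam - t * (x - nu)
    return x, nu, lam


if __name__ == "__main__":
    print(alm_with_slack())
```

For this toy problem the constraint is active at the solution, so the iterates should approach $x = 0$, $\nu = 0$ and $\lambda = 2$.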

In fact, $\nu$ can be eliminated from the algorithm. Completing the square in (1) and dropping the term $-(\lambda_i/t)^2$, which does not depend on $x$ or $\nu$, gives
$$\begin{aligned}(x^+,\nu^+) &= \arg\min_{x,\nu} \; f(x) + \sum_i \bigg\{ -\lambda_i \big(c_i(x)-\nu_i\big) + \frac{t}{2}\big(c_i(x)-\nu_i\big)^2 \bigg\} \\ &= \arg\min_{x,\nu} \; f(x) + \frac{t}{2}\sum_i \bigg\{ -\Big(\frac{\lambda_i}{t}\Big)^2 + \Big(c_i(x)-\nu_i - \frac{\lambda_i}{t}\Big)^2 \bigg\} \\ &= \arg\min_{x,\nu} \; f(x) + \frac{t}{2}\sum_i \Big(c_i(x)-\nu_i - \frac{\lambda_i}{t}\Big)^2 \\ &\quad \text{s.t.} \quad \nu_i \geq 0, \quad i=1,\cdots,m.\end{aligned} \tag{2}$$
The second term of (2) shows that, once $x^+$ is known, we must have
$$\nu_i^+ = \max\Big(c_i(x^+) - \frac{\lambda_i}{t},\, 0\Big),$$
where the max enforces the non-negativity constraint on $\nu$. Substituting this back into (1) gives
$$x^+ = \arg\min_x \; f(x) + \sum_i \psi\big(c_i(x),\lambda_i,t\big),$$
where
$$\psi\big(c_i(x),\lambda_i,t\big) = \begin{cases} -\lambda_i c_i(x) + \dfrac{t}{2} c_i(x)^2, & \text{if } c_i(x) - \lambda_i/t < 0, \\[2mm] -\dfrac{\lambda_i^2}{2t}, & \text{otherwise.} \end{cases}$$
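The elimination of $\nu$ can be checked numerically. The sketch below compares $\psi$ with a brute-force minimum of the summand in (1) over a grid of $\nu \geq 0$; the helper names (`slack_term`, `best_slack`, `psi`, `check`) and the grid parameters are hypothetical and introduced only for this check.

```python
def slack_term(c, nu, lam, t):
    """One summand of (1) for a given constraint value c and slack nu."""
    return -lam * (c - nu) + 0.5 * t * (c - nu) ** 2


def best_slack(c, lam, t):
    """Closed-form minimizer over nu >= 0: max(c - lam/t, 0)."""
    return max(c - lam / t, 0.0)


def psi(c, lam, t):
    """The summand after nu has been eliminated."""
    if c - lam / t < 0:
        return -lam * c + 0.5 * t * c * c
    return -lam * lam / (2.0 * t)


def check(c, lam, t, grid=2001, nu_max=10.0):
    """Return (grid minimum, value at closed-form nu, psi) for comparison."""
    brute = min(slack_term(c, nu_max * k / (grid - 1), lam, t)
                for k in range(grid))
    closed = slack_term(c, best_slack(c, lam, t), lam, t)
    return brute, closed, psi(c, lam, t)


if __name__ == "__main__":
    for c, lam, t in [(-1.0, 2.0, 10.0), (0.1, 2.0, 10.0), (3.0, 2.0, 10.0)]:
        print(c, lam, t, check(c, lam, t))
```

The value at the closed-form slack should agree with $\psi$ up to rounding, and no grid point should fall below it.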
Then update $\lambda$. Substituting $\nu_i^+$ into $\lambda_i^+ = \lambda_i - t\big(c_i(x^+)-\nu_i^+\big)$ gives
$$\lambda_i^+ = \max\big(\lambda_i - t c_i(x^+),\, 0\big)$$
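Putting the slack-free form together, here is a minimal sketch of the resulting iteration. It assumes a hypothetical toy problem $\min (x_1+1)^2 + (x_2-1)^2$ s.t. $x_1 \geq 0,\ x_2 \geq 0$ (so $c_i(x) = x_i$), uses fixed-step gradient descent for the $x$-update, and relies on the fact that differentiating either branch of $\psi$ gives $\partial\psi/\partial c_i = -\max(\lambda_i - t c_i, 0)$. The names `dpsi_dc` and `alm_inequality` and the value `t = 10` are illustrative choices.

```python
def dpsi_dc(c, lam, t):
    """Derivative of psi with respect to c: -max(lam - t*c, 0)."""
    return -max(lam - t * c, 0.0)


def alm_inequality(t=10.0, outer_iters=40, inner_iters=300):
    """ALM sketch for: min (x1 + 1)^2 + (x2 - 1)^2  s.t.  x1 >= 0, x2 >= 0."""
    x = [0.0, 0.0]
    lam = [0.0, 0.0]
    # f'' = 2 and psi'' <= t, so the inner gradient is (2 + t)-Lipschitz.
    step = 1.0 / (2.0 + t)
    for _ in range(outer_iters):
        # x-update: gradient descent on f(x) + sum_i psi(c_i(x), lam_i, t).
        for _ in range(inner_iters):
            grad_f = [2.0 * (x[0] + 1.0), 2.0 * (x[1] - 1.0)]
            # c_i(x) = x_i, so grad c_i is the i-th unit vector.
            grad = [grad_f[i] + dpsi_dc(x[i], lam[i], t) for i in range(2)]
            x = [x[i] - step * grad[i] for i in range(2)]
        # multiplier update: lam_i+ = max(lam_i - t * c_i(x+), 0)
        lam = [max(lam[i] - t * x[i], 0.0) for i in range(2)]
    return x, lam


if __name__ == "__main__":
    print(alm_inequality())
```

In this toy problem the first constraint is active and the second is not, so the iterates should approach $x = (0, 1)$ with $\lambda = (2, 0)$: a positive multiplier on the active constraint and a zero multiplier on the inactive one.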

