A-08 Lagrangian duality


Lagrange Duality

In constrained optimization, Lagrange duality converts the original (primal) problem into a dual problem, so that a solution of the original problem can be obtained by solving the dual problem.

1. The original problem

1.1 The constrained optimization problem

Suppose \(f(x), c_i(x), h_j(x)\) are continuously differentiable functions defined on \(R^n\). The original constrained optimization problem is
\[\begin{align} & \min_{x\in{R^n}} f(x) \\ & s.t.\quad c_i(x)\leq0, \quad {i=1,2,\cdots,k} \\ & \qquad\;\; h_j(x)=0, \quad {j=1,2,\cdots,l} \end{align}\]
Without the constraints, the problem would simply be
\[\min_{x\in{R^n}} f(x)\]
Since \(f(x), c_i(x), h_j(x)\) are assumed continuously differentiable, we could set the derivative of \(f(x)\) to 0 and solve directly for the optimum. With constraints present this no longer works, so we need a way to remove the constraints, and the Lagrangian function does exactly that.

1.2 Generalized Lagrangian

To solve the original problem, introduce the generalized Lagrange function
\[L(x,\alpha,\beta) = f(x) + \sum_{i=1}^k \alpha_i c_i(x) + \sum_{j=1}^l \beta_j h_j(x)\]
where \(x=(x^{(1)},x^{(2)},\cdots,x^{(n)})^T\in{R^n}\), and \(\alpha_i\geq0, \beta_j\) are the Lagrange multipliers.

If \(L(x,\alpha,\beta)\) is regarded as a function of \(\alpha_i, \beta_j\), we can take its maximum, i.e.,
\[\max_{\alpha,\beta} L(x,\alpha,\beta)\]
Since the maximizing Lagrange multipliers \(\alpha_i, \beta_j\) are then determined, the result can be seen as a function of \(x\):
\[\theta_P(x) = \max_{\alpha,\beta} L(x,\alpha,\beta)\]
where the subscript \(P\) denotes the original (primal) problem.

1.3 Considering the constraints

Consider a given \(x\).

  1. If \(x\) violates a constraint of the original problem, i.e., there exists some \(i\) such that \(c_i(x)>0\) or some \(j\) such that \(h_j(x)\neq0\), then
    \[\theta_P{(x)} = \max_{\alpha,\beta:\,\alpha_i\geq0} \Big[ f(x)+\sum_{i=1}^k \alpha_i c_i(x)+\sum_{j=1}^l \beta_j h_j(x) \Big] = +\infty\]
    because if some \(i\) gives \(c_i(x)>0\), we can let \(\alpha_i\rightarrow{+\infty}\); and if some \(j\) gives \(h_j(x)\neq0\), we can choose \(\beta_j\) so that \(\beta_j h_j(x)\rightarrow{+\infty}\).
  2. If \(x\) satisfies all the constraints of the original problem, then \(h_j(x)=0\) and \(\alpha_i c_i(x)\leq0\), so the maximum of \(L(x,\alpha,\beta)\) over \(\alpha,\beta\) is \(f(x)\), i.e., \(\theta_P{(x)}=f(x)\).

Combining the two cases above gives
\[\theta_P{(x)} = \begin{cases} f(x), & \text{$x$ satisfies the constraints} \\ +\infty, & \text{otherwise} \end{cases}\]
Therefore the minimization problem
\[\min_x \theta_P{(x)} = \min_x\,\max_{\alpha,\beta} L(x,\alpha,\beta)\]
is equivalent to the original problem, since minimizing \(\theta_P(x)\) forces \(x\) into the feasible set, where \(\theta_P(x)=f(x)\). Here \(\min_x\,\max_{\alpha,\beta} L(x,\alpha,\beta)\) is called the minimax problem of the generalized Lagrangian.
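The two cases can be checked numerically. Below is a minimal sketch on a hypothetical toy problem of my own (not from the text): minimize \(f(x)=x_1^2+x_2^2\) subject to \(c(x)=1-x_1-x_2\leq0\). The maximization over \(\alpha\geq0\) is approximated by a grid of ever larger multipliers: at a feasible point the maximum is \(f(x)\); at an infeasible point it grows without bound.

```python
# Hypothetical toy problem (my own example, not from the text):
# minimize f(x) = x1^2 + x2^2  subject to  c(x) = 1 - x1 - x2 <= 0.

def f(x):
    return x[0] ** 2 + x[1] ** 2

def c(x):
    return 1.0 - x[0] - x[1]           # inequality constraint c(x) <= 0

def L(x, alpha):
    return f(x) + alpha * c(x)         # Lagrangian (no equality terms here)

def theta_P(x, alphas):
    return max(L(x, a) for a in alphas)

alphas = [0.0, 1.0, 10.0, 1e6]         # ever larger multipliers
x_feasible = (0.6, 0.6)                # c(x) = -0.2 <= 0
x_infeasible = (0.2, 0.2)              # c(x) = 0.6 > 0

print(theta_P(x_feasible, alphas))     # equals f(x) = 0.72
print(theta_P(x_infeasible, alphas))   # huge: alpha * c(x) grows without bound
```

At the feasible point the best choice is \(\alpha=0\), recovering \(f(x)\); at the infeasible point the largest \(\alpha\) on the grid dominates, mirroring \(\theta_P(x)=+\infty\).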

Via the minimax problem of the generalized Lagrangian, the optimal value of the original problem can be defined as
\[p^* = \min_x \theta_P(x)\]
In this section, the Lagrange function was used to convert the constrained original problem into an unconstrained one.

2. The dual problem

Define a function of \(\alpha, \beta\):
\[\theta_D(\alpha,\beta) = \min_x L(x,\alpha,\beta)\]
where the right-hand side is the minimum over \(x\); once the minimizing \(x\) is determined, the minimum depends only on \(\alpha, \beta\).
Maximizing \(\theta_D(\alpha,\beta)\), i.e.,
\[\max_{\alpha,\beta} \theta_D(\alpha,\beta) = \max_{\alpha,\beta}\,\min_x L(x,\alpha,\beta)\]
yields the dual problem of the original problem, and \(\max_{\alpha,\beta}\,\min_x L(x,\alpha,\beta)\) is called the maximin problem of the generalized Lagrangian.

Compare the dual problem with the original problem
\[\min_x \theta_P{(x)} = \min_x\,\max_{\alpha,\beta} L(x,\alpha,\beta)\]
The original problem first fixes \(x\) in \(L(x,\alpha,\beta)\) and optimizes over the parameters \(\alpha,\beta\), then optimizes over \(x\); the dual problem first fixes \(\alpha,\beta\) and optimizes over \(x\), then optimizes over \(\alpha,\beta\).
The optimal value of the dual problem is
\[d^* = \max_{\alpha,\beta} \theta_D(\alpha,\beta)\]
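A minimal sketch of \(\theta_D\) and \(d^*\), again on a hypothetical toy problem of my own (not from the text): minimize \(f(x)=x_1^2+x_2^2\) subject to \(1-x_1-x_2\leq0\). Setting \(\nabla_x L=0\) gives \(x_1=x_2=\alpha/2\), so the inner minimization has a closed form; the outer maximization is done on a grid of \(\alpha\geq0\).

```python
# Hypothetical toy problem (my own example, not from the text):
# minimize f(x) = x1^2 + x2^2  subject to  1 - x1 - x2 <= 0.
# grad_x L = 0 gives x1 = x2 = alpha/2, so theta_D has a closed form.

def theta_D(alpha):
    x = (alpha / 2.0, alpha / 2.0)   # unconstrained minimizer of L over x
    f = x[0] ** 2 + x[1] ** 2
    return f + alpha * (1.0 - x[0] - x[1])

# maximize theta_D over a grid of alpha >= 0
alphas = [i / 1000.0 for i in range(3001)]
d_star = max(theta_D(a) for a in alphas)
print(d_star)  # 0.5, attained at alpha = 1
```

For this convex toy problem \(d^*=0.5\), which coincides with the primal optimum at \(x^*=(0.5,0.5)\).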

3. The relationship between the original problem and the dual problem

3.1 Theorem 1

If both the original problem and the dual problem have optimal values, then
\[d^* = \max_{\alpha,\beta}\,\min_x L(x,\alpha,\beta) \leq \min_x\,\max_{\alpha,\beta} L(x,\alpha,\beta) = p^*\]
This holds because for any \(\alpha,\beta,x\),
\[\theta_D(\alpha,\beta) = \min_x L(x,\alpha,\beta) \leq {L(x,\alpha,\beta)} \leq \max_{\alpha,\beta} L(x,\alpha,\beta) = \theta_P(x)\]
i.e.,
\[\theta_D(\alpha,\beta) \leq \theta_P(x)\]
Since both problems have optimal values, it follows that
\[\max_{\alpha,\beta} \theta_D(\alpha,\beta) \leq \min_x \theta_P(x)\]
i.e.,
\[d^* = \max_{\alpha,\beta}\,\min_x L(x,\alpha,\beta) \leq \min_x\,\max_{\alpha,\beta} L(x,\alpha,\beta) = p^*\]
This shows that the optimal value of the original problem is never less than that of the dual problem (weak duality). To solve the original problem through the dual problem, however, we need the two optimal values to be equal.
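The chain of inequalities \(\theta_D(\alpha,\beta)\leq L(x,\alpha,\beta)\leq\theta_P(x)\) can be spot-checked numerically. The sketch below uses a hypothetical toy problem of my own (not from the text): minimize \(f(x)=x_1^2+x_2^2\) subject to \(c(x)=1-x_1-x_2\leq0\), with \(\theta_D\) in closed form.

```python
# Hypothetical toy problem (my own example, not from the text):
# minimize f(x) = x1^2 + x2^2  subject to  c(x) = 1 - x1 - x2 <= 0.
import random

def f(x): return x[0] ** 2 + x[1] ** 2
def c(x): return 1.0 - x[0] - x[1]
def L(x, a): return f(x) + a * c(x)

def theta_D(a):
    # closed-form inner minimum: grad_x L = 0 gives x1 = x2 = a/2
    return a - a * a / 2.0

def theta_P(x):
    # max over alpha >= 0: f(x) if feasible, +infinity otherwise
    return f(x) if c(x) <= 0 else float("inf")

random.seed(0)
for _ in range(1000):
    x = (random.uniform(-2, 2), random.uniform(-2, 2))
    a = random.uniform(0, 5)
    assert theta_D(a) <= L(x, a) + 1e-9   # theta_D is a lower bound on L
    assert L(x, a) <= theta_P(x) + 1e-9   # theta_P is an upper bound on L
print("weak duality held on all samples")
```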

3.2 Corollary 1

From Theorem 1 we can deduce: suppose \(x^*\) and \(\alpha^*,\beta^*\) are feasible solutions of the original problem and the dual problem respectively; if \(d^*=p^*\), then \(x^*\) and \(\alpha^*,\beta^*\) are optimal solutions of the original problem and the dual problem.

When the optimal values of the original problem and the dual problem are equal, \(d^*=p^*\), and the dual problem is simpler to solve than the original one, the original problem can be solved through the dual problem.

3.3 Theorem 2

For the original problem and the dual problem, suppose \(f(x)\) and \(c_i(x)\) are convex functions and \(h_j(x)\) are affine functions (note: an affine function is a first-order polynomial, \(f(x)=Ax+b\), where \(A\) is a matrix and \(x,b\) are vectors); suppose further that the inequality constraints \(c_i(x)\) are strictly feasible, i.e., there exists an \(x\) such that \(c_i(x)<0\) for all \(i\). Then there exist \(x^*,\alpha^*,\beta^*\) such that \(x^*\) is a solution of the original problem, \(\alpha^*,\beta^*\) is a solution of the dual problem, and
\[p^* = d^* = L(x^*,\alpha^*,\beta^*)\]

3.4 Theorem 3 (KKT conditions)

For the original problem and the dual problem, suppose \(f(x)\) and \(c_i(x)\) are convex functions, \(h_j(x)\) are affine functions, and the inequality constraints \(c_i(x)\) are strictly feasible, i.e., there exists an \(x\) such that \(c_i(x)<0\) for all \(i\). Then \(x^*\) is a solution of the original problem and \(\alpha^*,\beta^*\) a solution of the dual problem if and only if \(x^*,\alpha^*,\beta^*\) satisfy the following Karush-Kuhn-Tucker (KKT) conditions:
\[\begin{align} & \nabla_x L(x^*,\alpha^*,\beta^*) = 0 \\ & \nabla_\alpha{L(x^*,\alpha^*,\beta^*)} = 0 \\ & \nabla_\beta{L(x^*,\alpha^*,\beta^*)} = 0 \\ & \alpha_i^* c_i(x^*) = 0, \quad {i=1,2,\cdots,k} \\ & c_i(x^*) \leq 0, \quad {i=1,2,\cdots,k} \\ & \alpha_i^* \geq 0, \quad {i=1,2,\cdots,k} \\ & h_j(x^*) = 0, \quad {j=1,2,\cdots,l} \end{align}\]
where \(\alpha_i^* c_i(x^*)=0,\; i=1,2,\cdots,k\) is the KKT dual complementarity condition; it implies that if \(\alpha_i^*>0\), then \(c_i(x^*)=0\).
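The KKT conditions can be verified directly at a known optimum. The sketch below uses a hypothetical toy problem of my own (not from the text): minimize \(f(x)=x_1^2+x_2^2\) subject to \(c(x)=1-x_1-x_2\leq0\), with analytic optimum \(x^*=(0.5,0.5)\) and \(\alpha^*=1\); there are no equality constraints, so the \(\beta\) and \(h_j\) conditions are vacuous here.

```python
# Hypothetical toy problem (my own example, not from the text):
# minimize f(x) = x1^2 + x2^2  subject to  c(x) = 1 - x1 - x2 <= 0.
# Analytic optimum: x* = (0.5, 0.5), alpha* = 1.

x_star = (0.5, 0.5)
alpha_star = 1.0

# stationarity: grad_x L = (2 x1 - alpha, 2 x2 - alpha) = 0
grad = (2 * x_star[0] - alpha_star, 2 * x_star[1] - alpha_star)
assert grad == (0.0, 0.0)

# primal feasibility: c(x*) <= 0
c_val = 1.0 - x_star[0] - x_star[1]
assert c_val <= 0

# dual feasibility: alpha* >= 0
assert alpha_star >= 0

# complementary slackness: alpha* * c(x*) = 0
assert alpha_star * c_val == 0.0

print("all KKT conditions satisfied")
```

Note that here \(\alpha^*>0\), and accordingly the constraint is active, \(c(x^*)=0\), exactly as the dual complementarity condition predicts.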

Origin: www.cnblogs.com/nickchen121/p/11686756.html