Optimization Theory and Technology (1)

Course content

  1. Preliminaries
  2. Linear Programming
  3. One-dimensional search method
  4. Unconstrained optimization method
  5. Constrained optimization method
  6. Applied optimization projects

Preliminaries

  1. Optimization problems
  2. Taylor's formula for multivariate functions
  3. Extremum problems for multivariate functions
  4. Convex sets, convex functions, and convex optimization
  5. Algorithm-related concepts
  6. Algorithm overview

Optimization problem

Mathematical notation

\[\min f(x)\\ \text{s.t.}\quad c(x)\ge 0\]

  • \(x = (x_1, x_2, \dots, x_n)\) is a vector containing multiple variables: the decision variables
  • \(c(x)\) denotes the equality and inequality constraints on the variables: the constraints
    • Feasible region: the region of the space enclosed by the constraints
    • Feasible solution: each point of the feasible region is a feasible point of the original problem
  • \(f(x)\): the objective function
    • Optimal solution: a feasible solution at which the objective function attains its maximum or minimum (a small numerical sketch follows this list)
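
To make the notation concrete, here is a minimal sketch of solving such a problem numerically with SciPy; the quadratic objective, linear constraint, and starting point are purely illustrative assumptions:

```python
# Minimal sketch: minimize f(x) subject to c(x) >= 0 (illustrative problem).
import numpy as np
from scipy.optimize import minimize

def f(x):
    # objective: f(x1, x2) = (x1 - 1)^2 + (x2 - 2)^2
    return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2

def c(x):
    # inequality constraint in the form c(x) >= 0: x1 + x2 - 1 >= 0
    return x[0] + x[1] - 1.0

x0 = np.array([0.0, 0.0])   # starting guess for the decision variables
res = minimize(f, x0, constraints=[{"type": "ineq", "fun": c}])
print(res.x, res.fun)       # optimal decision variables and objective value
```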

Classification

By constraints

  • Unconstrained
  • Constrained
    • Equality constraints
    • Inequality constraints

By the objective function

  • Linear Programming
  • Nonlinear Programming

By the type of variables

  • Integer Programming
  • Non-integer programming

By the number of objective functions

  • Single-objective optimization
  • Multi-objective optimization

Taylor's formula for multivariate functions

Gradient of a multivariate function

Partial derivative: the change of a multivariate function after reducing its dimension; for example, fix \(y\) in a bivariate function and let only \(x\) vary, and treat this as the change of a univariate function of \(x\):

\[f_x(x,y)=\lim_{\Delta x \to 0}\frac{f(x+\Delta x,y)-f(x,y)}{\Delta x}\]

This is denoted \(\frac{\partial f(x,y)}{\partial x}\).

Gradient: among all directions of change of a multivariate function at a point \(A\), the direction in which the function changes fastest; when each variable moves in the direction given by its partial derivative, the overall change of the function is the greatest (largest in absolute value).

\[\operatorname{grad} A=(f_x(A),f_y(A),f_z(A))\]
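
As a quick illustration, the gradient can be approximated numerically by central differences; the test function and step size below are just assumptions for the sketch:

```python
import numpy as np

def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of grad f at the point x."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        grad[i] = (f(x + e) - f(x - e)) / (2 * h)
    return grad

# Example: f(x, y, z) = x^2 + 3y + z^3, so grad f = (2x, 3, 3z^2)
f = lambda v: v[0] ** 2 + 3 * v[1] + v[2] ** 3
print(numerical_gradient(f, [1.0, 2.0, 3.0]))   # approximately [2., 3., 27.]
```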

Extrema of multivariate functions and the Hessian matrix

Reference: Hessian matrix and extrema of multivariate functions

Extremum problem for a univariate function: for \(f(x)=x^2\), take the first derivative \(f'(x)=2x\); by Fermat's theorem, the first derivative must equal 0 at an extreme point.

  • Fermat's theorem is only a necessary condition: at an extremum the first derivative equals 0, but a first derivative equal to 0 does not imply an extremum
  • For a quadratic function, setting the first derivative to 0 does determine the extremum, but for \(f(x)=x^3\) checking only the first derivative is not enough
  • For \(f(x)=x^3\), continue to the second derivative: if \(f''<0\), the function attains a local maximum at the point; if \(f''>0\), a local minimum; if \(f''=0\), the result is inconclusive and the extremum must be determined by other means (a small symbolic check follows this list)
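
The second-derivative test above can be reproduced symbolically; a small sketch with SymPy (assuming it is available), using \(f(x)=x^3\) as the example:

```python
import sympy as sp

x = sp.symbols('x')
f = x ** 3

f1 = sp.diff(f, x)            # first derivative: 3*x**2
critical = sp.solve(f1, x)    # stationary points: [0]
f2 = sp.diff(f, x, 2)         # second derivative: 6*x

for p in critical:
    curvature = f2.subs(x, p)
    if curvature > 0:
        print(p, "local minimum")
    elif curvature < 0:
        print(p, "local maximum")
    else:
        print(p, "inconclusive: higher-order test needed")  # x = 0 for x**3
```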

Extremum problem for a multivariate function: for \(f = f(x,y,z)\), take the first-order partial derivative with respect to each variable; the candidate extrema satisfy

\[\frac{\partial f}{\partial x}=0\\\frac{\partial f}{\partial y}=0\\\frac{\partial f}{\partial z}=0\]

Next, continue to the second-order derivatives; including the mixed partial derivatives there are nine in total, which can be arranged as a matrix:

\[H=\begin{matrix}\frac{\partial ^2f}{\partial x \partial x} & \frac{\partial ^2 f}{\partial x \partial y } & \frac{\partial ^2 f}{\partial x \partial z } \\ \frac{\partial ^2f}{\partial y \partial x} & \frac{\partial ^2 f}{\partial y \partial y } & \frac{\partial ^2 f}{\partial y \partial z } \\ \frac{\partial ^2f}{\partial z \partial x} & \frac{\partial ^2 f}{\partial z \partial y } & \frac{\partial ^2 f}{\partial z \partial z}\end{matrix}\]

The matrix \(H\) is the third-order Hessian matrix. Extending to the general case, for a multivariate function \(f(x_1, x_2, \dots, x_n)\) that is twice continuously differentiable on its domain, the Hessian matrix \(H\) is defined as

\[H=\begin{matrix}\frac{\partial ^2f}{\partial x_1 \partial x_1} & \frac{\partial ^2 f}{\partial x_1 \partial x_2 } & \dots &\frac{\partial ^2 f}{\partial x_1 \partial x_n } \\ \frac{\partial ^2f}{\partial x_2 \partial x_1} & \frac{\partial ^2 f}{\partial x_2 \partial x_2 } & \dots & \frac{\partial ^2 f}{\partial x_2 \partial x_n } \\ \vdots & \vdots & \ddots & \vdots \\\frac{\partial ^2f}{\partial x_n \partial x_1} & \frac{\partial ^2 f}{\partial x_n \partial x_2 } & \dots & \frac{\partial ^2 f}{\partial x_n \partial x_n}\end{matrix}\]

When the second derivative of a univariate function equals 0, it cannot be determined whether the point is an extremum of the function. Similarly, when the determinant of the Hessian matrix equals 0, nothing can be concluded about the extremum of the multivariate function; the point may even be a saddle point, which is neither a maximum nor a minimum.

Based on the Hessian matrix, extrema of a multivariate function can be determined as follows:

  1. If the Hessian matrix is positive definite, the critical point is a local minimum
  2. If the Hessian matrix is negative definite, the critical point is a local maximum
  3. If the Hessian matrix is indefinite, the critical point is not an extremum

How to determine whether a matrix is positive definite:

  1. Leading principal minors: a real symmetric matrix is positive definite if and only if all of its leading principal minors are greater than zero
  2. Eigenvalues: a matrix is positive definite if and only if all of its eigenvalues are greater than zero; it is negative definite if and only if all of its eigenvalues are less than zero; otherwise it is indefinite (a small numerical check follows this list)
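
Criterion 2 is straightforward to check numerically; a small sketch with NumPy, where the example matrices are assumptions chosen for illustration:

```python
import numpy as np

def classify_symmetric(H, tol=1e-10):
    """Classify a real symmetric matrix by the signs of its eigenvalues."""
    eigvals = np.linalg.eigvalsh(H)      # eigenvalues of a symmetric matrix
    if np.all(eigvals > tol):
        return "positive definite"
    if np.all(eigvals < -tol):
        return "negative definite"
    return "indefinite or semi-definite"

print(classify_symmetric(np.array([[2.0, 0.0], [0.0, 3.0]])))    # positive definite
print(classify_symmetric(np.array([[1.0, 0.0], [0.0, -1.0]])))   # indefinite
```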

Taylor expansion

Taylor's formula for a univariate function: suppose the univariate function \(f(x)\) has derivatives up to order \(n+1\) on an open interval \((a,b)\) containing the point \(x_0\); then for \(x \in (a,b)\), the \(n\)-th order Taylor formula of \(f(x)\) is:

\[f(x)=f(x_0)+f'(x_0)(x-x_0)+\frac{f''(x_0)}{2!}(x-x_0)^2+...+\frac{f^{(n)}(x_0)}{n!}(x-x_0)^n+R_n(x)\]

Here \(R_n(x)\) in Lagrange remainder form is

\[R_n(x)=\frac{f^{(n+1)}(\xi)}{(n+1)!}(x-x_0)^{n+1}\quad \xi \in (x,x_0)\]

and \(R_n(x)\) in Peano remainder form is

\[R_n(x)=o[(x-x_0)^n]\]
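
As a quick numerical illustration of the univariate formula, the truncated Taylor polynomial of \(e^x\) around \(x_0=0\) can be compared with the exact value; the expansion point and order below are arbitrary choices:

```python
import math

def taylor_exp(x, x0=0.0, n=4):
    """n-th order Taylor polynomial of e^x around x0 (every derivative of e^x is e^x0)."""
    return sum(math.exp(x0) * (x - x0) ** k / math.factorial(k) for k in range(n + 1))

x = 0.5
print(taylor_exp(x), math.exp(x))   # the remainder R_n shrinks as n grows
```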

Taylor's formula for a bivariate function: suppose the bivariate function \(z=f(x,y)\) has continuous partial derivatives up to order \(n+1\) in some neighborhood of the point \((x_0,y_0)\); then

\[f(x,y)=f(x_0,y_0)+[(x-x_0)\frac{\partial}{\partial x}+(y-y_0)\frac{\partial}{\partial y}]f(x_0,y_0)+\\ \frac{1}{2!}[(x-x_0)\frac{\partial}{\partial x}+(y-y_0)\frac{\partial}{\partial y}]^2f(x_0,y_0)+...\\ +\frac{1}{n!}[(x-x_0)\frac{\partial}{\partial x}+(y-y_0)\frac{\partial}{\partial y}]^nf(x_0,y_0)\\+R_n(x,y)\]

Here the notation

\[[(x-x_0)\frac{\partial}{\partial x}+(y-y_0)\frac{\partial}{\partial y}]f(x_0,y_0)\]

denotes

\[(x-x_0)f_x(x_0,y_0)+(y-y_0)f_y(x_0,y_0)\]

and the notation

\[[(x-x_0)\frac{\partial}{\partial x}+(y-y_0)\frac{\partial}{\partial y}]^2f(x_0,y_0)\]

denotes

\[(x-x_0)^2f_{xx}(x_0,y_0)+2(x-x_0)(y-y_0)f_{xy}(x_0,y_0)+(y-y_0)^2f_{yy}(x_0,y_0)\]

In general, the notation

\[[(x-x_0)\frac{\partial}{\partial x}+(y-y_0)\frac{\partial}{\partial y}]^mf(x_0,y_0)\]

denotes

\[\sum_{p=0}^m C_m^p(x-x_0)^p(y-y_0)^{(m-p)} \frac{\partial^m f}{\partial x^p\partial y^{(m-p)}}|_{(x_0,y_0)}\]

The above formula can be rewritten in a compact general form:

\[f(x,y)=\sum_{k=0}^n\frac{1}{k!}[(x-x_0)\frac{\partial}{\partial x}+(y-y_0)\frac{\partial}{\partial y}]^kf(x_0,y_0)\\ +R_n(x,y)\]

The Lagrange remainder is:

\[R_n(x,y)=\frac{1}{(n+1)!}[(x-x_0)\frac{\partial}{\partial x}+(y-y_0)\frac{\partial}{\partial y}]^{(n+1)}f(x_0+\theta(x-x_0),y_0+\theta(y-y_0))\\\theta \in (0,1)\]

The Peano remainder is:

\[R_n(x,y)=o(\rho^n),\quad \rho=\sqrt{(x-x_0)^2+(y-y_0)^2}\]

Relationship between the Taylor expansion and the Hessian matrix: for a multi-dimensional vector \(\mathbf{X}\), if the multivariate function \(f(\mathbf{X})\) has continuous second-order partial derivatives at the point \(\mathbf{X}_0\), then the second-order Taylor expansion of \(f(\mathbf{X})\) at \(\mathbf{X}_0\) can be written as

\[f(\mathbf{X})=f(\mathbf{X}_0)+(\mathbf{X}-\mathbf{X}_0)^T\nabla f(\mathbf{X}_0)+\frac{1}{2!}(\mathbf{X}-\mathbf{X}_0)^T\nabla^2 f(\mathbf{X}_0)(\mathbf{X}-\mathbf{X}_0)+o(\|\mathbf{X}-\mathbf{X}_0\|^2)\]

Since \(\nabla^2 f(\mathbf{X}_0)\) is exactly the Hessian matrix, this can be written as:

\[f(\mathbf{X})=f(\mathbf{X}_0)+(\mathbf{X}-\mathbf{X}_0)^T\nabla f(\mathbf{X}_0)+\frac{1}{2}(\mathbf{X}-\mathbf{X}_0)^T\mathbf{H}(\mathbf{X}_0)(\mathbf{X}-\mathbf{X}_0)+o(\|\mathbf{X}-\mathbf{X}_0\|^2)\]
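
A minimal sketch of this quadratic model, assuming NumPy and using the illustrative function \(f(x,y)=x^2+xy+y^2\), whose gradient and Hessian are coded by hand; since \(f\) is itself quadratic, the remainder vanishes and the model is exact:

```python
import numpy as np

def f(v):
    x, y = v
    return x ** 2 + x * y + y ** 2

def grad(v):
    x, y = v
    return np.array([2 * x + y, x + 2 * y])

H = np.array([[2.0, 1.0], [1.0, 2.0]])    # Hessian (constant for a quadratic f)

X0 = np.array([1.0, 1.0])
X = np.array([1.3, 0.8])
d = X - X0
quadratic_model = f(X0) + d @ grad(X0) + 0.5 * d @ H @ d
print(quadratic_model, f(X))   # equal here, because f is quadratic and the remainder is zero
```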

  1. Necessary condition for a multivariate function to attain an extremum: if \(u = f(x_1, x_2, \dots, x_n)\) attains an extremum at the point \(M\), then

    \[\nabla f(M)=\left \{\frac{\partial f}{\partial x_1},\frac{\partial f}{\partial x_2},\cdots, \frac{\partial f}{\partial x_n}\right\}_M=0\]

  2. Sufficient condition for a multivariate function to attain an extremum: the Hessian matrix formed by the second-order partial derivatives is positive definite (local minimum) or negative definite (local maximum); see the sketch after this list
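
Putting the necessary and sufficient conditions together, here is a small sketch that classifies the critical point of the saddle-shaped example \(f(x,y)=x^2-y^2\); the function and the candidate point are chosen only for illustration:

```python
import numpy as np

# f(x, y) = x**2 - y**2: grad f = (2x, -2y), Hessian = [[2, 0], [0, -2]]
M = np.array([0.0, 0.0])                     # candidate point
grad_at_M = np.array([2 * M[0], -2 * M[1]])  # necessary condition: must be the zero vector
H = np.array([[2.0, 0.0], [0.0, -2.0]])

if np.allclose(grad_at_M, 0.0):
    eigvals = np.linalg.eigvalsh(H)
    if np.all(eigvals > 0):
        print("local minimum")
    elif np.all(eigvals < 0):
        print("local maximum")
    else:
        print("saddle point / not an extremum")  # printed here: eigenvalues are 2 and -2
else:
    print("not a critical point")
```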

Convex sets, convex functions, and convex optimization problems

Convex sets

If the line segment between any two points of a set \(C\) lies entirely inside \(C\), then \(C\) is called a convex set:

\[\lambda x +(1-\lambda)y \in C\quad \text{for all} \quad \lambda \in (0,1),\quad x,y\in C\]

Convex function

A convex function on a convex set \(C\) is defined by:

\[f(\lambda x_1+(1-\lambda)x_2)\le \lambda f(x_1)+(1-\lambda)f(x_2) \\x_1,x_2 \in C;\quad \lambda \in (0,1)\]
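
The defining inequality can be spot-checked numerically on random samples; a minimal sketch, assuming the test functions \(x^2\) (convex) and \(\sin x\) (not convex):

```python
import numpy as np

def is_convex_on_samples(f, num_trials=10000, low=-10.0, high=10.0, tol=1e-12):
    """Randomly test f(lam*x1 + (1-lam)*x2) <= lam*f(x1) + (1-lam)*f(x2)."""
    rng = np.random.default_rng(0)
    for _ in range(num_trials):
        x1, x2 = rng.uniform(low, high, size=2)
        lam = rng.uniform(0.0, 1.0)
        lhs = f(lam * x1 + (1 - lam) * x2)
        rhs = lam * f(x1) + (1 - lam) * f(x2)
        if lhs > rhs + tol:
            return False        # found a counterexample on these samples
    return True                 # no violation found (evidence, not a proof)

print(is_convex_on_samples(lambda x: x ** 2))        # True
print(is_convex_on_samples(lambda x: np.sin(x)))     # False
```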

Convex Optimization

The main work in machine learning is solving optimization problems: initialize the weight parameters, then use an optimization method to update the weights, and stop iterating once the accuracy no longer improves (a minimal gradient-descent sketch along these lines appears at the end of this section). Among optimization problems, the most widely used is the convex optimization problem:

  • If the feasible region is a convex set
  • And the objective function is a convex function

then the optimization problem is a convex optimization problem.
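
As mentioned above, a typical machine-learning workflow initializes the weights and then iterates an optimization method; a minimal gradient-descent sketch on the convex objective \(f(w)=\|w-w^*\|^2\), where the target weights, learning rate, and stopping rule are all illustrative assumptions:

```python
import numpy as np

w_star = np.array([3.0, -1.0])                  # "true" weights (illustrative)
f = lambda w: np.sum((w - w_star) ** 2)         # convex objective
grad = lambda w: 2 * (w - w_star)               # its gradient

w = np.zeros(2)                                 # initialize the weight parameters
lr = 0.1                                        # learning rate
for step in range(1000):
    g = grad(w)
    if np.linalg.norm(g) < 1e-8:                # stop when there is no more improvement
        break
    w = w - lr * g                              # gradient-descent update

print(step, w)                                  # converges to w_star for this convex problem
```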
