Linear Programming and the Simplex Method: Principles

Introduction

Many operations research textbooks start with linear programming. In my own work applying optimization algorithms, I also build technical solutions on top of linear programming. It is fair to say that without understanding linear programming, it is hard to become a good operations research optimization engineer.

But when I studied the subject systematically, I took a long detour through other topics before finally arriving here.

The main reason is that the theory behind linear programming is genuinely difficult. I had read it many times before, always with the frustrating feeling of almost, but not quite, understanding it. After learning from that pain, I decided to start from the simplest unconstrained problems, then gradually move on to constrained problems, and finally to the linear programming problem at hand.

For unconstrained optimization problems, I subdivided the material into one-dimensional and multi-dimensional problems, learning the golden section method, the tangent method, and the advance-and-retreat method for one-dimensional problems, and the coordinate rotation method, the steepest descent method, and quasi-Newton methods for multi-dimensional problems. For constrained optimization problems, I studied the Lagrange multiplier method and the penalty function method.

The linear programming problem is essentially a special kind of constrained optimization problem. This kind of problem is very common in practical scenarios, and at the same time its special structure opens the door to new ideas for solving it efficiently.

As mentioned earlier, linear programming has been hard for me, and it is difficult to explain clearly in a single article. So I plan to split it into two parts: this one focuses on the principles of the algorithm, and the next one on practical applications.

Fair warning: the content that follows is fairly dry, because a lot of proofs and derivations are mixed in. I am willing to spend the time on this, hoping not only to know what is true but also why it is true. If you do not want to follow the detailed steps, focus on the overall logical framework: where a linear programming problem is special compared with a general constrained problem –> what properties this specialness gives the problem –> how the algorithm designed around these properties (the simplex method) achieves an efficient solution.

Standard form of linear programming

The standard matrix form of linear programming is
$$
\begin{aligned}
\min \quad & f(\pmb x)=\pmb c^T\pmb x \\
\text{s.t.} \quad & \pmb A\pmb x=\pmb b \\
& \pmb x \geq \pmb 0
\end{aligned}
$$

It can also be written in component form:

$$
\begin{aligned}
\min \quad & f(\pmb x)=\sum_{j=1}^n c_j x_j \\
\text{s.t.} \quad & \sum_{j=1}^n a_{ij}x_j=b_i, \quad i=1,2,\dots,m \\
& x_j \geq 0, \quad j=1,2,\dots,n
\end{aligned}
$$

It can be seen from the definition that, compared with a general constrained optimization problem, a linear programming problem adds restrictions on the objective function, the constraints, and the variable ranges: (1) the objective function is a linear sum; (2) the constraints are linear equalities only; (3) every optimization variable is greater than or equal to 0.
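To make the standard form concrete, here is a minimal sketch that states a tiny made-up LP in exactly this form and solves it with SciPy's `linprog` (all the numbers are invented for illustration):

```python
import numpy as np
from scipy.optimize import linprog

# A tiny made-up LP in standard form:
#   min  f(x) = 3*x1 + x2 + 2*x3
#   s.t. x1 +   x2 + x3 = 4
#        x1 + 2*x2      = 3
#        x1, x2, x3 >= 0
c = np.array([3.0, 1.0, 2.0])
A_eq = np.array([[1.0, 1.0, 1.0],
                 [1.0, 2.0, 0.0]])
b_eq = np.array([4.0, 3.0])

# bounds=(0, None) encodes the x >= 0 restriction of the standard form
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
print(res.x, res.fun)  # -> roughly [0, 1.5, 2.5] and 6.5
```

The same made-up numbers are reused in the sketches further below, so the results can be cross-checked.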

Problem characteristics

What are the characteristics of a linear programming problem once the above restrictions are added?

Let me state the conclusion first: the feasible region of the problem is a convex set –> the optimal solution lies at a vertex of the convex set –> the vertices correspond to the basic feasible solutions.

Next, each claim is explained (and proved) in turn.

(1) The feasible region is a convex set.

Since convex sets have been mentioned, it is necessary to define them first: let $S \subseteq R^n$ be a point set in $n$-dimensional Euclidean space. If for any $\pmb x_1\in S$, $\pmb x_2 \in S$, $\pmb x_1\neq\pmb x_2$, and any $\lambda\in[0,1]$, we always have

$$\lambda \pmb x_1+(1-\lambda)\pmb x_2 \in S$$

then $S$ is called a convex set.

(Figure: examples of convex sets and non-convex sets.)

For a linear programming problem, take any two feasible points $\pmb x_1$ and $\pmb x_2$. By the definition of the standard-form constraints,

$$\pmb A\pmb x_1=\pmb b, \quad \pmb x_1 \geq \pmb 0 \\ \pmb A\pmb x_2=\pmb b, \quad \pmb x_2 \geq \pmb 0$$

Then, for any $\lambda\in[0,1]$, let the new point be $\pmb x=\lambda \pmb x_1+(1-\lambda)\pmb x_2$. We can derive

$$\pmb A[\lambda \pmb x_1+(1-\lambda)\pmb x_2]=\lambda \pmb A\pmb x_1+(1-\lambda)\pmb A\pmb x_2=\lambda \pmb b+(1-\lambda)\pmb b=\pmb b$$

which means the equality constraints still hold.

Since $\pmb x_1 \geq \pmb 0$, $\pmb x_2 \geq \pmb 0$, and both $\lambda \geq 0$ and $1-\lambda \geq 0$, the following obviously also holds:

$$\lambda \pmb x_1+(1-\lambda)\pmb x_2 \geq \pmb 0$$

that is, $\pmb x \geq \pmb 0$ as well.

So $\pmb x$ is also in the feasible region. By the definition of a convex set, the feasible region of a linear programming problem is a convex set.
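As a quick numerical illustration of this proof, here is a sketch reusing the made-up $\pmb A$ and $\pmb b$ from above, with two feasible points found by hand:

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 0.0]])
b = np.array([4.0, 3.0])

# Two feasible points of A @ x = b, x >= 0, found by hand for this data.
x1 = np.array([3.0, 0.0, 1.0])
x2 = np.array([1.0, 1.0, 2.0])

# Every convex combination satisfies both defining conditions of feasibility.
for lam in np.linspace(0.0, 1.0, 5):
    x = lam * x1 + (1 - lam) * x2
    assert np.allclose(A @ x, b) and np.all(x >= 0)
print("all convex combinations stayed feasible")
```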

What good does a convex feasible region do? See the next conclusion.

(2) The optimal value must be reached at a certain vertex of the convex set.

The word "vertex" is easy to understand in the physical world, but for the derivations that follow, the concept still needs a precise mathematical description: for $\pmb x$ in the feasible region $S$, if there exist no $\lambda \in (0,1)$ and $\pmb x_1\in S$, $\pmb x_2 \in S$, $\pmb x_1\neq\pmb x_2$ such that

$$\pmb x=\lambda \pmb x_1+(1-\lambda)\pmb x_2$$

then this $\pmb x$ is a vertex of the convex set.

To prove the above conclusion, the method of proof by contradiction is used here.

Suppose the vertices of the convex set are $\pmb x_1,\pmb x_2,\dots,\pmb x_k$ and the optimal solution is $\pmb x^\ast$. Because $\pmb x^\ast$ lies within the convex feasible region, it can be represented in terms of the vertices as

$$\pmb x^\ast=\sum_{i=1}^k\lambda_i\pmb x_i$$

where $\lambda_i\geq 0$ and $\sum_{i=1}^k\lambda_i=1$. It is easy to picture that any feasible point can be written as such a convex combination of vertices (at least when the feasible region is bounded), but proving it rigorously is harder, so I will not go into depth here. Interested readers can refer to the representation theorem for polyhedral sets.

Multiply both sides of the formula above by $\pmb c^T$:

$$\pmb c^T\pmb x^\ast=\sum_{i=1}^k\lambda_i\pmb c^T\pmb x_i$$

Let $\pmb c^T \pmb x_s= \min\limits_{1\leq i\leq k} \pmb c^T \pmb x_i$. The formula can then be bounded as follows:

$$\pmb c^T\pmb x^\ast=\sum_{i=1}^k\lambda_i\pmb c^T\pmb x_i \geq \sum_{i=1}^k\lambda_i\pmb c^T\pmb x_s=\pmb c^T\pmb x_s\sum_{i=1}^k\lambda_i=\pmb c^T\pmb x_s$$

But if we assume the optimum is attained at no vertex, then $\pmb x^\ast$ must be strictly better than the best vertex:

$$\pmb c^T\pmb x^\ast < \pmb c^T\pmb x_s$$

The two formulas contradict each other, so the assumption cannot hold; that is, the optimal value must be attained at some vertex.

The biggest advantage of this conclusion is that it shrinks the search space from the entire, very large feasible region down to its vertices, and in most cases the number of vertices is finite, which greatly reduces the complexity of solving the problem.

In fact, I still have a small doubt here. Looking at the proof, it seems the objective function does not actually need to be linear. Does that mean the optimal solution is still attained at a vertex even when the objective is a nonlinear expression? I have not resolved this doubt yet; if any expert can enlighten me, please do.

(3) There is a one-to-one correspondence between the vertices of the feasible region and the basic feasible solutions.

We already know that the optimal solution is at a vertex, but what is the mathematical expression of a vertex? The conclusion above tells us: a vertex is a basic feasible solution.

Next, let's look at the definition of a basic feasible solution.

Let $\pmb A$ be the $m\times n$ constraint matrix. From it, take an $m\times m$ non-singular square matrix $\pmb B$, and let the remaining columns form the sub-matrix $\pmb N$. The corresponding components of $\pmb x$ are called the basic variables $\pmb x_B$ and the non-basic variables $\pmb x_N$. The equality constraints can then be expressed as

$$\pmb B\pmb x_B+\pmb N\pmb x_N=\pmb b$$

Setting $\pmb x_N=\pmb 0$, we can solve for $\pmb x_B=\pmb B^{-1}\pmb b$. Combining $\pmb x_B$ and $\pmb x_N$ gives

$$\pmb x=(\pmb B^{-1}\pmb b, \pmb 0)^T$$

If $\pmb B^{-1}\pmb b\geq \pmb 0$, then $\pmb x$ is called a basic feasible solution.

The dimension of $\pmb B$ is $m\times m$, and its columns are chosen from the $n$ columns of $\pmb A$, so the number of possible $\pmb B$ is at most $C_n^m$; that is, the upper limit on the number of basic feasible solutions is $C_n^m$.
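In code, extracting a basic solution from a chosen basis is just one linear solve. Below is a sketch; the helper `basic_solution` and the data are my own, for illustration only:

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0],   # the m x n constraint matrix (m=2, n=3)
              [1.0, 2.0, 0.0]])
b = np.array([4.0, 3.0])

def basic_solution(A, b, basis):
    """Set x_N = 0 and solve B x_B = b for the chosen basis columns."""
    basis = list(basis)
    x = np.zeros(A.shape[1])
    x[basis] = np.linalg.solve(A[:, basis], b)  # x_B = B^{-1} b
    return x

x = basic_solution(A, b, basis=(0, 2))  # choose columns 0 and 2 as B
print(x, "basic feasible" if np.all(x >= 0) else "basic but infeasible")
```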

The conclusion above needs to be demonstrated in two steps: (1) every basic feasible solution is a vertex; (2) every vertex is a basic feasible solution.

For the first step, we again use proof by contradiction.

Suppose some basic feasible solution $\pmb x$ is not a vertex; that is, there exist feasible solutions $\pmb x_1, \pmb x_2$ (with $\pmb x_1 \neq \pmb x_2$) and $\lambda\in(0,1)$ such that $\pmb x$ can be expressed as

$$\pmb x=\lambda \pmb x_1+(1-\lambda)\pmb x_2$$

Write $\pmb x$ in two parts, the basic variables $\pmb x_B$ and the non-basic variables $\pmb x_N$:

$$\pmb x_B=\lambda \pmb x_{1B}+(1-\lambda)\pmb x_{2B} \\ \pmb x_N=\lambda \pmb x_{1N}+(1-\lambda)\pmb x_{2N}$$

In the equation for $\pmb x_N$: because $\pmb x_N=\pmb 0$, $\lambda>0$, $1-\lambda>0$, $\pmb x_{1N}\geq \pmb 0$, and $\pmb x_{2N}\geq \pmb 0$, it must be that

$$\pmb x_N=\pmb x_{1N}=\pmb x_{2N}=\pmb 0$$

So $\pmb x_1$ and $\pmb x_2$ are also basic solutions with respect to the same basis, that is,

$$\pmb B\pmb x_{1B}=\pmb b \\ \pmb B\pmb x_{2B}=\pmb b$$

Since $\pmb B$ is a non-singular square matrix, $\pmb B\pmb x=\pmb b$ has a unique solution, so

$$\pmb x_B=\pmb x_{1B}=\pmb x_{2B}$$

That is, $\pmb x_1$ and $\pmb x_2$ are both equal to the basic feasible solution $\pmb x$, which contradicts the assumption $\pmb x_1 \neq \pmb x_2$. Therefore, a basic feasible solution must be a vertex of the feasible region.

For the second step, proof by contradiction is used once more.

Suppose $\pmb x$ is a vertex of the feasible region but not a basic feasible solution. First split it into its non-zero part $\pmb x_B$ and zero part $\pmb x_N$, with $\pmb B$ denoting the columns of $\pmb A$ corresponding to $\pmb x_B$, so that

$$\pmb B\pmb x_B=\pmb b$$

Because $\pmb x$ is not a basic feasible solution, the columns of this $\pmb B$ must be linearly dependent; that is, there exists a non-zero vector $\pmb w$ such that

$$\pmb B\pmb w=\pmb 0$$

Multiply this formula by a coefficient $\delta>0$, then add it to and subtract it from the constraint equation above to get

$$\pmb B(\pmb x_B+\delta \pmb w)=\pmb b \\ \pmb B(\pmb x_B-\delta \pmb w)=\pmb b$$

First let $\pmb x_B^1=\pmb x_B+\delta \pmb w$ and $\pmb x_B^2=\pmb x_B-\delta \pmb w$, then let $\pmb x_1=(\pmb x_B^1, \pmb 0)$ and $\pmb x_2=(\pmb x_B^2, \pmb 0)$. We immediately get

$$\pmb x=\frac{1}{2}\pmb x_1+\frac{1}{2}\pmb x_2$$

At the same time, as long as $\delta$ is small enough,

$$\pmb x_B\pm\delta \pmb w \geq \pmb 0$$

so $\pmb x_1$ and $\pmb x_2$ are both feasible solutions, which means $\pmb x$ cannot be a vertex. The two statements contradict each other, so a vertex must also be a basic feasible solution.

With this conclusion, in theory we only need to compute the objective function value of every basic feasible solution and return the minimum to obtain the optimal solution. But the number of basic feasible solutions can be as large as $C_n^m$; when $n$ is relatively large, finding the minimum by exhaustive traversal becomes overwhelming.
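The brute-force idea can nevertheless be written down directly. Here is a sketch (my own toy implementation, reusing the made-up data from above) that enumerates all $C_n^m$ candidate bases:

```python
import numpy as np
from itertools import combinations

def brute_force_lp(c, A, b):
    """Try every C(n, m) basis, keep the feasible ones, return the best."""
    m, n = A.shape
    best_x, best_f = None, np.inf
    for basis in map(list, combinations(range(n), m)):
        B = A[:, basis]
        if abs(np.linalg.det(B)) < 1e-12:   # singular: not a valid basis
            continue
        xB = np.linalg.solve(B, b)
        if np.any(xB < -1e-9):              # B^{-1} b >= 0 fails: infeasible
            continue
        x = np.zeros(n)
        x[basis] = xB
        if c @ x < best_f:
            best_x, best_f = x, c @ x
    return best_x, best_f

c = np.array([3.0, 1.0, 2.0])
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 0.0]])
b = np.array([4.0, 3.0])
print(brute_force_lp(c, A, b))  # same vertex linprog found: (0, 1.5, 2.5)
```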

At this point, the famous simplex method takes the stage.

Simplex method

Again, conclusion first. The most powerful part of the simplex method is this: given any basic feasible solution, a simple calculation tells us whether the solution has room for improvement and, if so, in which direction the improvement is best. With that, it is no longer necessary to traverse all basic feasible solutions, so the efficiency of solving the problem improves dramatically.

Suppose we have a basic feasible solution

$$\pmb x_0=[\pmb x_B,\pmb 0]=[\pmb B^{-1}\pmb b,\pmb 0]$$

The corresponding objective function value is

$$f_0=\pmb c_B^T\pmb B^{-1}\pmb b$$

First write $\pmb x_0$ in the general form of a basic solution, $[\pmb x_B,\pmb x_N]$, and substitute it into the equality constraints:

$$\pmb B\pmb x_B+\pmb N\pmb x_N=\pmb b$$

Rearranging, $\pmb x_B$ can be expressed as

$$\pmb x_B=\pmb B^{-1}\pmb b-\pmb B^{-1}\pmb N\pmb x_N$$

Substituting this expression into the objective function gives

$$f=\pmb c_B^T(\pmb B^{-1}\pmb b-\pmb B^{-1}\pmb N\pmb x_N)+\pmb c_N^T\pmb x_N$$

Reorganizing,

$$f=\pmb c_B^T\pmb B^{-1}\pmb b-(\pmb c_B^T\pmb B^{-1}\pmb N-\pmb c_N^T)\pmb x_N$$

The above formula can be written componentwise as

$$f=f_0-\sum_{j\in R}(z_j-c_j)x_j$$

where $R$ is the index set of the non-basic variables and $\pmb z=\pmb c_B^T\pmb B^{-1}\pmb N$.

At the current $\pmb x_0$, every $x_j=0$ for $j \in R$, so the last term of $f$ is 0. The basic condition for the objective function to keep decreasing is therefore: there exists at least one $j$ such that $z_j-c_j>0$, while the corresponding $x_j$ can also increase from 0 to a positive number.

$z_j-c_j$ is a parameter, not a variable; whether some $j$ with $z_j-c_j>0$ exists cannot be changed. But the value of $x_j$ can in principle be optimized. Of course, its value cannot be increased arbitrarily; the constraint that must remain satisfied is

$$\pmb x_B=\pmb B^{-1}\pmb b-\pmb B^{-1}\pmb N\pmb x_N \geq \pmb 0$$

Suppose $x_k \in \pmb x_N$ is to change from 0 to a positive number (the technical term is "entering the basis"). To keep the solution a basic feasible one, some component of $\pmb x_B$ must drop to 0 (the technical term is "leaving the basis"):

$$\pmb x_B=\begin{bmatrix} x_{B1} \\ x_{B2} \\ \vdots \\ x_{Bm} \end{bmatrix}=\begin{bmatrix} \bar b_1 \\ \bar b_2 \\ \vdots \\ \bar b_m \end{bmatrix}-\begin{bmatrix} y_{1k} \\ y_{2k} \\ \vdots \\ y_{mk} \end{bmatrix}x_k \geq \pmb 0$$

where $\pmb{\bar b}=\pmb B^{-1}\pmb b$ and $\pmb y_k=\pmb B^{-1}\pmb N_k$. To guarantee $\pmb x_B\geq \pmb 0$, the best value of $x_k$ is

$$x_k=\min\left\{\frac{\bar b_1}{y_{1k}}, \frac{\bar b_2}{y_{2k}},\dots,\frac{\bar b_m}{y_{mk}}\right\}$$

where the minimum is taken over the components with $y_{ik}>0$ (see special case (2) below).

Based on the above logic, the rule can be stated as: if there exists $k$ such that $z_k-c_k>0$, then setting $x_k=\min_i{\frac{\bar b_i}{y_{ik}}}$ reduces the objective function value to the greatest extent.
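One such "evaluation" at a given basic feasible solution looks like this in code: a sketch on the same made-up data as before, starting from the basis formed by columns 0 and 2:

```python
import numpy as np

c = np.array([3.0, 1.0, 2.0])
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 0.0]])
b = np.array([4.0, 3.0])

basis, nonbasis = [0, 2], [1]        # the BFS x0 = (3, 0, 1), f0 = 11
B = A[:, basis]
b_bar = np.linalg.solve(B, b)        # \bar b = B^{-1} b

# z_j - c_j for each non-basic j: a positive entry means room to improve
z_minus_c = c[basis] @ np.linalg.solve(B, A[:, nonbasis]) - c[nonbasis]
print(z_minus_c)                     # -> [3.], so column j=1 should enter

k = nonbasis[int(np.argmax(z_minus_c))]
y_k = np.linalg.solve(B, A[:, k])    # y_k = B^{-1} N_k
x_k = min(b_bar[i] / y_k[i] for i in range(len(y_k)) if y_k[i] > 0)
print(x_k)                           # -> 1.5, the largest feasible step
```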

Here are some special cases that need to be considered separately:

(1) If all $z_i-c_i\leq 0$, then $f\geq f_0$ no matter how the non-basic variables change, i.e., the current basic feasible solution is the optimal solution.

(2) In $x_k=\min_i{\frac{\bar b_i}{y_{ik}}}$, each $\bar b_i$ is a component of the original basic feasible solution, so $\bar b_i \geq 0$ beyond doubt; but the sign of $y_{ik}$ is indeterminate. If only some of the $y_{ik}$ are ≤ 0, those components place no limit on $x_k$ (combined with $x_k\geq 0$, their rows stay non-negative automatically), and the original process can continue over the remaining rows. But if all $y_{ik}\leq 0$, the process cannot continue: in fact, even if $x_k$ goes to infinity, the constraint $\pmb x_B=\pmb{\bar b}-\pmb y_kx_k\geq \pmb 0$ is still satisfied while $f$ becomes arbitrarily small, so the problem has no lower bound and no minimum value.

The basic steps of the simplex method can be summarized as follows:

(1) Transform the given linear programming problem into standard form;

(2) Find an initial basic feasible solution;

(3) Check $z_i-c_i$ and $\frac{\bar b_i}{y_{ik}}$ to judge the state of the current solution: if it is optimal, or if no optimal solution exists (the problem is unbounded), exit; otherwise continue;

(4) Find the best $k$, update $\pmb x_B$ to obtain a new basic feasible solution, and go to (3).
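Putting steps (2)–(4) together gives a bare-bones simplex loop. The sketch below is my own minimal implementation under simplifying assumptions: the initial basis is supplied by the caller (a real solver would find one with a phase-1 procedure), and anti-cycling safeguards such as Bland's rule are omitted:

```python
import numpy as np

def simplex(c, A, b, basis):
    """Minimal simplex sketch for: min c^T x  s.t.  A x = b, x >= 0.

    `basis` must index an initial basic feasible solution.
    """
    m, n = A.shape
    basis = list(basis)
    while True:
        B_inv = np.linalg.inv(A[:, basis])
        b_bar = B_inv @ b                            # current x_B = B^{-1} b
        nonbasis = [j for j in range(n) if j not in basis]
        # step (3): check z_j - c_j over the non-basic variables
        z_minus_c = {j: c[basis] @ B_inv @ A[:, j] - c[j] for j in nonbasis}
        k = max(z_minus_c, key=z_minus_c.get)        # most promising column
        if z_minus_c[k] <= 0:                        # case (1): optimal
            x = np.zeros(n)
            x[basis] = b_bar
            return x, c @ x
        y_k = B_inv @ A[:, k]
        if np.all(y_k <= 0):                         # case (2): unbounded
            raise ValueError("objective is unbounded below")
        # ratio test: the row attaining min of b_bar_i / y_ik over y_ik > 0
        _, r = min((b_bar[i] / y_k[i], i) for i in range(m) if y_k[i] > 0)
        basis[r] = k                                 # step (4): pivot, repeat

c = np.array([3.0, 1.0, 2.0])
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 0.0]])
b = np.array([4.0, 3.0])
print(simplex(c, A, b, basis=[0, 2]))  # -> (array([0., 1.5, 2.5]), 6.5)
```

On the running example this pivots exactly once, from basis columns {0, 2} to {1, 2}, landing on the same optimal vertex found earlier.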

And with that, I have finally worked through the algorithmic principles for solving linear programming problems.
