7.3 Constrained optimization (条件优化)

本文为《Linear algebra and its applications》的读书笔记

Constrained optimization

Engineers, economists, scientists, and mathematicians often need to find the maximum or minimum value of a quadratic form Q ( x ) Q(\boldsymbol x) Q(x) for x \boldsymbol x x in some specified set. Typically, the problem can be arranged so that x \boldsymbol x x varies over the set of unit vectors. This constrained optimization problem has an interesting and elegant solution. Example 6 below and the discussion in Section 7.5 will illustrate how such problems arise in practice.

The requirement that a vector x \boldsymbol x x in R n \R^n Rn be a unit vector can be stated in several equivalent ways:

在这里插入图片描述
and

在这里插入图片描述
The expanded version (1) of x T x = 1 \boldsymbol x^T\boldsymbol x = 1 xTx=1 is commonly used in applications.

When a quadratic form Q Q Q has no cross-product terms, it is easy to find the maximum and minimum of Q ( x ) Q(\boldsymbol x) Q(x) for x T x = 1 \boldsymbol x^T\boldsymbol x = 1 xTx=1.

EXAMPLE 1
Find the maximum and minimum values of Q ( x ) = 9 x 1 2 + 4 x 2 2 + 3 x 3 2 Q(\boldsymbol x)= 9x_1^2+ 4x_2^2 + 3x_3^2 Q(x)=9x12+4x22+3x32 subject to the constraint x T x = 1 \boldsymbol x^T\boldsymbol x = 1 xTx=1.
SOLUTION

在这里插入图片描述

Furthermore, Q ( x ) = 9 Q(\boldsymbol x)=9 Q(x)=9 when x = ( 1 , 0 , 0 ) \boldsymbol x=(1, 0, 0) x=(1,0,0). Thus 9 is the maximum value of Q ( x ) Q(\boldsymbol x) Q(x) for x T x = 1 \boldsymbol x^T\boldsymbol x =1 xTx=1.

在这里插入图片描述
Also, Q ( x ) = 9 Q(\boldsymbol x)=9 Q(x)=9 when x = ( 0 , 0 , 1 ) \boldsymbol x=(0, 0,1) x=(0,0,1). Thus 3 is the minimum value of Q ( x ) Q(\boldsymbol x) Q(x) for x T x = 1 \boldsymbol x^T\boldsymbol x =1 xTx=1.

It is easy to see in Example 1 that the matrix of the quadratic form Q Q Q has eigenvalues 9 , 4 9, 4 9,4, and 3 3 3 and that the greatest and least eigenvalues equal, respectively, the (constrained) maximum and minimum of Q ( x ) Q(\boldsymbol x) Q(x). The same holds true for any quadratic form, as we shall see.

EXAMPLE 2
Let A = [ 3 0 0 7 ] A=\begin{bmatrix} 3& 0\\0 &7\end{bmatrix} A=[3007], and let Q ( x ) = x T A x Q(\boldsymbol x)= \boldsymbol x^TA\boldsymbol x Q(x)=xTAx for x \boldsymbol x x in R 2 \R^2 R2. Figure 1 displays the graph of Q Q Q. Figure 2 shows only the portion of the graph inside a cylinder; the intersection of the cylinder with the surface is the set of points ( x 1 , x 2 , z ) (x_1, x_2,z) (x1,x2,z) such that z = Q ( x 1 , x 2 ) z= Q(x_1, x_2) z=Q(x1,x2) and x 1 2 + x 2 2 = 1 x_1^2 + x_2^2 = 1 x12+x22=1.

The two highest points on the curve are 7 7 7 units above the x 1 x 2 x_1x_2 x1x2-plane, occurring where x 1 = 0 x_1= 0 x1=0 and x 2 = ± 1 x_2=\pm1 x2=±1. These points correspond to the eigenvalue 7 7 7 of A A A and the eigenvectors x = ( 0 , 1 ) \boldsymbol x=(0, 1) x=(0,1) and − x = ( 0 , − 1 ) -\boldsymbol x=(0,-1) x=(0,1). Similarly, the two lowest points on the curve are 3 3 3 units above the x 1 x 2 x_1x_2 x1x2-plane. They correspond to the eigenvalue 3 3 3 and the eigenvectors x = ( 1 , 0 ) \boldsymbol x=(1, 0) x=(1,0) and − x = ( − 1 , 0 ) -\boldsymbol x=(-1,0) x=(1,0).

在这里插入图片描述
The set of all possible values of x T A x \boldsymbol x^TA\boldsymbol x xTAx, for ∥ x ∥ = 1 \left\|x\right\|= 1 x=1, is the closed interval 3 ≤ t ≤ 7 3 \leq t\leq 7 3t7.

It can be shown that for any symmetric matrix A A A, the set of all possible values of x T A x \boldsymbol x^TA\boldsymbol x xTAx, for ∥ x ∥ = 1 \left\|x\right\|= 1 x=1, is a closed interval on the real axis. (See Exercise 13.) Denote the left and right endpoints of this interval by m m m and M M M, respectively. That is, let

在这里插入图片描述


EXERCISE 13
Let A A A be an n × n n\times n n×n symmetric matrix, let M M M and m m m denote the maximum and minimum values of the quadratic form x T A x \boldsymbol x^TA\boldsymbol x xTAx, where x T x = 1 \boldsymbol x^T\boldsymbol x=1 xTx=1; and denote corresponding unit eigenvectors by u 1 \boldsymbol u_1 u1 and u n \boldsymbol u_n un.

The following calculations show that given any number t t t between M M M and m m m, there is a unit vector x \boldsymbol x x such that t = x T A x t = \boldsymbol x^TA\boldsymbol x t=xTAx.

If m = M m = M m=M and x \boldsymbol x x is the unit eigenvector u 1 \boldsymbol u_1 u1, then x T A x = u 1 T A u 1 = u 1 T ( m u 1 ) = m \boldsymbol x^TA\boldsymbol x=\boldsymbol u_1^T A\boldsymbol u_1=\boldsymbol u_1^T (m\boldsymbol u_1 )= m xTAx=u1TAu1=u1T(mu1)=m.

Otherwise, 0 ≤ t − m ≤ M − m 0\leq t-m\leq M-m 0tmMm and 0 ≤ ( t − m ) / ( M − m ) ≤ 1 0\leq(t-m)/(M-m)\leq 1 0(tm)/(Mm)1. So set α = ( t − m ) / ( M − m ) \alpha=(t-m)/(M-m) α=(tm)/(Mm). Thus t = ( 1 − α ) m + α M t=(1-\alpha)m+\alpha M t=(1α)m+αM.

For such α \alpha α, let x = 1 − α u n + α u 1 \boldsymbol x=\sqrt{1-\alpha}\boldsymbol u_n+\sqrt\alpha\boldsymbol u_1 x=1α un+α u1. It can be shown that x T A x = t \boldsymbol x^TA\boldsymbol x=t xTAx=t and x T x = 1 \boldsymbol x^T\boldsymbol x=1 xTx=1.


在这里插入图片描述
PROOF
Orthogonally diagonalize A A A as P D P − 1 PDP^{-1} PDP1. We know that

在这里插入图片描述
Also,

在这里插入图片描述
Thus x T A x \boldsymbol x^TA\boldsymbol x xTAx and y T D y \boldsymbol y^TD\boldsymbol y yTDy assume the same set of values as x \boldsymbol x x and y \boldsymbol y y range over the set of all unit vectors. (Suppose that λ 1 ≥ λ 2 ≥ . . . ≥ λ n \lambda_1\geq\lambda_2\geq...\geq\lambda_n λ1λ2...λn)

x T A x = y T D y = ∑ i = 1 i = n λ i y i 2         ( ∗ ) \boldsymbol x^TA\boldsymbol x=\boldsymbol y^TD\boldsymbol y=\sum_{i=1}^{i=n}\lambda_iy_i^2\ \ \ \ \ \ \ (*) xTAx=yTDy=i=1i=nλiyi2       () ( ∗ ) ≤ λ 1 ∑ i = 1 i = n y i 2 ( ∗ )   h a s   t h e   m a x i m u m   v a l u e   w h e n   y = e 1 , x = P e 1 = u 1 ( ∗ ) ≥ λ n ∑ i = 1 i = n y i 2 ( ∗ )   h a s   t h e   m i n i m u m   v a l u e   w h e n   y = e n , x = P e n = u n (*)\leq \lambda_1\sum_{i=1}^{i=n}y_i^2\\(*)\ has\ the\ maximum\ value\ when\ \boldsymbol y=\boldsymbol e_1,\boldsymbol x=P\boldsymbol e_1=\boldsymbol u_1\\(*)\geq \lambda_n\sum_{i=1}^{i=n}y_i^2\\(*)\ has\ the\ minimum\ value\ when\ \boldsymbol y=\boldsymbol e_n,\boldsymbol x=P\boldsymbol e_n=\boldsymbol u_n ()λ1i=1i=nyi2() has the maximum value when y=e1,x=Pe1=u1()λni=1i=nyi2() has the minimum value when y=en,x=Pen=un

In Theorem 7 and in later applications, the values of x T A x \boldsymbol x^TA\boldsymbol x xTAx are computed with additional constraints on the unit vector x \boldsymbol x x.

在这里插入图片描述

The proof is similar to Theorem 6.

The next theorem generalizes Theorem 7 and, together with Theorem 6, gives a useful characterization of all the eigenvalues of A A A.

在这里插入图片描述

EXAMPLE 6
During the next year, a county government is planning to repair x x x hundred miles of public roads and bridges and to improve y y y hundred acres of parks and recreation areas. The county must decide how to allocate its resources (funds, equipment, labor, etc.) between these two projects. If it is more cost effective to work simultaneously on both projects rather than on only one, then x x x and y y y might satisfy a constraint such as

在这里插入图片描述
在这里插入图片描述

See Figure 3. Each point ( x , y ) (x, y) (x,y) in the shaded feasible set represents a possible public works schedule for the year. The points on the constraint curve, 4 x 2 + 9 y 2 = 36 4x^2+ 9y^2 = 36 4x2+9y2=36, use the maximum amounts of resources available.

In choosing its public works schedule, the county wants to consider the opinions of the county residents. To measure the value, or u t i l i t y ( 效 用 ) utility(效用) utility(), that the residents would assign to the various work schedules ( x , y ) (x, y) (x,y), economists sometimes use a function such as

在这里插入图片描述
The set of points ( x , y ) (x, y) (x,y) at which q ( x , y ) q(x, y) q(x,y) is a constant is called an i n d i f f e r e n c e indifference indifference c u r v e curve curve(无差别曲线). Find the public works schedule that maximizes the utility function q q q.

SOLUTION
The constraint equation does not describe a set of unit vectors, but a change of variable can fix that problem. Rewrite the constraint in the form

在这里插入图片描述
and define

在这里插入图片描述
Then the constraint equation becomes

在这里插入图片描述
and the utility function becomes q ( 3 x 1 , 2 x 2 ) = ( 3 x 1 ) ( 2 x 2 ) = 6 x 1 x 2 q(3x_1, 2x_2)=(3x_1)(2x_2)= 6x_1x_2 q(3x1,2x2)=(3x1)(2x2)=6x1x2. Let x = [ x 1 x 2 ] \boldsymbol x =\begin{bmatrix} x_1\\x_2\end{bmatrix} x=[x1x2]. Then the problem is to maximize Q ( x ) = 6 x 1 x 2 Q(\boldsymbol x)= 6x_1x_2 Q(x)=6x1x2 subject to x T x = 1 \boldsymbol x^T\boldsymbol x= 1 xTx=1. Note that Q ( x ) = x T A x Q(\boldsymbol x)=\boldsymbol x^TA\boldsymbol x Q(x)=xTAx, where

在这里插入图片描述
The eigenvalues of A A A are ± 3 \pm3 ±3, with eigenvectors

在这里插入图片描述
Thus the maximum value of Q ( x ) Q(\boldsymbol x) Q(x) is 3, attained when x 1 = 1 / 2 x_1 = 1/\sqrt2 x1=1/2 and x 2 = 1 / 2 x_2= 1/\sqrt2 x2=1/2 .

In terms of the original variables, the optimum public works schedule is x = 3 x 1 = 3 / 2 ≈ 2.1 x= 3x_1 = 3/\sqrt2\approx 2.1 x=3x1=3/2 2.1 hundred miles of roads and bridges and y = 2 x 2 = 2 ≈ 1.4 y= 2x_2 = \sqrt2\approx 1.4 y=2x2=2 1.4 hundred acres of parks and recreational areas. The optimum public works schedule is the point where the constraint curve and the indifference curve just meet.

在这里插入图片描述

猜你喜欢

转载自blog.csdn.net/weixin_42437114/article/details/108976441
7.3