最优控制理论八、CasADi求解路径约束轨迹优化的多重打靶法

上一篇博客我总结了目前工程中求解最优控制问题的常用数值方法，最优控制理论七、关于数值求解算法，作为本节的基础性知识。
本博客利用CasADi的MATLAB版求解一个带有路径约束、终端约束的开环最优控制问题，采用最常用的多重打靶法，其中CasADi的功能在于对决策变量、约束变量进行自动微分，从而加快非线性规划的收敛速度。

受路径约束的最速下降曲线问题

以下使用了一个受约束的最速下降曲线问题，求解从点 $O$ 到 $P$ 的转移曲线，使其服从无摩擦理想情况重力下降的动力学约束 $\dot{\mathbf x}=f[({\mathbf x}(t),\theta(t),t]$ ，以及路径约束 $y\leq 1/2 x+1$ . 具体描述为：
请添加图片描述

定义状态变量为 $\mathbf x=[x,y,v]^\mathrm T$ ，已知始末端点 $\mathbf x(t_0)=[0,0,0]^\mathrm T,\ \mathbf x(t_f)=[2,2,free]^\mathrm T$ , 转移路径服从动力学约束 $\left\{\begin{array}{l} \dot{x}=v\cos\theta\\ \dot{y}=v \sin \theta \\ \dot{v}=g \sin \theta \end{array}\right.$

求最优的曲线形状(弹道倾角) $\theta(t)$ 使转移时间 $J=\int_{t_{0}}^{t_{f}}1 d t$ 最小。终端约束 $\psi(\mathbf x(t_f))=[x_f-2,y_f-2]=0$ .路径约束 $S[\mathbf x(t),t]=y-1/2\cdot x-1\leq0$

由于最短时间问题直接求解不太容易，因此需要对原问题进行变换、

对自变量时间 $t$ 乘以一个到达时间 $t_f$ 的变换后可转化成固定终端时间的OCP，具体步骤如下：
$\dot{x}=\frac{d x}{d t}=\frac{d x}{d \tau} \frac{d \tau}{d t}\\$

定义： $()^{\prime}=\frac{d(\cdot)}{d \tau}, \frac{d t}{d \tau}=t_f$

则 $\tau \in[0,1]$ 变化时，有 $\in[0, t_f]$ . $\left\{\begin{array} { l } { \dot x = v \cos\theta } \\ { \dot { y } = v \sin \theta } \\ { \dot { v } = g \sin \theta } \end{array} \Rightarrow \left\{\begin{array}{l} x^{\prime}=\tau v \cos \theta \\ y^{\prime}=\tau v \sin \theta \\ v^{\prime}=\tau g \sin \theta \end{array}\right.\right.$

OCP求解的DMS法

最优控制问题的求解方法很多，以下我们就利用直接多重打靶法（direct multiple shooting，DMS），离散动力学方程，然后用RK4求解每一段的ODE初值问题，最终把问题转化为多约束非线性规划问题。

请添加图片描述

这里决策变量为每一段的初值状态 $\mathbf x_i,i=0,1,\cdots N$ 和每一段近似保持不变的弹道倾角 $\theta_i,i=0,1,\cdots N$ . 算法DMS相对DSS的优势是，它有更多决策变量以及更多等式约束，但是由于每个决策变量对系统cost函数的影响是稀疏性的，因此OCP更容易收敛。

利用CasADi求解最优控制问题

CasADi是一个开源的非线性优化求解工具包，有MATLAB、Python、C++等主流数值优化语言的版本，支持各种平台，其官网https://web.casadi.org/。它的一部分特性包括：基于符号变量（symbolic variable）的解析计算、可以自动求差分（Automatic Differentiation），因此非常适合高效求解大规模非线性规划问题。

这里采用自动微分进行优化问题求解的原因在于，本问题中我没有手动提供Jacobian矩阵，而CasADi自动求了微分，就可以避免利用前向差分法数值求解导数而产生的计算问题。

下载CasADi v3.5.5最新版本的MATLAB版本，然后将其解压，并使MATLAB添加到路径>整个文件夹，然后再包含内部的+casadi文件夹，这样就算安装完成了。

然后下载官方的用户学习案例Example package ，我们接下来要在他的基础上进行求解。

求解NLP问题的一个案例

由于CasADi是直接拿来求解NLP问题的，它的标准形式可以参考以下这个文件，也就是example/matlab/rosenbrock.m，是一个求解非线性规划问题的求解程序。

%
% an example of Nolinear programming
%     This file is part of CasADi in directory: example/matlab/rosenbrock.m
%
%
% Load CasADi
import casadi.*
 
% Create NLP: Solve the Rosenbrock problem:
%     minimize    x^2 + 100*z^2
%     subject to  z + (1-x)^2 - y == 0
x = SX.sym('x');
y = SX.sym('y');
z = SX.sym('z');
v = [x;y;z];			% decision variables
f = x^2 + 100*z^2;		% cost function
g = z + (1-x)^2 - y;	% constraints
nlp = struct('x', v, 'f', f', 'g', g);

% Create IPOPT solver object
solver = nlpsol('solver', 'ipopt', nlp);

% Solve the NLP
res = solver('x0' , [2.5 3.0 0.75],... % solution guess
             'lbx', -inf,...           % lower bound on x
             'ubx',  inf,...           % upper bound on x
             'lbg',    0,...           % lower bound on g
             'ubg',    0);             % upper bound on g
 
% Print the solution
f_opt = full(res.f)          % >> 0
x_opt = full(res.x)          % >> [0; 1; 0]
lam_x_opt = full(res.lam_x)  % >> [0; 0; 0]
lam_g_opt = full(res.lam_g)  % >> 0

这个程序非常简单，我们可以用来代替MATLAB@fmincon函数，作为以下求解OCP问题的参考。

直接转化多重打靶法求解OCP的案例

由于其官网上一个例子和我的需求非常像，因此我在这里就直接拿过来改一改，位于CasADi案例文件夹example\matlab下的direct_multiple_shooting.m脚本。

我这里的状态 $x$ 初始猜测都是设定的0，相当于是没有设定初值的；控制初始猜测设为 $\pi/4$ ，也就相当于一条直线下降。虽然我这里收敛了，但是一些复杂的路径约束就不一定，那些情况下提供初始猜测是必要的。

% optimal control problem :path constrained Brachistochrone 
% uses direct transcription and multiple shooting 
%     This file is modified from \CasADi\example\matlab\direct_multiple_shooting.m
%
%     Based on CasADi -- A symbolic framework for dynamic optimization.

% An implementation of direct multiple shooting
% Siyang Meng .2021/8/1
%% 原问题设定
clear,clc
import casadi.*

T = 1; % Time horizon 0-1
N = 20; % number of control intervals

% Declare model variables
x = SX.sym('x');
y = SX.sym('y');
v = SX.sym('v');

X = [x; y; v];
u = SX.sym('theta');
tau = SX.sym('tau');% time scale
% Model equations
xdot = [tau*v*cos(u);
    tau*v*sin(u);
    tau*9.8*sin(u)];

% Objective term in lagrange form
% 由于是最短时间问题，因此Lagrange项和DynamicalModel都乘以时间缩放系数scale
L = tau;

%% Formulate discrete time dynamics
if false
   % CVODES from the SUNDIALS suite
   dae = struct('x',x,'p',u,'ode',xdot,'quad',L);
   opts = struct('tf',T/N);
   F = integrator('F', 'cvodes', dae, opts);
else
   % Fixed step Runge-Kutta 4 integrator
   M = 4; % RK4 steps per interval
   DT = T/N/M;
   
   % Continuous time dynamics
    f = Function('f', {
    
    X, u, tau}, {
    
    xdot, L});

   X0 = MX.sym('X0', 3);
   Udec = MX.sym('U');
   tdec =  MX.sym('tf');
   X = X0;
   Q = 0;
   for j=1:M
       [k1, k1_q] = f(X, Udec, tdec);
       [k2, k2_q] = f(X + DT/2 * k1, Udec, tdec);
       [k3, k3_q] = f(X + DT/2 * k2, Udec, tdec);
       [k4, k4_q] = f(X + DT * k3, Udec, tdec);
       X = X+DT/6*(k1 +2*k2 +2*k3 +k4);
       Q = Q + DT/6*(k1_q + 2*k2_q + 2*k3_q + k4_q);
   end
    % 离散化后的ODE系统
    F = Function('F', {
    
    X0, Udec, tdec}, {
    
    X, Q}, {
    
    'xin','uin', 'tauin'}, {
    
    'xf', 'qf'});
end

% 测试一下动力学方程代入数值会是正常输出吗。
% Evaluate at a test point
Fk = F('xin',[0; 0; 0],'uin',pi/4, 'tauin',0.8);
disp(Fk.xf)
disp(Fk.qf)
%% 构造CasADi可求解的NLP问题，初始化，

% Start with an empty NLP
w={
    
    };
w0 = [];% initial guess
lbw = [];% lower bound
ubw = [];% upper bound
J = 0;
g={
    
    };% constraints of NLP
lbg = [];
ubg = [];

% 动力学方程的初始条件
% "Lift" initial conditions
tau = MX.sym('tau');
Xk = MX.sym('X0', 3);
% w是决策变量
w = {
    
    tau, ...
    w{
    
    :}, Xk};
% 初始状态的上下限约束
lbw = [0.6;
    lbw;
    0; 0; 0];
ubw = [0.9;
    ubw; 
    0; 0; 0];
% 初始状态的猜测
w0 = [0.8;
    w0; 
    0; 0; 0];

%% 下面设定的包含路径约束、连续性等式约束、以及终端约束
% Formulate the NLP
for k=0:N-1
		% 求解IVP到此的时间。
		tk = T/N*(k+1);
		
    % 控制输入离散化的决策变量
	% New NLP variable for the control
    Uk = MX.sym(['U_' num2str(k)]);
    w = {
    
    w{
    
    :}, Uk};
    % 控制theta
    % 上下界
    lbw = [lbw; 
        0];
    ubw = [ubw;
        2*pi];
    % 猜测
    w0 = [w0;
        pi/4];%控制初值给定

    % Integrate till the end of the interval
    Fk = F('xin', Xk, 'uin', Uk, 'tauin', tau);
    Xk_end = Fk.xf;
    J=J+Fk.qf;

    % 状态变量离散化的决策变量
    % New NLP variable for state at end of interval
    Xk = MX.sym(['X_' num2str(k+1)], 3);
    w = [w, {
    
    Xk}];
    % 路径X
    % 上下界
    lbw = [lbw;
        -inf; -inf; -inf];
    ubw = [ubw;
        inf; inf; inf];
    % 猜测
    w0 = [w0;
        0; 0; 0];% 提供初始猜测时，这里可以添加插值表

    % Inequality constraint
    g = [g,...
    		 {
    
    Xk_end-Xk},...
        {
    
    -1/2*Xk_end(1)+Xk_end(2)-1}];% y>2* x-2
    % 上下界
    lbg = [lbg;
        0; 0; 0;
        -inf];
    ubg = [ubg;
        0; 0; 0;
        0];
end

% Add terminal equality constraint
g = [g, ...
		{
    
    Xk_end},...
   	{
    
    -1/2*Xk_end(1)+Xk_end(2)-1}];% y>2* x-2
lbg = [lbg;
    2; 2; -inf;
    -inf];
ubg = [ubg;
    2; 2; inf;
    0];
%% 设定并求解DMS
% Create an NLP solver
% 整理一下cell array
prob = struct('f', J, 'x', vertcat(w{
    
    :}), 'g', vertcat(g{
    
    :}));
solver = nlpsol('solver', 'ipopt', prob);

% Solve the NLP
sol = solver('x0', w0,...% 决策变量的初始猜测
            'lbx', lbw,...
            'ubx', ubw,...% 决策变量的上下界约束
            'lbg', lbg,...
            'ubg', ubg);% 非线性路径约束的上下界约束
w_opt = full(sol.x);

%% Plot the solution
% 决策变量的格式是前面已经确定下来的，只需要把它提取出来，格式如下，
%{
    
    
    [tau;
    X0;  U_0;
    X_1; U_1; 
    X_2; U_2; 
    ...;
    X_N-1; U_N-1;
    X_N];
% 在经过 vertcat(w{
    
    :})之后，变成 1 + 3*(N+1) + 1*N 的 casadi.MX 符号向量
%}
minT = w_opt(1),% 这是求解出的最短时间

x1_opt = w_opt(2:4:end);
x2_opt = w_opt(3:4:end);
x3_opt = w_opt(4:4:end);
u_opt = w_opt(5:4:end);

tgrid = minT * linspace(0, T, N+1);
clf;

figure(1)
plot(x1_opt, x2_opt, 'k-'),hold on;
plot([0,2],[0,2], 'ko'),
plot([0,2],[1,2], 'b--'),
axis([0 2 0 2])
xlabel('x'),ylabel('y'),title('trajectory')

figure(2)
stairs(tgrid, [u_opt; nan], 'k-.')
xlabel('otpimal time/ s'),ylabel('optimal control angle/ rad');
legend('\theta')

运行结果如下

EXIT: Optimal Solution Found.
      solver  :   t_proc      (avg)   t_wall      (avg)    n_eval
       nlp_f  |   1.00ms ( 58.82us)   1.04ms ( 61.18us)        17
       nlp_g  |   7.00ms (411.76us)   6.01ms (353.41us)        17
  nlp_grad_f  |   9.00ms (529.41us)   9.48ms (557.59us)        17
  nlp_hess_l  |  28.00ms (  1.87ms)  26.42ms (  1.76ms)        15
   nlp_jac_g  |  18.00ms (  1.06ms)  19.10ms (  1.12ms)        17
       total  | 301.00ms (301.00ms) 301.43ms (301.43ms)         1
minT =
    0.8250

这个运行结果比fmincon快多了。

请添加图片描述

直接配点多重打靶法

由于转化法是的计算效率比较低，又发展出了配点法，其核心思路是在连续几个配点节点上用插值多项式近似动力学方程，然后把动力学方程的连续性约束转化为配点节点上的连续性、以及多项式导数的连续性。采用配点法的求解效率高于直接转化法，当然它仍然存在它的缺点，不过对于大多数最优控制问题来说，是高效而适宜求解的。有关这两种方法的比较，可以参考Moritz Diehl[3]的笔记Numerical Optimal Control，在Sec.11.2与Sec11.3有相关描述。值得指出的是，目前，直接配点法（direct collacation）已经成为求解最优控制问题的最流行的方法，很多软件包都是默认直接采用这种求解方法的。

以下我又把上面的问题，给出了配点法的求解程序。

这里采用了基于CasADi的开源最优控制求解软件Yop，这个软件当然不是最好用的，但是确实很直观。下面我从它的官网摘一些必要的信息：

ODE或DAE系统的建立

mySystem = YopSystem(...
    'states', numberOfStates, ...
    'algebraics', numberOfAlgebraics, ...
    'controls', numberOfControlInputs, ...
    'externals', numberOfExternalInputs, ...
    'parameters', numberOfParameters, ...
    'model', @myModelFunction ...
    )

%% model
function [dx, alg, signals] = myModelFunction(t, x, z, u, w, p)
% x: states
% z: algebraics
% u: controls
% w: externals
% p: parameters
end

性能指标以及最优控制问题的建立

% An optimal control problem
ocp = YopOcp();

% Objective function to minimize
ocp.min({
    
     t_f( endStateCost ) '+' timeIntegral( integralCost ) })

% or if the aim is to maximize the objective:
ocp.max({
    
     t_f( endStateCost ) '+' timeIntegral( integralCost ) })

路径约束的给定

% Subject to the constraints
ocp.st(...
    'system', mySystem, ...                 % a YopSystem on ODE form
    {
    
     t0lb   '<=' t_0(t) '<=' t0ub   }, ... % independent variable initial value bounds
    {
    
     tflb   '<=' t_f(t) '<=' tfub   }, ... % independent variable final value bounds
    {
    
     x_min  '<='   x    '<=' x_max  }, ... % state bounds
    {
    
     x0_min '<=' t_0(x) '<=' x0_max }, ... % state initial value bounds
    {
    
     xf_min '<=' t_f(x) '<=' xf_max }, ... % state final value bounds
    {
    
     u_min  '<='   u    '<=' u_max  }, ... % control bounds
    {
    
     u0_min '<=' t_0(u) '<=' u0_max }, ... % control initial value bounds
% Upper bounded
{
    
     ub  '>=' expression }
{
    
     ub  '>=' t_0( expression ) }
{
    
     ub  '>=' t_f( expression ) }
% Equality constraint
{
    
     eq  '==' expression }
{
    
     eq  '==' t_0( expression ) }
{
    
     eq  '==' t_f( expression ) }
    );

求解器设定

sol = ocp.solve( ...
    'initialGuess', initialGuess, ...
    'controlIntervals', 50, ...
    'collocationPoints', 'legendre', ...
    'polynomialDegree', 3, ...
    'ipopt', struct('max_iter', 5000) ...
    );

可以看到Yop工具箱的操作模式非常直观，安装它也很简单，只需要在GitHub上下载然后用MATLAB包含它的路径即可。以下就是我写的一个求解程序，这里的路径约束改成了避障约束
$(x-1)^2+(y-1)^2\geq\frac1 4$

% optimal control problem :path constrained Brachistochrone 
% uses direct collacation and multiple shooting 
%     Based on Yop -- a open source optimal control toolbox
% it uses multiple shooting collcation method
%              CasADi -- A symbolic framework for dynamic optimization.
% it provides automatic differential for NLP solvers
% editor: Siyang Meng, from Northwestern Polytechnical University
% 2021年8月14日22:23:15
clear,clc
%%
% Create the Yop system
sys = YopSystem('states', 3, 'controls', 1);
% Symbolic variables
t = sys.t;
x = sys.x;
u = sys.u;

% Model
xdot = [x(3)*cos(u);
    x(3)*sin(u);
    9.8*sin(u)];

sys.set('ode',xdot)

Nodes = 15;

t_guess = linspace(0,1,Nodes);
y_guess = 2*t_guess;    x_guess = 1/2*y_guess.^2;
u_guess = pi/4*ones(1,Nodes);
w0 = YopInitialGuess(...
    'signals', [t; x; u], ... % Time-varying variables
    'signalValues', [t_guess; [x_guess;y_guess;t_guess]; u_guess] ... % Guess values
);

% An optimal control problem
ocp = YopOcp();

% Objective function to minimize
% ocp.min({
    
     timeIntegral( 1/2*trolley.acceleration^2 ) });
ocp.min({
    
     t_f( t ) });

% Subject to the constraints
ocp.st(...
    'systems', sys, ...
    ... % Initial conditions
    {
    
      0  '==' t_0( x(1) ) }, ...
    {
    
      0  '==' t_0(x(2)) }, ...
    {
    
      0  '==' t_0(x(3)) }, ...
        ... % Constraints
    {
    
    0 '<=' x(3) '<=' inf }, ...
    {
    
     0  '<='  u   '<=' pi/2   }, ...
    ... % Final conditions
    {
    
      2 '==' t_f( x(1) )   }, ...
    {
    
      2  '==' t_f( x(2) )   }, ...
    ... % Path Constraints
    strict({
    
     1/4 '<=' (x(2)-1)^2+(x(1)-1)^2 }) ...% -(Xk_end(1)-1)^2-(Xk_end(2)-1)^2+1/4
);

% Solving the OCP
Nodes = 100;
sol = ocp.solve(...
    'controlIntervals', Nodes, ...
    'collocationPoints', 'radau', ...
    'polynomialDegree', 1, ...
    'initialGuess', w0);

%% 
sol.NumericalResults.Objective
figure(1)
sol.plot(sys.x(1), sys.x(2), 'k-'),hold on;
plot([0,2],[0,2], 'ko'),
% plot([0,2],[1,2], 'b--'),
plot(1/2*cos([0:0.1:6.3])+1,1/2*sin([0:0.1:6.3])+1,'r--');

plot(x_guess,y_guess,'b--');

axis([0 2 0 2]),xlabel('x'),ylabel('y'),
legend('optimal trj','boundary points','path constraints','initial guess');
title('trajectory')

figure(2)
sol.stairs(sys.t, sys.u)
xlabel('otpimal time/ s'),ylabel('optimal control angle/ rad');
legend('\theta')

以下是求解结果：
请添加图片描述
可以发现如果初值给的没问题的话，就能正确满足力学约束，且最优控制是正确的。求解过程也可以发现求解速度快了很多。

external link

[1] CasADi官网https://web.casadi.org/
[2] 官方案例 Example package
[3] Numerical Optimal Control, 2011, [draft] Moritz Diehl
[4] Yop - an open-source optimal control software 导航页面
[5] Yop - Github下载页面：(a MATLAB toolbox)