Ceres Solver Official Tutorial Study Notes (Ⅹ): Interfacing with Automatic Differentiation

This article is a translation of the official tutorial Automatic Derivatives, and also draws on the blog post Ceres-Solver学习笔记(5) by 少年的此间.

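When the cost function has an explicit closed-form expression, automatic differentiation is straightforward to use. That is not always the case, though: programs often have to interact with external routines or data. In this chapter we consider several ways of using automatic differentiation in those situations.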

Now consider an optimization problem: find parameters $\theta$ and $t$ that solve

$\begin{split}\min & \quad \sum_i \left \|y_i - f\left (\|q_{i}\|^2\right) q_i \right \|^2\\ \text{where} & \quad q_i = R(\theta) x_i + t\end{split}$

Here $R$ is a two-dimensional rotation matrix parameterized by the angle $\theta$, $t$ is a two-dimensional translation vector, and $f$ is an external distortion function.
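For reference, the problem can be written out component by component. The rotation matrix below is the standard 2D rotation (stated as a reminder, not taken from the original text), and the index notation $x_{i,0}, x_{i,1}$ is introduced here only for illustration. The resulting expressions correspond to q_0, q_1 and r2 in the code later in this article:

$\begin{split} R(\theta) &= \begin{bmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{bmatrix}\\ q_{i,0} &= \cos\theta \, x_{i,0} - \sin\theta \, x_{i,1} + t_0\\ q_{i,1} &= \sin\theta \, x_{i,0} + \cos\theta \, x_{i,1} + t_1\\ \|q_i\|^2 &= q_{i,0}^2 + q_{i,1}^2\end{split}$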

First consider the case where we have a templated function TemplatedComputeDistortion that computes $f$. The corresponding residual functor is then implemented as follows:

template <typename T> T TemplatedComputeDistortion(const T r2) {
  const double k1 = 0.0082;
  const double k2 = 0.000023;
  return 1.0 + k1 * r2 + k2 * r2 * r2;
}

struct Affine2DWithDistortion {
  Affine2DWithDistortion(const double x_in[2], const double y_in[2]) {
    x[0] = x_in[0];
    x[1] = x_in[1];
    y[0] = y_in[0];
    y[1] = y_in[1];
  }

  template <typename T>
  bool operator()(const T* theta,
                  const T* t,
                  T* residuals) const {
    const T q_0 =  cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0];
    const T q_1 =  sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1];
    const T f = TemplatedComputeDistortion(q_0 * q_0 + q_1 * q_1);  // the only line that depends on how f is provided
    residuals[0] = y[0] - f * q_0;
    residuals[1] = y[1] - f * q_1;
    return true;
  }

  double x[2];
  double y[2];
};
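To see why the templated version works with automatic differentiation, here is a minimal sketch using a hand-rolled dual number. `Dual` and `DistortionDerivative` are hypothetical names introduced for illustration; `Dual` is only a stand-in for `ceres::Jet`, which carries a whole vector of partial derivatives and many more operator overloads.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for ceres::Jet: a value plus a single derivative.
struct Dual {
  double v;  // function value
  double d;  // derivative with respect to the seeded input
};

// Only the operators needed by the distortion polynomial are defined.
Dual operator+(double a, const Dual& b) { return {a + b.v, b.d}; }
Dual operator+(const Dual& a, const Dual& b) { return {a.v + b.v, a.d + b.d}; }
Dual operator*(double a, const Dual& b) { return {a * b.v, a * b.d}; }
Dual operator*(const Dual& a, const Dual& b) {
  return {a.v * b.v, a.d * b.v + a.v * b.d};  // product rule
}

// Same function as in the article; it works for T = double and T = Dual.
template <typename T>
T TemplatedComputeDistortion(const T r2) {
  const double k1 = 0.0082;
  const double k2 = 0.000023;
  return 1.0 + k1 * r2 + k2 * r2 * r2;
}

// Returns df/dr2 at r2 by seeding the derivative part with 1.
double DistortionDerivative(double r2) {
  const Dual f = TemplatedComputeDistortion(Dual{r2, 1.0});
  return f.d;
}
```

Because every arithmetic operation on `Dual` also updates the derivative, evaluating the template once yields both $f$ and $df/dr_2$. A non-templated `double` function cannot be evaluated this way, which is the problem the following sections address.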

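Now consider three situations in which $f$ cannot be used directly with automatic differentiation:
1. $f$ is a non-templated function that evaluates its value.
2. $f$ is a non-templated function that evaluates its value and derivative.
3. $f$ is defined as a table of values to be interpolated.
We look at each of them in turn.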

A Non-Templated Function That Returns Its Value

Suppose we are given a function with the following declaration:

double ComputeDistortionValue(double r2);

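The actual implementation of the function does not matter. Interfacing it with Affine2DWithDistortion takes three steps:

1. Wrap ComputeDistortionValue in a functor, ComputeDistortionValueFunctor.
2. Numerically differentiate ComputeDistortionValueFunctor with NumericDiffCostFunction to create a CostFunction.
3. Wrap the resulting CostFunction with CostFunctionToFunctor. The result is a functor with a templated operator() method that pipes the Jacobian computed by NumericDiffCostFunction into the appropriate Jet objects.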

The code for these steps is as follows:

struct ComputeDistortionValueFunctor { // Step 1
  bool operator()(const double* r2, double* value) const {
    *value = ComputeDistortionValue(r2[0]);
    return true;
  }
};

struct Affine2DWithDistortion {
  Affine2DWithDistortion(const double x_in[2], const double y_in[2]) { // constructor: the wrapping is done during initialization
    x[0] = x_in[0];
    x[1] = x_in[1];
    y[0] = y_in[0];
    y[1] = y_in[1];

    compute_distortion.reset(new ceres::CostFunctionToFunctor<1, 1>(  // Step 3 (outer wrapper)
         new ceres::NumericDiffCostFunction<ComputeDistortionValueFunctor, // Step 2 (inner cost function)
                                            ceres::CENTRAL,
                                            1,
                                            1>(
            new ComputeDistortionValueFunctor)));
  }

  template <typename T>
  bool operator()(const T* theta, const T* t, T* residuals) const {
    const T q_0 = cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0];
    const T q_1 = sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1];
    const T r2 = q_0 * q_0 + q_1 * q_1;
    T f;
    (*compute_distortion)(&r2, &f); // the wrapped functor accepts T, i.e. double or Jet
    residuals[0] = y[0] - f * q_0;
    residuals[1] = y[1] - f * q_1;
    return true;
  }

  double x[2];
  double y[2];
  std::unique_ptr<ceres::CostFunctionToFunctor<1, 1> > compute_distortion; // declared as a member, initialized in the constructor
};
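Conceptually, the wrapper combines a numerically estimated derivative with the chain rule. The following sketch illustrates that idea without Ceres. `Dual`, `CentralDifference` and `DistortionOnDual` are hypothetical names, the body of `ComputeDistortionValue` is invented for the example (the article leaves it unspecified), and the fixed step `1e-4` is an assumption; Ceres picks its own step size and propagates a full Jacobian rather than a single derivative.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for ceres::Jet: a value plus a single derivative.
struct Dual {
  double v;
  double d;
};

// Hypothetical black-box distortion: double in, double out, no template.
double ComputeDistortionValue(double r2) {
  return 1.0 + 0.0082 * r2 + 0.000023 * r2 * r2;
}

// Central difference estimate of f'(x), the scheme that ceres::CENTRAL names.
double CentralDifference(double (*f)(double), double x, double h) {
  return (f(x + h) - f(x - h)) / (2.0 * h);
}

// Lifts the black-box function to Dual inputs with the chain rule:
// value = f(r2.v), derivative = f'(r2.v) * r2.d.
Dual DistortionOnDual(const Dual& r2) {
  const double value = ComputeDistortionValue(r2.v);
  const double slope = CentralDifference(ComputeDistortionValue, r2.v, 1e-4);
  return {value, slope * r2.d};
}
```

The derivative obtained this way is only as accurate as the finite difference, so this route trades some precision and extra function evaluations for the ability to reuse existing code unchanged.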

A Non-Templated Function That Returns Its Value and Derivative

Now suppose we are given a function ComputeDistortionValueAndJacobian that computes its value and, on demand, its Jacobian. Its declaration is:

void ComputeDistortionValueAndJacobian(double r2,
                                       double* value,
                                       double* jacobian);

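Again, the actual implementation of the function does not matter. Compared with the first case, the Jacobian is available directly, so a CostFunction can be built right away without first preparing a functor for numeric differentiation. Interfacing this function takes two steps:

1. Wrap ComputeDistortionValueAndJacobian in a CostFunction object, which we call ComputeDistortionFunction.
2. Wrap the resulting ComputeDistortionFunction object with CostFunctionToFunctor. The result is a functor with a templated operator() method that pipes the Jacobian computed by ComputeDistortionFunction into the appropriate Jet objects.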

The code is as follows:

class ComputeDistortionFunction : public ceres::SizedCostFunction<1, 1> { // Step 1
 public:
  virtual bool Evaluate(double const* const* parameters,
                        double* residuals,
                        double** jacobians) const {
    if (!jacobians) { // the Jacobian is not requested
      ComputeDistortionValueAndJacobian(parameters[0][0], residuals, nullptr);
    } else {    // the Jacobian is requested
      ComputeDistortionValueAndJacobian(parameters[0][0], residuals, jacobians[0]);
    }
    return true;
  }
};

struct Affine2DWithDistortion {
  Affine2DWithDistortion(const double x_in[2], const double y_in[2]) { // constructor: the wrapping is done during initialization
    x[0] = x_in[0];
    x[1] = x_in[1];
    y[0] = y_in[0];
    y[1] = y_in[1];
    compute_distortion.reset(  // Step 2
        new ceres::CostFunctionToFunctor<1, 1>(new ComputeDistortionFunction));
  }

  template <typename T>
  bool operator()(const T* theta,
                  const T* t,
                  T* residuals) const {
    const T q_0 =  cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0];
    const T q_1 =  sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1];
    const T r2 = q_0 * q_0 + q_1 * q_1;
    T f;
    (*compute_distortion)(&r2, &f); // the wrapped functor accepts T, i.e. double or Jet
    residuals[0] = y[0] - f * q_0;
    residuals[1] = y[1] - f * q_1;
    return true;
  }

  double x[2];
  double y[2];
  std::unique_ptr<ceres::CostFunctionToFunctor<1, 1> > compute_distortion; // declared as a member, initialized in the constructor
};
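When the Jacobian is written by hand, it is worth checking it against a finite difference before trusting it in the solver. The sketch below does this for one possible implementation of ComputeDistortionValueAndJacobian. The function body is an assumption made up for the example (the article leaves it unspecified), and `JacobianError` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical implementation; the original leaves the body unspecified.
// The jacobian pointer may be null when the derivative is not needed.
void ComputeDistortionValueAndJacobian(double r2,
                                       double* value,
                                       double* jacobian) {
  const double k1 = 0.0082;
  const double k2 = 0.000023;
  *value = 1.0 + k1 * r2 + k2 * r2 * r2;
  if (jacobian != nullptr) {
    *jacobian = k1 + 2.0 * k2 * r2;  // analytic derivative df/dr2
  }
}

// Returns |analytic derivative - central difference| at r2 with step h.
double JacobianError(double r2, double h) {
  double value = 0.0, analytic = 0.0, plus = 0.0, minus = 0.0;
  ComputeDistortionValueAndJacobian(r2, &value, &analytic);
  ComputeDistortionValueAndJacobian(r2 + h, &plus, nullptr);
  ComputeDistortionValueAndJacobian(r2 - h, &minus, nullptr);
  return std::fabs(analytic - (plus - minus) / (2.0 * h));
}
```

A large error from such a check usually points to a mistake in the hand-written derivative, which would otherwise show up only as poor convergence.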

A Function Defined as a Table of Values

The last case is one where $f$ is defined as a table of values on the interval $[0,100)$, with one value per integer. In essence it is just a vector.

std::vector<double> distortion_values;

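There are many ways to interpolate a table of values. Perhaps the simplest and most common is linear interpolation. It is not a good choice here, however, because the interpolating function is not differentiable at the sample points.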
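The kink at the sample points is easy to observe numerically. In this sketch `LinearInterpolate`, `LeftSlope` and `RightSlope` are hypothetical helpers written for illustration, and the table is assumed to be sampled at the integers 0, 1, 2, ... with at least two entries.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Linear interpolation of a table sampled at x = 0, 1, 2, ...
// Assumes 0 <= x <= table.size() - 1 and table.size() >= 2.
double LinearInterpolate(const std::vector<double>& table, double x) {
  std::size_t i = static_cast<std::size_t>(std::floor(x));
  if (i + 1 >= table.size()) i = table.size() - 2;  // x on the last sample
  const double w = x - static_cast<double>(i);
  return (1.0 - w) * table[i] + w * table[i + 1];
}

// One-sided slopes at x, estimated with step h.
double LeftSlope(const std::vector<double>& table, double x, double h) {
  return (LinearInterpolate(table, x) - LinearInterpolate(table, x - h)) / h;
}

double RightSlope(const std::vector<double>& table, double x, double h) {
  return (LinearInterpolate(table, x + h) - LinearInterpolate(table, x)) / h;
}
```

For a table such as {0, 1, 4, 9}, the slope just left of x = 1 is that of the segment from 0 to 1, while the slope just right of it is that of the segment from 1 to 4. The derivative jumps at the sample, which is what makes linear interpolation awkward for a derivative-based solver.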

A simple, well-behaved and differentiable alternative is the Cubic Hermite Spline. Ceres ships with routines for Cubic and Bi-Cubic interpolation that work readily with automatic differentiation.

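Using Cubic interpolation requires first constructing a Grid1D object to wrap the table of values, and then constructing a CubicInterpolator object that uses it. The code is as follows: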

struct Affine2DWithDistortion {
  Affine2DWithDistortion(const double x_in[2],
                         const double y_in[2],
                         const std::vector<double>& distortion_values) {
    x[0] = x_in[0];
    x[1] = x_in[1];
    y[0] = y_in[0];
    y[1] = y_in[1];

    grid.reset(new ceres::Grid1D<double, 1>(
        &distortion_values[0], 0, distortion_values.size()));
    compute_distortion.reset(
        new ceres::CubicInterpolator<ceres::Grid1D<double, 1> >(*grid));
  }

  template <typename T>
  bool operator()(const T* theta,
                  const T* t,
                  T* residuals) const {
    const T q_0 =  cos(theta[0]) * x[0] - sin(theta[0]) * x[1] + t[0];
    const T q_1 =  sin(theta[0]) * x[0] + cos(theta[0]) * x[1] + t[1];
    const T r2 = q_0 * q_0 + q_1 * q_1;
    T f;
    compute_distortion->Evaluate(r2, &f);
    residuals[0] = y[0] - f * q_0;
    residuals[1] = y[1] - f * q_1;
    return true;
  }

  double x[2];
  double y[2];
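  std::unique_ptr<ceres::Grid1D<double, 1> > grid; // declared as a member, initialized in the constructor
  std::unique_ptr<ceres::CubicInterpolator<ceres::Grid1D<double, 1> > > compute_distortion; // declared as a member, initialized in the constructor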
};

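In the example above we used Grid1D and CubicInterpolator to interpolate a one-dimensional table of values. Grid2D combined with BiCubicInterpolator can be used to interpolate two-dimensional tables. Note that neither Grid1D nor Grid2D is limited to scalar-valued functions; they also work with vector-valued functions.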
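The idea behind cubic interpolation of a table can be sketched without Ceres. The code below uses the Catmull-Rom form of the cubic Hermite spline. `CatmullRom` and `CubicInterpolate` are hypothetical helpers written for illustration, not the Ceres implementation, and clamping out-of-range indices to the nearest sample is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Catmull-Rom (cubic Hermite) interpolation between p1 and p2 for w in [0, 1].
// The neighbours p0 and p3 set the end slopes (p2 - p0) / 2 and (p3 - p1) / 2.
double CatmullRom(double p0, double p1, double p2, double p3, double w) {
  const double a = 0.5 * (-p0 + 3.0 * p1 - 3.0 * p2 + p3);
  const double b = 0.5 * (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3);
  const double c = 0.5 * (-p0 + p2);
  const double d = p1;
  return d + w * (c + w * (b + w * a));  // Horner form of a*w^3 + b*w^2 + c*w + d
}

// Interpolates a table sampled at x = 0, 1, 2, ...
// Indices outside the table are clamped to the nearest valid sample.
double CubicInterpolate(const std::vector<double>& table, double x) {
  const long n = static_cast<long>(table.size());
  const long i = static_cast<long>(std::floor(x));
  auto at = [&](long k) {
    return table[static_cast<std::size_t>(std::max(0L, std::min(n - 1, k)))];
  };
  return CatmullRom(at(i - 1), at(i), at(i + 1), at(i + 2),
                    x - static_cast<double>(i));
}
```

Adjacent segments share both the value and the slope at each sample, so the interpolant has a continuous first derivative, unlike linear interpolation.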