Nonlinear optimization and BA summary in SLAM

1. Nonlinear least squares problem

  Consider the simple problem first: $\underset{x}{\min}\ \frac{1}{2}\|f(x)\|_{2}^{2}$.

  1. When $f$ is very simple: set $\frac{df}{dx}=0$, solve for the extreme points or saddle points, and just compare these solutions.
  2. When $f$ is complex (e.g. $f$ is an n-ary function): $\frac{df}{dx}$ is hard to obtain, or $\frac{df}{dx}=0$ is difficult to solve, so an iterative method is used.

  The way to iterate is:

  1. Given some initial value $x_{0}$.
  2. For the $k$-th iteration, find an increment $\Delta x_{k}$ such that $\|f(x_{k}+\Delta x_{k})\|_{2}^{2}$ reaches a minimum.
  3. If $\Delta x_{k}$ is small enough, stop.
  4. Otherwise, let $x_{k+1}=x_{k}+\Delta x_{k}$ and return to step 2.

  Here we need to determine how the increment is computed (i.e. the gradient-descent strategy): first order or second order. First, Taylor-expand the objective:
$$\|f(x_{k}+\Delta x_{k})\|_{2}^{2} \approx \|f(x_{k})\|_{2}^{2} +J(x_{k})\Delta x_{k}+ \frac{1}{2} \Delta x_{k}^{T}H\Delta x_{k}$$
  If only the first-order term is kept, $\underset{\Delta x}{\min}\ \|f(x)\|_{2}^{2} +J\Delta x$, the direction of the increment is $\Delta x^{\ast} =-J^{T}(x)$ (usually a step size also needs to be determined). This method is called steepest descent.
  If the second-order term is kept as well, $\Delta x^{\ast} =\arg\min \|f(x)\|_{2}^{2} +J(x)\Delta x+\frac{1}{2} \Delta x^{T}H\Delta x$; setting the derivative of this expression with respect to $\Delta x$ to zero gives $H \Delta x=-J^{T}$. This method is called Newton's method.
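  To make the two strategies concrete, here is a minimal sketch (not from the original post) that applies one-dimensional steepest-descent and Newton steps to a made-up least-squares objective; the function, step size and stopping threshold are all illustrative choices:

```python
import numpy as np

# Toy objective: F(x) = 1/2*(exp(x) - 2)^2, minimised at x = ln(2).
f = lambda x: np.exp(x) - 2.0                      # residual f(x)
J = lambda x: np.exp(x)                            # df/dx
grad = lambda x: J(x) * f(x)                       # dF/dx = J^T f
hess = lambda x: J(x) * J(x) + f(x) * np.exp(x)    # full second derivative of F

def steepest_descent(x, step=0.1, iters=100):
    """Keep only the first-order term: dx = -step * gradient."""
    for _ in range(iters):
        dx = -step * grad(x)
        if abs(dx) < 1e-10:
            break
        x += dx
    return x

def newton(x, iters=100):
    """Keep the second-order term as well: solve H * dx = -gradient."""
    for _ in range(iters):
        dx = -grad(x) / hess(x)
        if abs(dx) < 1e-10:
            break
        x += dx
    return x

print(steepest_descent(1.0), newton(1.0), np.log(2.0))
```

  On this toy problem Newton's method reaches $x=\ln 2$ in a handful of iterations, while steepest descent with a fixed step needs a few dozen.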

  Although the steepest descent method and Newton's method are intuitive, there are some disadvantages in their use:

  1. The steepest descent method may be too greedy, which increases the number of iterations.
  2. Newton's method needs few iterations, but it requires computing the complicated Hessian matrix.

  Therefore, the calculation of the Hessian can be circumvented by Gauss-Newton and Levenberg-Marquardt.

2. Understanding Gauss-Newton, Levenberg-Marquardt and other descent strategies

Gauss-Newton

First-order approximation of $f(x)$:
$$f(x+\Delta x)\approx f(x)+J(x)\Delta x$$
The squared error becomes:
$$\frac{1}{2} \|f(x)+J(x)\Delta x\|^{2}=\frac{1}{2}\big(f(x)+J(x)\Delta x\big)^{T}\big(f(x)+J(x)\Delta x\big) =\frac{1}{2}\big(\|f(x)\|_{2}^{2} +2f(x)^{T}J(x)\Delta x+\Delta x^{T}J(x)^{T}J(x)\Delta x\big)$$
Setting its derivative with respect to $\Delta x$ to zero:
$$2J(x)^{T}f(x)+2J(x)^{T}J(x)\Delta x=0$$
$$J(x)^{T}J(x)\Delta x=-J(x)^{T}f(x)$$
This is written as $H \Delta x=g$, with $H=J^{T}J$ and $g=-J^{T}f$.
  GN thus approximates the Hessian $H$ with an expression built only from the Jacobian $J$.
Proceed as follows:

  1. Given an initial value $x_{0}$.
  2. For the $k$-th iteration, compute the current Jacobian matrix $J(x_{k})$ and the error $f(x_{k})$.
  3. Solve the increment equation: $H \Delta x_{k}=g$.
  4. If $\Delta x_{k}$ is small enough, stop. Otherwise, let $x_{k+1}=x_{k}+ \Delta x_{k}$ and return to step 2.
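
  As a concrete illustration (a hypothetical curve-fitting problem, not from the original post), the sketch below runs exactly these four steps to fit $y=\exp(ax+b)$ to noisy data; note that plain Gauss-Newton needs a reasonable initial value:

```python
import numpy as np

# Hypothetical example: fit y = exp(a*x + b) to noisy samples with Gauss-Newton.
rng = np.random.default_rng(0)
a_true, b_true = 1.0, 2.0
xs = np.linspace(0.0, 1.0, 100)
ys = np.exp(a_true * xs + b_true) + rng.normal(0.0, 0.1, xs.size)

def residual(p):                   # f(x): one residual per data point
    return ys - np.exp(p[0] * xs + p[1])

def jacobian(p):                   # J(x): d residual / d (a, b)
    e = np.exp(p[0] * xs + p[1])
    return np.column_stack([-xs * e, -e])

p = np.array([0.8, 1.8])           # step 1: initial value x0, reasonably close to the truth
for k in range(20):
    J = jacobian(p)                # step 2: current Jacobian and error
    f = residual(p)
    H = J.T @ J                    # GN approximation of the Hessian
    g = -J.T @ f
    dx = np.linalg.solve(H, g)     # step 3: solve H * dx = g
    p = p + dx
    if np.linalg.norm(dx) < 1e-8:  # step 4: stop when the increment is tiny
        break

print(p)                           # close to (a_true, b_true) = (1.0, 2.0)
```

  The line `H = J.T @ J` is exactly the point of Gauss-Newton: the Hessian is never computed, only the Jacobian.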

Levenberg-Marquardt

  Gauss-Newton is simple and practical, but $\Delta x_{k}=H^{-1}g$ requires $H=J^{T}J$ to be invertible, which cannot be guaranteed (and the quadratic approximation is not always reliable).
  The Levenberg-Marquardt method improves on this to some extent.
  GN belongs to the line-search methods: first find a direction, then determine the step length. LM belongs to the trust-region methods, which consider the approximation reliable only within a region.
  To describe the quality of the approximation, LM uses
$$\rho =\frac{f(x+\Delta x)-f(x)}{J(x)\Delta x}$$
i.e. the actual decrease divided by the approximate (predicted) decrease. If $\rho$ is too small, the approximation is poor and the region is reduced; if it is large, the region is enlarged.

  The process of LM is as follows:

  1. Given an initial value $x_{0}$ and an initial trust-region radius $\mu$.
  2. For the $k$-th iteration, solve $\underset{\Delta x_{k}}{\min}\ \frac{1}{2} \|f(x_{k})+J(x_{k}) \Delta x_{k}\|^{2},\ \ s.t.\ \|D\Delta x_{k}\|^{2}\le \mu$, where $\mu$ is the radius of the trust region.
  3. Compute $\rho$.
  4. If $\rho>\frac{3}{4}$, set $\mu = 2\mu$.
  5. If $\rho<\frac{1}{4}$, set $\mu = 0.5\mu$.
  6. If $\rho$ is greater than a certain threshold, the approximation is considered acceptable; let $x_{k+1}=x_{k}+\Delta x_{k}$.
  7. Check whether the algorithm has converged. If not, return to step 2; otherwise stop.

  For the constrained optimization inside the trust region, use a Lagrange multiplier to turn it into an unconstrained problem:
$$\underset{\Delta x_{k}}{\min}\ \frac{1}{2} \|f(x_{k})+J(x_{k}) \Delta x_{k}\|^{2}+\frac{\lambda }{2} \|D\Delta x_{k}\|^{2}$$
Expanding as in the Gauss-Newton method, the increment equation becomes:
$$(H+\lambda D^{T}D)\Delta x=g$$
In the Levenberg method, take $D=I$, so:
$$(H+\lambda I)\Delta x=g$$
  Compared with GN, LM can guarantee the positive definiteness of the coefficient matrix of the increment equation; it treats the approximation as valid only within a certain range and shrinks that range when the approximation is poor. From the form of the increment equation, LM can be seen as a blend of the first-order and second-order methods, with the parameter $\lambda$ controlling the weight between the two: if $\lambda$ is close to 0, the equation reduces to $H\Delta x=g$, i.e. the second-order (Gauss-)Newton step; if $\lambda$ is very large, it approaches the first-order steepest-descent step.
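
  Below is a minimal Levenberg-style sketch (again a hypothetical example, not from the original post) with $D=I$, using the $3/4$ and $1/4$ rules above applied to the damping $\lambda$, which behaves inversely to the trust-region radius $\mu$:

```python
import numpy as np

# Minimal Levenberg-style sketch with D = I: solve (H + lambda*I) dx = g.
# Same hypothetical curve-fitting problem as in the Gauss-Newton sketch.
rng = np.random.default_rng(0)
xs = np.linspace(0.0, 1.0, 100)
ys = np.exp(1.0 * xs + 2.0) + rng.normal(0.0, 0.1, xs.size)

def residual(p):
    return ys - np.exp(p[0] * xs + p[1])

def jacobian(p):
    e = np.exp(p[0] * xs + p[1])
    return np.column_stack([-xs * e, -e])

p = np.array([0.0, 0.0])           # a worse initial guess than plain GN would tolerate
lam = 1e-3                         # damping: large lam ~ steepest descent, small lam ~ GN
for k in range(100):
    f = residual(p)
    J = jacobian(p)
    H, g = J.T @ J, -J.T @ f
    dx = np.linalg.solve(H + lam * np.eye(2), g)
    # rho = actual decrease of the cost / decrease predicted by the linearised model
    pred = 0.5 * dx @ (lam * dx + g)
    f_new = residual(p + dx)
    rho = (0.5 * (f @ f - f_new @ f_new)) / pred
    if rho > 0.75:                 # very good approximation: trust it more
        lam *= 0.5
    elif rho < 0.25:               # poor approximation: shrink the trusted range
        lam *= 2.0
    if rho > 0:                    # step actually reduced the cost: accept it
        p = p + dx
        if np.linalg.norm(dx) < 1e-8:
            break

print(p)                           # close to (1.0, 2.0)
```

  When $\lambda$ is small the update is essentially the Gauss-Newton step; when the linearisation is rejected repeatedly, $\lambda$ grows and the step shrinks toward a short steepest-descent step.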

3. BA

  First, what is the error and how is it expressed? The variables to be optimized in BA are the camera poses and the landmark points. How do we take the derivative of the error function with respect to the poses and landmarks? What is the Lie-algebra perturbation model? What is the Jacobian matrix, and what is its specific form in BA?
  Rotation matrix group and transformation matrix group:
$$SO(3)=\left\{R\in \mathbb{R}^{3\times 3}\mid RR^{T}=I,\ \det(R)=1\right\}$$
$$SE(3)=\left\{ T=\begin{bmatrix} R & t\\ 0^{T} &1 \end{bmatrix} \in \mathbb{R}^{4\times 4} \mid R\in SO(3),\ t\in \mathbb{R}^{3}\right\}$$
A group with a continuous, smooth structure is called a Lie group. The problem: these groups are not closed under addition, so we cannot differentiate with respect to them directly.
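  For reference, the standard background (likely what the original figure here showed): each Lie group has a Lie algebra in which addition and differentiation do make sense, connected to the group by the exponential map. For $SO(3)$:
$$\phi \in \mathbb{R}^{3},\qquad \phi^{\wedge}=\begin{bmatrix} 0 & -\phi_{3} & \phi_{2}\\ \phi_{3} & 0 & -\phi_{1}\\ -\phi_{2} & \phi_{1} & 0 \end{bmatrix},\qquad R=\exp(\phi^{\wedge})=\cos\theta\, I+(1-\cos\theta)\,aa^{T}+\sin\theta\, a^{\wedge}$$
with $\theta=\|\phi\|$ and $a=\phi/\theta$ (Rodrigues' formula); similarly $T=\exp(\xi^{\wedge})$ for $\xi\in\mathfrak{se}(3)$.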
  There are two ways of taking derivatives with respect to the rotation:

  1. Add a small increment to the Lie algebra corresponding to $R$, and compute the rate of change with respect to that increment (the derivative model);
  2. Multiply $R$ by a small perturbation on the left or right, and compute the rate of change with respect to the Lie algebra of that perturbation (the perturbation model).
    In the perturbation model (left multiplication), a small perturbation is multiplied on the left of $R$ and the derivative is taken with respect to the perturbation's Lie algebra as it goes to zero (figure omitted).
    The corresponding model for a pose transformation on SE(3) is obtained the same way (figures omitted).
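    The standard left-perturbation results from the visual-SLAM literature (most likely what the omitted figures showed; the convention here puts the translation part of $\delta\xi$ first) are:
$$\frac{\partial (Rp)}{\partial \varphi}=-(Rp)^{\wedge},\qquad \frac{\partial (Tp)}{\partial \delta\xi}=\begin{bmatrix} I & -(Rp+t)^{\wedge}\\ 0^{T} & 0^{T} \end{bmatrix}\overset{\mathrm{def}}{=}(Tp)^{\odot}$$
    where $\varphi$ is the Lie algebra of the left perturbation on $R$, and $\delta\xi$ that of the left perturbation on $T$.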

4. Graph optimization and g2o

Reprojection error: in matrix form the observation model is
$$s_{i}u_{i}=K\exp(\xi ^{\wedge})P_{i}$$
Constructing a least-squares problem on the error function:
$$\xi^{*}=\arg\underset{\xi }{\min}\ \frac{1}{2}\sum_{i=1}^{n}\left\|u_{i}-\frac{1}{s_{i}}K\exp(\xi ^{\wedge})P_{i} \right\|_{2}^{2}$$
Write this error function as $e(x)$, with first-order expansion $e(x+\Delta x)=e(x)+J(x)\Delta x$. The camera model is
$$u=f_{x}\frac{X'}{Z'}+c_{x},\qquad v=f_{y}\frac{Y'}{Z'}+c_{y}$$
where $P'=[X',Y',Z']^{T}=\big(\exp(\xi^{\wedge})P\big)_{1:3}$ is the landmark transformed into the camera frame.
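A small numerical sketch (made-up intrinsics, pose and landmark, not from the original post) of how one term of this error is evaluated:

```python
import numpy as np

# Compute the reprojection error e = u_obs - (1/s) * K * (R*P + t) for one landmark.
fx, fy, cx, cy = 520.9, 521.0, 325.1, 249.7        # example intrinsics
K = np.array([[fx, 0, cx], [0, fy, cy], [0, 0, 1.0]])

R = np.eye(3)                      # camera rotation (world -> camera)
t = np.array([0.1, 0.0, 0.0])      # camera translation
P = np.array([1.0, 2.0, 5.0])      # landmark in world coordinates
u_obs = np.array([430.0, 458.0])   # measured pixel coordinates

Pc = R @ P + t                     # P' = [X', Y', Z']: point in the camera frame
s = Pc[2]                          # depth s = Z'
proj = (K @ Pc) / s                # (u, v, 1) after dividing by the depth
e = u_obs - proj[:2]               # reprojection error, a 2-vector
print(e)
```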
The Jacobian of $e$ with respect to the pose is derived with the perturbation model, using the chain rule through $P'$: one factor is $\frac{\partial e}{\partial P'}$ (from the camera model above), the other is $\frac{\partial P'}{\partial \delta\xi}$ (from the perturbation model of the pose transformation); the derivation figures are omitted here. Multiplying the two derivatives gives the Jacobian matrix of the reprojection error with respect to the camera pose, which guides the direction of the iteration during optimization so that the camera pose can be optimized.
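The standard form of this $2\times 6$ Jacobian (as given in the visual-SLAM literature, with the translation part of $\delta\xi$ ordered first; most likely what the omitted figure showed) is:
$$\frac{\partial e}{\partial \delta\xi}=-\begin{bmatrix} \frac{f_{x}}{Z'} & 0 & -\frac{f_{x}X'}{Z'^{2}} & -\frac{f_{x}X'Y'}{Z'^{2}} & f_{x}+\frac{f_{x}X'^{2}}{Z'^{2}} & -\frac{f_{x}Y'}{Z'}\\ 0 & \frac{f_{y}}{Z'} & -\frac{f_{y}Y'}{Z'^{2}} & -f_{y}-\frac{f_{y}Y'^{2}}{Z'^{2}} & \frac{f_{y}X'Y'}{Z'^{2}} & \frac{f_{y}X'}{Z'} \end{bmatrix}$$
Its left $2\times 3$ block is $\frac{\partial e}{\partial P'}$, and the right block comes from multiplying by $\frac{\partial P'}{\partial \delta\xi}=\begin{bmatrix} I & -P'^{\wedge}\end{bmatrix}$.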

Optimizing the spatial positions of the feature points: the reprojection error is differentiated with respect to the landmark $P$ instead of the pose (derivation figures omitted). The resulting matrix is the Jacobian used when optimizing the spatial position of a feature point, and it likewise guides the direction of the iteration.
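The standard form of this Jacobian (since $P'=RP+t$, we have $\frac{\partial P'}{\partial P}=R$; most likely what the omitted figure showed) is:
$$\frac{\partial e}{\partial P}=\frac{\partial e}{\partial P'}\frac{\partial P'}{\partial P}=-\begin{bmatrix} \frac{f_{x}}{Z'} & 0 & -\frac{f_{x}X'}{Z'^{2}}\\ 0 & \frac{f_{y}}{Z'} & -\frac{f_{y}Y'}{Z'^{2}} \end{bmatrix} R$$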


Origin: blog.csdn.net/j000007/article/details/125132556