Lie Groups and Lie Algebras for Representing the Motion of a Moving Scene in SLAM

Representing a moving scene.

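Prerequisites: a skew matrix lives in the Lie algebra $so(3)$ and corresponds to a rotation matrix $R$ in the Lie group $SO(3)$; a twist matrix lives in the Lie algebra $se(3)$ and corresponds to a transformation matrix $T$ in the Lie group $SE(3)$. Readers should also be familiar with the basics, such as how a skew matrix is written.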

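This post introduces the relationship between Lie groups and Lie algebras, and shows how to use them to build rotation/transformation matrices that are convenient for representing moving scenes and objects.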

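The core idea: the traditional rotation matrix $R$ and transformation matrix $T$ are complicated to construct directly in Euclidean space. Rotation takes 9 parameters and a full transformation takes 12 (9 for rotation plus 3 for translation). Moreover, in Euclidean space the transformation matrix that describes the camera motion is not treated as a function of time: the matrix for one motion has no temporal link to the matrix for the previous one, that is, no differential equation connects them.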

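Therefore we use the Lie algebra and work in the tangent space, constructing a skew matrix $w^{\wedge}$ or a twist matrix $\xi^{\wedge}$. These need very few parameters: 3 for the skew matrix and 6 for the twist matrix. The exponential map takes them back to the Lie group and yields the rotation/transformation matrix. Both the computational complexity and the number of parameters drop noticeably.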

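In addition, in the Lie group a change of pose under camera motion is expressed by matrix multiplication, $X_1=TX_0$. In the Lie algebra, the skew matrix and the twist matrix express the relative velocity of the transformation, i.e. the derivative of the current 3D coordinates with respect to time, which is the rate of change of the pose (of the camera, or of a target point in the world frame). Over an infinitesimal time interval any transformation is a tiny rotation/translation, so velocity and pose are easy to compute.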

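Below we introduce the Lie algebra, the exponential map, the logarithm map, rigid-body rotation, and rigid-body transformation (rotation plus translation). Then, given reference coordinates $X_0$ and a camera that moves while filming the scene, we show how the Lie algebra represents the motion of the scene, i.e. how the reference point changes as the camera moves.
(The concrete pipeline is not covered here: feature extraction, feature matching, the ORB algorithm, establishing correspondences, solving for the transformation matrix, and so on.)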

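Compared with the previous post, "Robotics and Vision: Lie Groups and Lie Algebras, Analysis and Proof of the Lie Bracket Properties", this post describes Lie groups and Lie algebras, and their application to representing camera motion, in words. I find that this approach gives a more intuitive understanding of the principles of Lie groups and Lie algebras, and of why this representation is chosen to simplify robot and camera pose estimation.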

1. Lie group and Lie algebra.

With the Lie algebra we do not need to construct a complicated transformation matrix explicitly from a $3\times3$ rotation matrix $R$ satisfying $RR^T=I$ and a $3\times1$ translation vector. We only need 6 parameters, $v$ and $w$, which are arranged into a $4\times4$ twist matrix.

We work in the tangent space and model each element of the Lie group (rotation, transformation) by its corresponding element in the tangent space.

1.1 $so(3)\to SO(3)$, only for rotation.

1.1.1 The exponential map.

The skew matrix $w^{\wedge}$, $w\in\mathbb{R}^3$, in the Lie algebra $so(3)$ corresponds to the rotation matrix $R\in\mathbb{R}^{3\times3}$ in the Lie group $SO(3)$. Applying the exponential map to $w^{\wedge}$ gives $R$, which can be computed with Rodrigues' formula:
$$\exp: so(3)\to SO(3); \quad w^{\wedge}\mapsto e^{w^{\wedge}}.$$
So, through the exponential map, we have:
$$R=e^{w^{\wedge}}=I+\sin(\theta)\,n^{\wedge}+(1-\cos\theta)\,n^{\wedge}n^{\wedge},$$
where $w=\theta n$, with rotation angle $\theta=\Vert w\Vert$ and unit rotation axis $n$.
By introducing the Lie algebra, we don't need to explicitly construct a rotation matrix $R$ subject to so many constraints. The 9 entries of $R$ are represented by the 3 parameters in $w$; applying the exponential map to its skew matrix gives $R$. The constraints are:
$$R^TR=I,\quad \det(R)=1;\quad r_1\cdot r_2=r_2\cdot r_3=r_1\cdot r_3=0,$$
where $r_i$ are the columns of $R$: each column is orthogonal to the others.
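The exponential map above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names `hat`, `matmul` and `exp_so3` are made up for this example, matrices are nested lists, and $w=\theta n$ with $\theta=\Vert w\Vert$.

```python
import math

def hat(w):
    """Map a 3-vector w to its 3x3 skew matrix w^ (nested lists)."""
    x, y, z = w
    return [[0.0, -z, y],
            [z, 0.0, -x],
            [-y, x, 0.0]]

def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def exp_so3(w):
    """Rodrigues' formula: R = I + sin(theta) n^ + (1 - cos(theta)) n^ n^."""
    theta = math.sqrt(sum(c * c for c in w))
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    if theta < 1e-12:
        return eye  # negligible rotation, exp(0) = I
    n_hat = hat([c / theta for c in w])
    n_hat2 = matmul(n_hat, n_hat)
    sin_t = math.sin(theta)
    one_minus_cos = 1.0 - math.cos(theta)
    return [[eye[i][j] + sin_t * n_hat[i][j] + one_minus_cos * n_hat2[i][j]
             for j in range(3)] for i in range(3)]

# Assumed example: rotate by 90 degrees about the z-axis.
R = exp_so3([0.0, 0.0, math.pi / 2])
# R is approximately [[0, -1, 0], [1, 0, 0], [0, 0, 1]].
```

Only the three numbers in `w` are chosen freely; the orthogonality and determinant constraints on `R` come out of the formula automatically.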

1.1.2 The Logarithm map.

One can also use the logarithm map on $SO(3)$ to map a rotation matrix into $so(3)$ and recover $w$. Typically $w$ is written in axis-angle form, $w=\theta\boldsymbol n$, where $\theta$ is the rotation angle and $\boldsymbol n$ is the rotation axis, i.e. a rotation about the axis $\boldsymbol n$ by the angle $\theta$ (with $\Vert \boldsymbol n\Vert=1$):
$$\theta=\arccos\left(\frac{\mathrm{trace}(R)-1}{2}\right),$$

$$\boldsymbol n=\frac{1}{2\sin(\theta)}\left(\begin{matrix} r_{32}-r_{23}\\ r_{13}-r_{31}\\ r_{21}-r_{12}\\ \end{matrix}\right).$$
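A minimal sketch of the logarithm map, following the two formulas above. The function name `log_so3` is hypothetical, and the sketch assumes $0\le\theta<\pi$: the case $\theta=\pi$, where $\sin(\theta)=0$, needs separate handling that is not shown here.

```python
import math

def log_so3(R):
    """Return w = theta * n for a rotation matrix R (nested 3x3 list).

    Assumes 0 <= theta < pi; theta = pi is not covered by this sketch.
    """
    trace = R[0][0] + R[1][1] + R[2][2]
    # Clamp so rounding error cannot push acos outside its domain [-1, 1].
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    theta = math.acos(cos_theta)
    if theta < 1e-12:
        return [0.0, 0.0, 0.0]  # no rotation, the axis is arbitrary
    k = 1.0 / (2.0 * math.sin(theta))
    n = [k * (R[2][1] - R[1][2]),   # r32 - r23
         k * (R[0][2] - R[2][0]),   # r13 - r31
         k * (R[1][0] - R[0][1])]   # r21 - r12
    return [theta * c for c in n]

# Assumed example: a 90-degree rotation about the z-axis.
R = [[0.0, -1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
w = log_so3(R)
# w is approximately [0, 0, pi/2].
```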

1.2 $se(3)\to SE(3)$, for rigid-body motion.

The motion of a rigid body is determined by specifying a translation $T$ of a given point and a rotation matrix $R$ that rotates the coordinate frame at that point.

The $4\times 4$ matrix $\xi^{\wedge}$ is a twist. (Rotating and translating the rigid body at the same time produces a twisting motion; the figure below illustrates the effect.) The twist lies in the Lie algebra $se(3)$, the tangent space at the origin (identity) of the Lie group $SE(3)$.

[Figure: The effect of a twist. A twist combines rotation and translation, so the resulting trajectory is twisted.]

A twist can be written as a twist matrix $\xi^{\wedge}\in se(3)$ or as its twist coordinates $\xi\in \mathbb{R}^6$, where $w^{\wedge}$ is the skew matrix of $w\in\mathbb{R}^3$ and $v\in\mathbb{R}^3$ is a 3D vector:
$$\xi^{\wedge}=\left(\begin{matrix} v\\ w\\ \end{matrix}\right)^{\wedge}= \left(\begin{matrix} w^{\wedge} & v\\ 0 & 0\\ \end{matrix}\right)\in \mathbb{R}^{4\times4},$$

$$\xi =\left(\begin{matrix} w^{\wedge} & v\\ 0 & 0\\ \end{matrix}\right)^{\vee}= \left(\begin{matrix} v\\ w\\ \end{matrix}\right)\in \mathbb{R}^6.$$

The exponential map takes the twist matrix $\xi^{\wedge}$ from $se(3)$ to $SE(3)$:
$$\exp: se(3)\to SE(3); \quad \xi^{\wedge}\mapsto e^{\xi^{\wedge}},$$

$$g=(R,T)=\exp(\xi^{\wedge}).$$

The 12 parameters of the transformation matrix $g$ are now represented by only 6, corresponding to 3 rotational and 3 translational degrees of freedom.

Given $g=(R,T)\in SE(3)$, there exist (not necessarily unique) twist coordinates $\xi=(v,w)\in \mathbb{R}^6$ such that $g=\exp(\xi^{\wedge})$.

Proof: $w$ is obtained from the rotation matrix via $e^{w^{\wedge}}=R$, exactly as in the $SO(3)\to so(3)$ case. Once $w$ is known, the velocity vector $v\in\mathbb{R}^3$ is found by solving the equation:
$$\frac{(I-e^{w^{\wedge}})w^{\wedge}v+ww^Tv}{\vert{w}\vert^2}=T.$$
As described above, given the twist matrix $\xi^{\wedge}$ with $w\neq 0$, we can compute the transformation matrix directly (via the exponential map, using Rodrigues' formula for the rotation part):
$$g(t)=e^{\xi^{\wedge}}=\left(\begin{matrix} e^{w^{\wedge}} & \frac{(I-e^{w^{\wedge}})w^{\wedge}v+ww^Tv}{\vert{w}\vert^2}\\ 0&1 \end{matrix}\right)= \left( \begin{matrix} R & T\\ 0&1 \end{matrix} \right).$$
Rodrigues' formula gives $R$; after that $T$ follows, since $w$ and $v$ are known.
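The closed form for $e^{\xi^{\wedge}}$ can be checked numerically with a short sketch. The helper names (`hat`, `matvec`, `exp_so3`, `exp_se3`) are assumptions made for this example, and the case $w=0$ is treated as a pure translation $T=v$, which the formula above does not cover.

```python
import math

def hat(w):
    """3x3 skew matrix w^ of a 3-vector w."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matvec(a, x):
    """Matrix-vector product for nested-list matrices."""
    return [sum(row[k] * x[k] for k in range(len(x))) for row in a]

def exp_so3(w):
    """R = I + (sin t / t) w^ + ((1 - cos t) / t^2) w^ w^, with t = |w|."""
    theta = math.sqrt(sum(c * c for c in w))
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    if theta < 1e-12:
        return eye
    W = hat(w)
    W2 = [[sum(W[i][k] * W[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    a = math.sin(theta) / theta
    b = (1.0 - math.cos(theta)) / theta ** 2
    return [[eye[i][j] + a * W[i][j] + b * W2[i][j] for j in range(3)]
            for i in range(3)]

def exp_se3(v, w):
    """Return the 4x4 matrix g = exp(xi^) for twist coordinates (v, w)."""
    R = exp_so3(w)
    norm2 = sum(c * c for c in w)
    if norm2 < 1e-24:
        T = list(v)  # assumed special case: no rotation, pure translation
    else:
        wv = matvec(hat(w), v)                          # w^ v
        Rwv = matvec(R, wv)
        wtv = sum(wi * vi for wi, vi in zip(w, v))      # w^T v
        # T = ((I - R) w^ v + w w^T v) / |w|^2
        T = [(wv[i] - Rwv[i] + w[i] * wtv) / norm2 for i in range(3)]
    return [R[i] + [T[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

# Assumed example: unit velocity along x, 90-degree rotation about z.
g = exp_se3([1.0, 0.0, 0.0], [0.0, 0.0, math.pi / 2])
# The translation part is approximately (2/pi, 2/pi, 0).
```

Note that $T$ is not simply $v$ once rotation is present: the point moves along a curved path, which is the twist effect described earlier.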

2. The motion of the camera.

When a scene is observed from a moving camera, the coordinates and velocity of a point in the camera frame change over time. We use a rigid-body transformation $g(t)$ to represent the motion from a fixed world frame to the moving camera frame at time $t$:
$$X(t)=g(t)X_0,$$
where $X_0$ is the point in world coordinates. This transformation models how the point's coordinates change over time $t$.

Why use Lie groups and Lie algebras? With the traditional method, every time the camera moves we must compute a new rotation and translation to build the new transformation $T$. Handled this way, each camera pose is an independent element of $SE(3)$: successive matrices cannot simply be added, and nothing expresses how $T$ changes with time as a derivative. Yet at the infinitesimal scale camera motion is continuous: the camera rotates and translates by infinitesimal amounts. We can describe this with a velocity; to first order, starting from $g(0)=I$, $g(dt)\approx I+\dot{g}\,dt$.

2.1 Concatenation of motions over frames.

The transformation that maps points in the frame at $t_1$ to points in the frame at $t_2$ is $g(t_2,t_1)$:
$$X(t_2)=g(t_2,t_1)X(t_1).$$
So, at $t_3$, we have:
$$g(t_3,t_1)=g(t_3,t_2)g(t_2,t_1).$$
Transferring the coordinates from $t_1$ to $t_2$ and back gives:
$$X(t_1)=g(t_1,t_2)X(t_2)=g(t_1,t_2)g(t_2,t_1)X(t_1).$$
Thus, we have:
$$g(t_1,t_2)g(t_2,t_1)=I \Leftrightarrow g(t_1,t_2)=g(t_2,t_1)^{-1}.$$
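The composition and inverse rules can be illustrated with two hand-picked transformations. This is a sketch with assumed values; `matmul` and `inverse_se3` are names invented for the example, and the inverse uses the standard closed form $g^{-1}=\left(\begin{matrix} R^T & -R^TT\\ 0 & 1\end{matrix}\right)$.

```python
def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_se3(g):
    """Invert g = [[R, T], [0, 1]] as [[R^T, -R^T T], [0, 1]]."""
    Rt = [[g[j][i] for j in range(3)] for i in range(3)]
    T = [g[i][3] for i in range(3)]
    t_inv = [-sum(Rt[i][k] * T[k] for k in range(3)) for i in range(3)]
    return [Rt[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

# Assumed g(t2, t1): rotate 90 degrees about z, then translate by (1, 0, 0).
g21 = [[0.0, -1.0, 0.0, 1.0],
       [1.0, 0.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]
# Assumed g(t3, t2): pure translation by (0, 2, 0).
g32 = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 2.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

g31 = matmul(g32, g21)          # g(t3, t1) = g(t3, t2) g(t2, t1)
g12 = inverse_se3(g21)          # g(t1, t2) = g(t2, t1)^{-1}
round_trip = matmul(g12, g21)   # should be the 4x4 identity
```

Here `g31` keeps the rotation of `g21` and has translation $(1, 2, 0)$, and `round_trip` is the identity up to rounding.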

2.2 Rules of velocity transformation.

At time $t$, $X(t)=g(t)X_0$. Its rate of change over time is the velocity:
$$\dot{X}(t)=\dot{g}(t)X_0=\dot{g}(t)g^{-1}(t)X(t).$$
By introducing the twist coordinates:
$$V^{\wedge}(t)=\dot{g}(t)g^{-1}(t)=\left(\begin{matrix} w^{\wedge}(t) & v(t)\\ 0 & 0\\ \end{matrix}\right)\in se(3).$$
So, we have:
$$\dot{X}(t)=V^{\wedge}(t)X(t),$$
which gives the velocity of points in the camera frame. $V^{\wedge}(t)$ is the relative velocity of the world coordinate frame as viewed from the camera frame. (The camera is regarded as fixed, watching the objects in the world move.)
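As a worked case (assuming, purely for illustration, a constant twist $\xi^{\wedge}$ and $g(0)=I$), the motion $g(t)=e^{\xi^{\wedge}t}$ has exactly this twist as its velocity:

$$\dot{g}(t)=\xi^{\wedge}e^{\xi^{\wedge}t}\quad\Rightarrow\quad V^{\wedge}(t)=\dot{g}(t)g^{-1}(t)=\xi^{\wedge}e^{\xi^{\wedge}t}e^{-\xi^{\wedge}t}=\xi^{\wedge}.$$

So $X(t)=e^{\xi^{\wedge}t}X_0$ solves $\dot{X}(t)=\xi^{\wedge}X(t)$ with $X(0)=X_0$: a constant velocity in the Lie algebra integrates to a pose in the Lie group through the exponential map.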

2.3 The adjoint map.

How is a point's velocity viewed from another camera frame $B$ in the Lie algebra? Suppose camera frame $B$ is displaced from the current frame $A$ by a transformation $g_{xy}$: $Y(t)=g_{xy}X(t)$.

Then, the velocity in the new frame is:
$$\dot{Y}(t)=g_{xy}\dot{X}(t)=g_{xy}V^{\wedge}(t)X(t)=g_{xy}V^{\wedge}(t)g_{xy}^{-1}Y(t).$$
The relative velocity of points observed from camera frame $B$ is represented by the twist:
$$V^{\wedge}_y=g_{xy}V^{\wedge}g_{xy}^{-1}\equiv ad_{g_{xy}}(V^{\wedge}).$$
This map, which expresses in camera $B$'s frame the scene velocity originally described in camera $A$'s frame, is called the adjoint map on $se(3)$:
$$ad_g: se(3)\to se(3); \quad \xi^{\wedge}\mapsto g\xi^{\wedge}g^{-1}.$$
The adjoint map models the change of frame (rotation/transformation) between frames or cameras in the Lie algebra. With it, one need not construct a complicated matrix to map an object in world coordinates to a new view frame.
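The adjoint map can be sketched directly as the matrix product $g\xi^{\wedge}g^{-1}$. All names below (`twist_hat`, `twist_vee`, `inverse_se3`, `adjoint`) and the numeric values are assumptions chosen for illustration.

```python
def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def twist_hat(v, w):
    """Build the 4x4 twist matrix xi^ from twist coordinates (v, w)."""
    return [[0.0, -w[2], w[1], v[0]],
            [w[2], 0.0, -w[0], v[1]],
            [-w[1], w[0], 0.0, v[2]],
            [0.0, 0.0, 0.0, 0.0]]

def twist_vee(m):
    """Recover twist coordinates (v, w) from a 4x4 twist matrix."""
    v = [m[0][3], m[1][3], m[2][3]]
    w = [m[2][1], m[0][2], m[1][0]]
    return v, w

def inverse_se3(g):
    """Invert g = [[R, T], [0, 1]] as [[R^T, -R^T T], [0, 1]]."""
    Rt = [[g[j][i] for j in range(3)] for i in range(3)]
    T = [g[i][3] for i in range(3)]
    t_inv = [-sum(Rt[i][k] * T[k] for k in range(3)) for i in range(3)]
    return [Rt[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def adjoint(g, xi_hat):
    """Adjoint map on se(3): xi^ -> g xi^ g^{-1}."""
    return matmul(matmul(g, xi_hat), inverse_se3(g))

# Assumed g_xy: rotate 90 degrees about z, translate by (0, 2, 0).
g_xy = [[0.0, -1.0, 0.0, 0.0],
        [1.0, 0.0, 0.0, 2.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0]]
# Assumed twist in frame A: v = (1, 0, 0), w = (0, 0, 1).
V_hat = twist_hat([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])

v_y, w_y = twist_vee(adjoint(g_xy, V_hat))
# v_y is approximately (2, 1, 0) and w_y is approximately (0, 0, 1).
```

The result is again a twist matrix (skew block on the upper left, zero bottom row), so the velocity stays in $se(3)$ after the change of frame. For $g=(R,T)$ the product works out to $w_y=Rw$ and $v_y=Rv+T\times(Rw)$, which matches the numbers above.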

3. Summary.

We summarize the skew matrix and twist matrix in the Lie algebra and their corresponding matrices in the Lie group, respectively.
[Figure: summary of the skew and twist matrices and their Lie group counterparts.]

4. References.

1. Chapter 2 - Representing a Moving Scene.pdf
2. Multiple View Geometry - Lecture 3, Technical University of Munich.
3. Robotics and Vision: Lie Groups and Lie Algebras, Analysis and Proof of the Lie Bracket Properties.

Reposted from blog.csdn.net/qq_32998593/article/details/124801605