Camera-IMU joint calibration principle


In a VIO system, the accuracy of the camera-IMU intrinsic and extrinsic parameters plays an important role in the overall localization accuracy. A good calibration result is therefore a prerequisite for the positioning system.

At present, calibration algorithms are mainly divided into offline and online calibration. Offline calibration is represented by Kalibr, which can calibrate the camera intrinsics, the displacement and rotation between the camera and the IMU, the time delay, and the IMU's own scale factors and axis non-orthogonality. During calibration, Kalibr uses an AprilTag target as the visual reference and applies nonlinear optimization to minimize the visual reprojection error, the angular velocity and acceleration errors, and the bias errors, finally obtaining the required IMU-camera extrinsics as well as the gravity vector. In addition, Kalibr also considers the case where the camera and IMU are not synchronized, so it can be used to estimate the time delay between the camera and the IMU as well.

1. Camera projection model

Here the commonly used pinhole camera model is taken as an example; the distortion models considered are the most common radial-tangential (radtan) model and the equidistant model. The question is how a 3D point in space is projected to obtain its 2D coordinates on the pixel plane:

[Figure: pinhole projection of a 3D point P onto the image plane]

From the figure above, by similar triangles the pixel-plane coordinates (u, v) of the 3D point P can be computed in a straightforward way. The actual projection and imaging process is not quite as simple as the figure suggests; it consists of three main steps:

$$
\begin{bmatrix} X_{C}\\ Y_{C}\\ Z_{C}\end{bmatrix}
\overset{\text{normalized plane}}{\Longrightarrow }
\begin{bmatrix} u=X_{C}/Z_{C}\\ v=Y_{C}/Z_{C}\end{bmatrix}
\overset{\text{add distortion}}{\Longrightarrow }
\begin{bmatrix} r=u^{2}+v^{2}\\ d_{r}=1+k_{1}r+k_{2}r^{2}+k_{3}r^{3}\\ u_{d}=u\,d_{r}+2uvp_{1}+(r+2u^{2})p_{2}\\ v_{d}=v\,d_{r}+2uvp_{2}+(r+2v^{2})p_{1}\end{bmatrix}
\overset{\text{pixel}}{\Longrightarrow }
\begin{bmatrix} p_{x}=f_{x}u_{d}+c_{x}\\ p_{y}=f_{y}v_{d}+c_{y}\end{bmatrix}
$$

The second step above uses the radtan distortion as an example; the equidistant distortion model takes a different form:

$$
\left\{\begin{matrix}
r=\sqrt{u^{2}+v^{2}} \\
\theta = \arctan(r) \\
\theta_{d}=\theta\,(1+k_{1}\theta^{2}+k_{2}\theta^{4}+k_{3}\theta^{6}+k_{4}\theta^{8}) \\
u_{d}=u\cdot \dfrac{\theta_{d}}{r},\quad v_{d}=v\cdot \dfrac{\theta_{d}}{r}
\end{matrix}\right.
$$

The projection process therefore first computes the point on the normalized plane, then applies the radtan or equidistant distortion to that point, and finally applies the intrinsics (focal lengths and principal point) to obtain the pixel coordinates of the image point.
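To make the two projection chains concrete, here is a minimal Python/NumPy sketch of both models. The intrinsics and distortion coefficients in the example calls are made-up values; this is an illustration of the formulas above, not Kalibr's implementation.

```python
import numpy as np

def project_radtan(P_c, fx, fy, cx, cy, k1, k2, k3, p1, p2):
    """Project a 3D point in the camera frame using the radtan model above."""
    u, v = P_c[0] / P_c[2], P_c[1] / P_c[2]          # normalized plane
    r = u**2 + v**2
    dr = 1 + k1 * r + k2 * r**2 + k3 * r**3          # radial factor
    ud = u * dr + 2 * u * v * p1 + (r + 2 * u**2) * p2
    vd = v * dr + 2 * u * v * p2 + (r + 2 * v**2) * p1
    return np.array([fx * ud + cx, fy * vd + cy])    # pixel coordinates

def project_equidistant(P_c, fx, fy, cx, cy, k1, k2, k3, k4):
    """Project a 3D point using the equidistant (fisheye) model above."""
    u, v = P_c[0] / P_c[2], P_c[1] / P_c[2]
    r = np.sqrt(u**2 + v**2)                         # assumes r > 0
    theta = np.arctan(r)                             # angle to the optical axis
    theta_d = theta * (1 + k1 * theta**2 + k2 * theta**4
                       + k3 * theta**6 + k4 * theta**8)
    ud, vd = u * theta_d / r, v * theta_d / r
    return np.array([fx * ud + cx, fy * vd + cy])

# Example with made-up intrinsics and distortion coefficients:
P = np.array([0.1, -0.05, 2.0])
print(project_radtan(P, 460.0, 460.0, 320.0, 240.0, -0.28, 0.07, 0.0, 1e-4, -2e-4))
print(project_equidistant(P, 460.0, 460.0, 320.0, 240.0, -0.01, 0.002, 0.0, 0.0))
```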

2. IMU model

The inertial measurement unit (IMU) is mainly used to measure the angular velocity and acceleration of the robot during motion. When the robot moves too fast, the camera images become blurred and the common area between two consecutive frames shrinks, so the vision algorithm may be unable to determine the direction of motion. An IMU, by contrast, can consistently provide reliable (R, t) estimates over short intervals, so for vision algorithms the IMU plays a complementary role.

For an ideal IMU, the three accelerometer axes and the three gyroscope axes define a shared, orthogonal three-dimensional coordinate system. The accelerometer senses the acceleration of the body along each axis during motion, and the gyroscope measures the angular velocity of the body about each axis. In practice, however, assembly imperfections leave the accelerometer and gyroscope coordinate systems non-orthogonal, which introduces axis-misalignment errors, as shown in the figure below. The transformation between the two coordinate systems can be obtained from the following formula:

[Figure and equation: transformation between the non-orthogonal sensor coordinate axes $(x^{S}, y^{S}, z^{S})$ and the orthogonal IMU body coordinate axes $(x^{B}, y^{B}, z^{B})$]

Assuming that the body coordinate system of the IMU coincides with the orthogonal coordinate system of the accelerometer, the angles $\beta_{xz}$, $\beta_{xy}$, $\beta_{yx}$ are all zero, and the above formula simplifies to:

[Equation: simplified accelerometer axis-misalignment model]

In the formula, $a$ denotes an accelerometer reading, and $a^{O}$ and $a^{S}$ denote the reading expressed in the ideal (orthogonal) coordinate system and in the actual sensor coordinate system, respectively.

Gyroscope and accelerometer measurements should be referenced to the same coordinate system, so the following equation can be used to express how to transform from the actual gyroscope coordinate system to the ideal gyroscope coordinate system:

[Equation: transformation from the actual gyroscope coordinate system to the ideal one]

Both the accelerometer and the gyroscope are also affected by errors introduced when the digital signal is converted into a physical quantity, so scale-factor matrices for the accelerometer and the gyroscope are introduced:
[Equation: accelerometer and gyroscope scale-factor matrices]

If the accelerometer and gyroscope were machined perfectly, the gyroscope would output zero on all three axes when the IMU is stationary (and the accelerometer would sense only gravity). Because of manufacturing errors, however, the accelerometer and gyroscope each exhibit a zero-bias vector:

[Equation: accelerometer and gyroscope zero-bias vectors]
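As a concrete illustration of the misalignment, scale-factor, and bias terms above, the hedged sketch below applies them to a raw accelerometer sample in the commonly used form a = T·K·(a_raw − b). The matrix structure and all numbers are placeholders chosen for the example, not calibrated values.

```python
import numpy as np

def correct_imu(raw, T, K, b):
    """Map a raw 3-axis reading into the orthogonal body frame."""
    return T @ K @ (raw - b)

# Misalignment (small-angle, upper-triangular form), scale factors, bias:
T_a = np.array([[1.0, -0.002, 0.001],
                [0.0,  1.0,  -0.003],
                [0.0,  0.0,   1.0]])
K_a = np.diag([1.01, 0.99, 1.00])        # per-axis scale factors
b_a = np.array([0.05, -0.02, 0.10])      # zero-bias vector

raw_acc = np.array([0.03, -0.01, 9.91])  # raw accelerometer sample [m/s^2]
print(correct_imu(raw_acc, T_a, K_a, b_a))
```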

When the camera and the IMU are used together, the main question is how to transform data from one coordinate system to another. The relationship between the camera coordinate system {C}, the IMU coordinate system {B}, and the world coordinate system {W} is shown in the figure below; each coordinate system can be transformed into any other:

[Figure: mutual transformations between the camera {C}, IMU {B}, and world {W} coordinate systems]

Let $T$ denote the transformation between two coordinate systems. Before the camera and IMU can be used together, the transformation matrix $T_{cb}$ between them must be calibrated. The camera coordinate system and the IMU coordinate system then satisfy the following relationship:

$$T_{wb}=T_{wc}\cdot T_{cb}$$

Express T as a rotation matrix R and a translation vector t:

$$T=\begin{bmatrix} R & t\\ \mathbf{0}^{T} & 1\end{bmatrix}$$

Substituting this form into the relation above and carrying out the matrix multiplication gives the rotation and translation relationships between the camera coordinate system and the IMU coordinate system:

$$R_{wb}=R_{wc}\,R_{cb},\qquad t_{wb}=R_{wc}\,t_{cb}+t_{wc}$$
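A small sketch of composing homogeneous transforms, which reproduces the relations above by block multiplication; the poses and extrinsics used here are made-up values.

```python
import numpy as np

def make_T(R, t):
    """Build a 4x4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

# Hypothetical camera pose in the world and camera-IMU extrinsics:
R_wc = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)  # 90 deg about z
t_wc = np.array([1.0, 2.0, 0.5])
R_cb = np.array([[1, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=float)  # 90 deg about x
t_cb = np.array([0.02, 0.00, -0.01])

T_wb = make_T(R_wc, t_wc) @ make_T(R_cb, t_cb)
# Block multiplication matches the separate rotation/translation formulas:
assert np.allclose(T_wb[:3, :3], R_wc @ R_cb)
assert np.allclose(T_wb[:3, 3], R_wc @ t_cb + t_wc)
print(T_wb)
```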

Since the camera and the IMU are two different sensors, their triggering cannot be perfectly synchronized, so in practice there is a time difference $t_{d}$ between the camera trigger and the IMU trigger. As shown in the figure below, there is a skew between the sensor sampling instants and their timestamps.

[Figure: sensor sampling timestamps for the camera and IMU]

The IMU sampling time equals the camera sampling time plus the time difference $t_{d}$:

$$t_{IMU}=t_{cam}+t_{d}$$

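As a tiny illustration of this relation, camera timestamps can be shifted into the IMU clock before the two streams are associated; the offset value below is only an assumed example, not a calibrated one.

```python
t_d = 0.005                        # assumed 5 ms offset, illustrative only
t_cam = [0.000, 0.050, 0.100]      # camera frame timestamps [s]
t_imu = [t + t_d for t in t_cam]   # the same instants expressed in IMU time
print(t_imu)
```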
3. Camera-IMU calibration model

In environments with low texture, motion blur, or occlusion, combining vision with inertial sensing improves robustness. Pure visual SLAM usually lacks scale information, while the IMU recovers absolute scale by providing reliable R and t. Representative algorithms include EKF-SLAM, MSCKF, OKVIS, VINS-Mono, and ORB-SLAM3. All of these classic algorithms rely on accurate system initialization and precise camera-IMU calibration. Camera-IMU extrinsic calibration mainly estimates the transformation matrix (i.e., rotation and translation) from the camera coordinate system {C} to the IMU coordinate system {B}; we first obtain the rotation from the camera to the IMU and then obtain the translation.

(1) Camera-IMU rotation

The rotation between the camera and the IMU is very important for the robustness of a visual-inertial SLAM system; too large a deviation will cause the system initialization to fail. The monocular camera can track the pose of the system, and the relative rotation $R_{c_{k},c_{k+1}}$ between two images can be solved with the classic five-point algorithm. In addition, integrating the gyroscope over the same interval gives the corresponding relative rotation $R_{b_{k},b_{k+1}}$. For any frame k, the following equation holds:

$$R_{b_{k},b_{k+1}}\cdot R_{bc}=R_{bc}\cdot R_{c_{k},c_{k+1}}$$

Represent the rotation matrix in the above formula with quaternions:

[Equation: the rotation constraint above written with quaternions]

For given rotations between pairs of consecutive images, we can construct an overdetermined equation:

[Equation: stacked, weighted overdetermined linear system $Q_{N}\,q_{bc}=0$]

where N is the number of frames used before the rotation estimate converges, and $w_{k,k+1}$ is a weight used for outlier handling. As the rotation calibration runs on incoming measurements, the residuals can be weighted using the previously estimated $\hat{R}_{bc}$ as the initial value:

[Equation: rotation residual with respect to the previous estimate $\hat{R}_{bc}$]

The weight of the residual function is:

[Equation: weight $w_{k,k+1}$ as a function of the residual]

If there are not enough features to estimate the camera rotation, $w_{k,k+1}$ is set to zero. The solution of the above overdetermined system is the right singular vector of $Q_{N}$ corresponding to its smallest singular value, from which the rotation matrix $R_{bc}$ is recovered.
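A hedged sketch of one common way to solve this overdetermined system (for example, as done in VINS-Mono-style initialization): write each constraint with quaternion multiplication matrices, stack them, and take the right singular vector associated with the smallest singular value. The outlier weights are omitted for brevity, and the helper names are ours, not from any particular library.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def L(q):
    """Left quaternion-multiplication matrix, q = (w, x, y, z)."""
    w, x, y, z = q
    return np.array([[w, -x, -y, -z],
                     [x,  w, -z,  y],
                     [y,  z,  w, -x],
                     [z, -y,  x,  w]])

def R(q):
    """Right quaternion-multiplication matrix, q = (w, x, y, z)."""
    w, x, y, z = q
    return np.array([[w, -x, -y, -z],
                     [x,  w,  z, -y],
                     [y, -z,  w,  x],
                     [z,  y, -x,  w]])

def solve_rotation(q_imu_pairs, q_cam_pairs):
    """Solve q_b (x) q_bc = q_bc (x) q_c in a least-squares sense via SVD.

    Inputs are lists of relative rotations (w, x, y, z) from the gyroscope
    integration and from the five-point algorithm, respectively.
    """
    Q = np.vstack([L(qb) - R(qc) for qb, qc in zip(q_imu_pairs, q_cam_pairs)])
    _, _, Vt = np.linalg.svd(Q)
    q_bc = Vt[-1]                      # right singular vector of the smallest singular value
    # Convert to a rotation matrix (SciPy expects scalar-last order x, y, z, w):
    return Rotation.from_quat(np.roll(q_bc, -1)).as_matrix()
```

The sign ambiguity of the recovered quaternion does not matter, since both signs correspond to the same rotation matrix $R_{bc}$.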

(2) Camera-IMU translation

Once the rotation matrix $R_{bc}$ between the camera and the IMU has been obtained as above, the camera-IMU translation can be estimated, using a tightly coupled sliding window to initialize these parameters. The initialization is carried out in the IMU coordinate system, and the state vector is defined as:

[Equation: sliding-window state vector]

where $x_{k}$ is the state of the k-th IMU frame, $g^{b_{k}}$ is the gravity vector, N is the number of IMU states in the sliding window, and M is the number of features with sufficient parallax in the window; n and m are the corresponding starting indices in the sliding window. $\lambda_{l}$ is the depth of the l-th feature at its first observation, and $P_{b_{0}b_{k}}=[0,0,0]$ is set in advance.

Initialization is done by maximum likelihood estimation, which minimizes the sum of the Mahalanobis norms of all measurement errors from the IMU and monocular camera within a sliding window:

[Equation: maximum likelihood cost over all IMU and camera measurements in the sliding window]

where B is the set of all IMU measurements and C is the set of all observations between any feature and any camera pose. Since the increments and relative rotations are known, the above equation can be solved in a non-iterative, linear manner. The IMU measurement can be expressed as:

[Equation: linear IMU measurement model]
{$r_{p}, H_{p}$} is the solution of the linear estimator, {$P_{b_{k},b_{k+1}}, H_{b_{k},b_{k+1}}$} is the linear IMU measurement model, and $H_{c_{j}l}$ is the linear camera measurement model. The optimization problem described above is cumbersome to solve directly, so to simplify the parameter estimation the linear cost function is converted into the following form:

[Equation: converted linear cost function] where {$\Lambda_{B}, b_{B}$} and {$\Lambda_{C}, b_{C}$} are the information matrices and vectors of the IMU and visual measurements, respectively. Because the increments and corresponding rotations are known, the cost function is linear in the state and has a unique solution. Solving the equation linearly yields the translation vector $t_{bc}$ between the camera and the IMU.
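Without reproducing the exact measurement models, the linear step can be sketched generically: every linear measurement (H, z) with information matrix W contributes H^T W H and H^T W z, and the stacked normal equations are solved in closed form without iteration. The shapes, variable names, and example numbers below are placeholders, not the actual VINS-Mono formulation.

```python
import numpy as np

def solve_linear_init(measurements, state_dim):
    """Accumulate normal equations Lambda x = b and solve them in one shot."""
    Lambda = np.zeros((state_dim, state_dim))
    b = np.zeros(state_dim)
    for H, z, W in measurements:         # H: (m, state_dim), z: (m,), W: (m, m)
        Lambda += H.T @ W @ H
        b += H.T @ W @ z
    return np.linalg.solve(Lambda, b)    # unique solution if Lambda has full rank

# Tiny synthetic example with a 2-dimensional state:
H1, z1, W1 = np.array([[1.0, 0.0]]), np.array([2.0]), np.eye(1)
H2, z2, W2 = np.array([[0.0, 1.0]]), np.array([-1.0]), np.eye(1)
print(solve_linear_init([(H1, z1, W1), (H2, z2, W2)], state_dim=2))  # -> [ 2. -1.]
```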

(3) Visual inertia cost function

The monocular visual-inertial problem can be formulated as a jointly optimized cost function $J(x)$, which combines the weighted visual residuals $e_{v}$ and the weighted inertial residuals $e_{i}$:

[Equation: joint visual-inertial cost function $J(x)$]

where i is the feature index in the image, k is the camera frame index, j denotes the position of the 3D point, and W is the information matrix of the corresponding measurement.
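Schematically, the joint cost simply accumulates squared residuals weighted by their information matrices; the residual values below are placeholders used only to show the structure.

```python
import numpy as np

def joint_cost(visual_residuals, inertial_residuals):
    """Each entry is a pair (e, W): residual vector and its information matrix."""
    J = 0.0
    for e, W in visual_residuals:
        J += e @ W @ e
    for e, W in inertial_residuals:
        J += e @ W @ e
    return J

e_v = [(np.array([0.5, -0.2]), np.eye(2))]        # one reprojection residual
e_i = [(np.array([0.1, 0.0, -0.05]), np.eye(3))]  # one inertial residual
print(joint_cost(e_v, e_i))
```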

4. Camera-imu joint calibration

The imu measures the acceleration and angular velocity at each instant, and integrating them yields velocity, position, and rotation. In SLAM, integrating discrete imu data to obtain the state may introduce large errors; here, instead, a time-continuous state representation is differentiated to predict the imu data. A B-spline is therefore needed to describe the discrete states as a continuous trajectory.

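A hedged sketch of the continuous-time idea using SciPy: fit a B-spline through discrete states (here only the 3-D position, with synthetic data) and read velocity and acceleration off its derivatives. Kalibr also parameterizes rotation with splines and typically uses a higher spline order; both are omitted in this toy example.

```python
import numpy as np
from scipy.interpolate import make_interp_spline

t = np.linspace(0.0, 2.0, 21)                     # timestamps of discrete poses
positions = np.c_[np.sin(t), np.cos(t), 0.1 * t]  # fake camera positions

spline = make_interp_spline(t, positions, k=3)    # position spline p(t)
velocity = spline.derivative(1)                   # v(t) = dp/dt
acceleration = spline.derivative(2)               # a(t) = d^2p/dt^2

# The spline can now be evaluated at any time inside the interval:
print(spline(0.33), velocity(0.33), acceleration(0.33))
```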
Kalibr offline calibration method steps:

  1. Roughly estimate the time delay between the camera and the imu.

  2. Obtain the initial rotation between the imu and the camera, plus some necessary initial values: the gravitational acceleration and the gyroscope bias.

  3. Run the large nonlinear optimization, whose cost function includes all corner reprojection errors, the imu accelerometer and gyroscope measurement errors, and the bias random-walk noise.

(1) Roughly estimate the time delay between camera and imu

Through camera intrinsic calibration, the camera intrinsics are obtained first. Since the 3D-2D correspondences of each image frame are then known, the camera pose of each frame can be computed. Using these discrete poses to construct a continuous B-spline, the pose at any time can be obtained.

Here the pose is parameterized as a six-dimensional column vector consisting of a three-dimensional displacement and a three-dimensional rotation vector. Velocity, acceleration, and angular velocity follow from the first and second derivatives of the displacement and rotation vectors:

$$v = \dot{t}$$

$$a = \ddot{t}$$

$$\omega_{i} = J_{r}(\theta)\,\dot{\theta}$$

Using the camera's spline curve, the angular velocity of the camera can be obtained at any time, while the gyroscope measures the angular velocity of the imu. Neglecting the influence of bias and noise, the two differ only by a rotation and have equal magnitude: $w_{i}=R_{ic}\,w_{c}$

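One simple way to realize the rough delay estimate, sketched below with synthetic data, is to cross-correlate the angular-rate magnitude predicted by the camera spline with the gyroscope magnitude and take the lag with the highest correlation. This mirrors the idea rather than Kalibr's actual implementation and assumes both signals have been resampled onto the same uniform time grid.

```python
import numpy as np

def estimate_time_delay(w_cam_mag, w_gyro_mag, dt):
    """Both inputs are sampled on the same uniform grid with spacing dt [s]."""
    w_cam = w_cam_mag - np.mean(w_cam_mag)
    w_gyro = w_gyro_mag - np.mean(w_gyro_mag)
    corr = np.correlate(w_gyro, w_cam, mode="full")
    shift = np.argmax(corr) - (len(w_cam) - 1)   # samples by which the gyro lags the camera
    return shift * dt

# Synthetic check: the gyro signal is the camera signal delayed by 50 ms.
dt, delay = 0.005, 0.05
t = np.arange(0, 10, dt)
w_cam = np.abs(np.sin(2 * np.pi * 0.7 * t))
w_gyro = np.abs(np.sin(2 * np.pi * 0.7 * (t - delay)))
print(estimate_time_delay(w_cam, w_gyro, dt))    # ~0.05
```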
(2) Obtain the initial rotation between imu-camera, and some necessary initial values: gravity acceleration, gyroscope bias

The same angular velocity measurement relationship is used again, this time to construct an optimization problem.

In this way, the rotation between the camera and the imu, as well as the initial value of the gyroscope bias, can be obtained.

Neglecting accelerometer bias and noise, and assuming that the average acceleration over the whole calibration sequence is zero, the representation of the gravitational acceleration in the reference coordinate frame can also be obtained: $\hat{g} = -\frac{1}{n}\sum_{i=1}^{n} R_{wc}\,R_{ci}\,a_{m}$
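A minimal sketch of this averaging step: each accelerometer sample is rotated into the reference frame with the camera pose $R_{wc}$ and the camera-imu rotation $R_{ci}$, then averaged and negated. All matrices and samples below are placeholders.

```python
import numpy as np

def estimate_gravity(R_wc_list, R_ci, acc_samples):
    """Average the accelerometer samples in the reference frame and negate."""
    g_sum = np.zeros(3)
    for R_wc, a_m in zip(R_wc_list, acc_samples):
        g_sum += R_wc @ R_ci @ a_m
    return -g_sum / len(acc_samples)

# Placeholder data: near-stationary samples reading roughly +9.81 m/s^2 on z.
R_ci = np.eye(3)
R_wc_list = [np.eye(3), np.eye(3)]
acc = [np.array([0.00, 0.00, 9.81]), np.array([0.01, -0.02, 9.80])]
print(estimate_gravity(R_wc_list, R_ci, acc))   # ~[0, 0, -9.8]
```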

(3) Large optimization, including all corner reprojection errors, imu accelerometer and gyroscope measurement errors, and bias random-walk noise

The first two steps provide good initial values for the final large optimization, which then adjusts all variables to be optimized so that the total observation error is minimized.

The error terms include all calibration-board corner reprojection errors, the imu accelerometer and gyroscope measurement errors, and the bias random-walk noise (relative to special points).

To simplify the construction of the imu measurement error, the camera pose spline is multiplied by the extrinsics computed above to obtain the pose curve of the imu. Of course, this curve may contain a relatively large error, which is adjusted in the subsequent optimization process.

reference:

https://blog.csdn.net/Yong_Qi2015/article/details/117490261
