Digging into the technical details of binocular vision positioning: the calibration function

       Speaking of binocular positioning, I believe many people, like me until recently, just use ready-made formulas, learn a few definitions, and understand what the parameters mean, but never dig deeper; I suspect most have not even worked through Professor Zhang Zhengyou's calibration algorithm. It involves some matrix manipulation tricks, and without a solid mathematical foundation it is hard to understand in depth. I have been working on binocular vision for two years. In the past, to generate a depth map or do binocular ranging, I would copy code from the Internet, paste it, and modify it. But now, to develop high-precision binocular positioning, you must dig all the way down and work out every aspect of the imaging light path. Here is a blog with a good online summary to pave the way for new friends: https://www.cnblogs.com/zyly/p/9366080.html ; it not only explains the imaging coordinate pipeline systematically, but also lists a series of reference blogs at the end of the post. Thanks to its author, "Big Ultraman fights little monsters", for putting it together!

           Today, let's talk about lens distortion first, because no lens is free of distortion; only the magnitude differs. For example, a 4.3 mm lens (a small USB camera) is basically a flat-field lens, but you still have to consider lens distortion. If you are working on depth maps or distance measurement, it is basically fine to ignore the fine points: those are low-precision applications where a larger error does not matter much, and Zhang Zhengyou's calibration method or MATLAB's stereo calibration toolbox works great. But if you want to do precise measurement and fight for millimeter-level accuracy, those methods will never satisfy you. For example, they cannot satisfy me now! What to do? Dig into the technical details and find the root!

           The greatness of Zhang's calibration method is that it simplifies the calibration process to printing a checkerboard on a sheet of A4 paper while still obtaining relatively high accuracy. You cannot very well paste A4 paper all over the calibration field of a photogrammetric camera! Still, we can keep using Zhang Zhengyou's calibration method, only applied to a calibration problem where the various error factors are weighed much more seriously. That is what method innovation means!

Having said all that, the point is simply to introduce two OpenCV camera calibration functions:

[A] CV_EXPORTS_AS(calibrateCameraExtended) double calibrateCamera( InputArrayOfArrays objectPoints,
                                     InputArrayOfArrays imagePoints, Size imageSize,
                                     InputOutputArray cameraMatrix, InputOutputArray distCoeffs,
                                     OutputArrayOfArrays rvecs, OutputArrayOfArrays tvecs,
                                     OutputArray stdDeviationsIntrinsics,
                                     OutputArray stdDeviationsExtrinsics,
                                     OutputArray perViewErrors,
                                     int flags = 0, TermCriteria criteria = TermCriteria(
                                        TermCriteria::COUNT + TermCriteria::EPS, 30, DBL_EPSILON) );

This [A] overload supports calibration with quality analysis: the evaluation results are returned in the extra output parameters.
[B] CV_EXPORTS_W double calibrateCamera( InputArrayOfArrays objectPoints,
                                     InputArrayOfArrays imagePoints, Size imageSize,
                                     InputOutputArray cameraMatrix, InputOutputArray distCoeffs,
                                     OutputArrayOfArrays rvecs, OutputArrayOfArrays tvecs,
                                     int flags = 0, TermCriteria criteria =
                                        TermCriteria::COUNT + TermCriteria::EPS, 30, DBL_EPSILON) );

Functionally, this [B] overload differs from [A] only in that it omits the three evaluation outputs (the standard deviations and per-view errors).

Below is an introduction to how to use these two functions, translated from the OpenCV documentation. The translation contains errors, so here is my own understanding:

1. InputArrayOfArrays objectPoints

These objectPoints are the world coordinates of the chessboard corners, which are of course determined by the calibration board you use. If you build your own calibration target, note that the data type is std::vector<std::vector<cv::Vec3f>>: the outer vector has one entry per photo you take, and with a planar board the Z coordinates are all 0.
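For example, a minimal sketch (my own illustration, not code from the original post) of generating objectPoints for a board with boardSize inner corners and square side squareSize; both names are assumptions:

#include <opencv2/core.hpp>
#include <vector>

// One copy of the board's corner grid per captured view; Z = 0 for a planar board.
std::vector<std::vector<cv::Vec3f>> makeObjectPoints(cv::Size boardSize,
                                                     float squareSize, int numViews)
{
    std::vector<cv::Vec3f> corners;
    for (int i = 0; i < boardSize.height; ++i)      // rows of inner corners
        for (int j = 0; j < boardSize.width; ++j)   // columns of inner corners
            corners.emplace_back(j * squareSize, i * squareSize, 0.0f);
    return std::vector<std::vector<cv::Vec3f>>(numViews, corners);
}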

2. InputArrayOfArrays imagePoints

These imagePoints are, as the name implies, the image coordinates of the checkerboard corners extracted from each image. Note the data type, std::vector<std::vector<cv::Vec2f>>, and that the points of each picture must correspond one-to-one with the object points.
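A hedged sketch of how imagePoints might be filled with OpenCV's corner detection (frames and boardSize are assumed to exist; OpenCV returns cv::Point2f, which InputArrayOfArrays accepts just as well as cv::Vec2f):

#include <opencv2/calib3d.hpp>
#include <opencv2/imgproc.hpp>

std::vector<std::vector<cv::Point2f>> imagePoints;
for (const cv::Mat& gray : frames) {               // grayscale calibration shots
    std::vector<cv::Point2f> corners;
    if (!cv::findChessboardCorners(gray, boardSize, corners))
        continue;                                  // drop views where detection fails
                                                   // (drop the matching objectPoints entry too)
    cv::cornerSubPix(gray, corners, cv::Size(11, 11), cv::Size(-1, -1),
                     cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 30, 0.01));
    imagePoints.push_back(corners);
}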

3. InputOutputArray cameraMatrix

This is the camera's intrinsic parameter matrix. It may or may not need to be initialized, depending on the calculation flags you choose later.
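If you do initialize it (only meaningful together with CALIB_USE_INTRINSIC_GUESS; otherwise just pass an empty cv::Mat), a rough guess is enough. A sketch, with a hypothetical focal length in pixels:

cv::Mat cameraMatrix = (cv::Mat_<double>(3, 3) <<
    800.0, 0.0,   imageSize.width  / 2.0,   // fx  0  cx
    0.0,   800.0, imageSize.height / 2.0,   //  0 fy  cy
    0.0,   0.0,   1.0);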

4. InputOutputArray distCoeffs

This is today's focus: the distortion parameters. The vector can have 4, 5, 8, 12 or 14 elements. Most people use the 5-parameter model, and for ordinary work that accuracy is enough. As for whether more parameters are better for high precision, some say the solution easily becomes unstable. I can tell you with confidence that, for describing a model, more parameters are in principle better: think of today's deep learning, where network models have parameters in the thousands and beyond! However, the more parameters you estimate, the more images you need to take, and the better your calibration field must be built. If you use A4 paper as the calibration board and then ask for that many parameters, think about it: your data quality does not support the model, so how is the adjustment supposed to solve the equations? This place is therefore a watershed, but for general computer vision needs, 5 parameters are about right.
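For reference, my reading of how the coefficient count follows the model flags (see the flag descriptions in the reference below); a sketch:

cv::Mat distCoeffs;    // leave empty: calibrateCamera sizes it according to the flags
int flags = 0;                                  // (k1, k2, p1, p2, k3) : 5 coefficients
// flags = cv::CALIB_RATIONAL_MODEL;            // + k4, k5, k6         : 8
// flags |= cv::CALIB_THIN_PRISM_MODEL;         // + s1, s2, s3, s4     : 12
// flags |= cv::CALIB_TILTED_MODEL;             // + tauX, tauY         : 14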

5. OutputArrayOfArrays rvecs, 6. OutputArrayOfArrays tvecs

These two parameters are the rotation and translation of the camera, with the calibration board as the reference, for each view taken during calibration. Think of the sketch of approximate lens orientations that MATLAB shows you at the end of its calibration process.
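As a small sketch (a standard identity, not something from the post), the k-th rotation vector can be expanded with cv::Rodrigues and combined with tvecs[k] to recover the camera position in board coordinates:

cv::Mat R;
cv::Rodrigues(rvecs[k], R);                // 3x1 rotation vector -> 3x3 rotation matrix
cv::Mat camCenter = -R.t() * tvecs[k];     // camera center expressed in board coordinates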

7. OutputArray stdDeviationsIntrinsics, 8. OutputArray stdDeviationsExtrinsics, 9. OutputArray perViewErrors

These parameters are an evaluation of the calibration results: the smaller the values, the better your calibration went.
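A hedged end-to-end sketch of calling overload [A] and inspecting these outputs (the variable names follow the earlier sketches):

#include <opencv2/calib3d.hpp>
#include <iostream>

cv::Mat cameraMatrix, distCoeffs;
std::vector<cv::Mat> rvecs, tvecs;
cv::Mat stdIntrinsics, stdExtrinsics, perViewErrors;
double rms = cv::calibrateCamera(objectPoints, imagePoints, imageSize,
                                 cameraMatrix, distCoeffs, rvecs, tvecs,
                                 stdIntrinsics, stdExtrinsics, perViewErrors);
std::cout << "overall RMS reprojection error: " << rms << " px\n";
for (int i = 0; i < perViewErrors.rows; ++i)      // one RMS value per view
    std::cout << "view " << i << ": " << perViewErrors.at<double>(i) << " px\n";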

10. int flags = 0,

The flags parameter is usually set to 0, but the real trick lives here. For precision calibration, the process is usually run several times: hold the high-confidence parameters fixed and solve for the others. If your distortion model is not the 5-parameter one, set the corresponding flags as described in the function documentation below.
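A sketch of that two-pass idea (my own illustration of the flags, not a recipe from the post): run a first calibration, then refine while holding the trusted parameters fixed:

int flags = cv::CALIB_USE_INTRINSIC_GUESS      // reuse cameraMatrix/distCoeffs from pass one
          | cv::CALIB_FIX_PRINCIPAL_POINT      // trust (cx, cy) from the first pass
          | cv::CALIB_FIX_ASPECT_RATIO;        // keep the fx/fy ratio fixed
double rms2 = cv::calibrateCamera(objectPoints, imagePoints, imageSize,
                                  cameraMatrix, distCoeffs, rvecs, tvecs, flags);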

11.TermCriteria criteria = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 30, DBL_EPSILON)

This is the iteration setting for the optimization and generally is left unchanged, but for high-precision calibration it is worth studying too. The first argument selects the type of termination condition, the second is the number of iterations, and the third is the threshold. The default means the iteration stops on whichever of the two following conditions is met first: at most 30 iterations, or a change smaller than DBL_EPSILON.
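For example, a sketch of a tighter criteria for a high-accuracy run (the values are illustrative, not a recommendation from the post):

#include <cfloat>   // DBL_EPSILON

cv::TermCriteria strict(cv::TermCriteria::COUNT + cv::TermCriteria::EPS,
                        100,            // allow up to 100 iterations instead of 30
                        DBL_EPSILON);   // or stop once the update falls below this
double rms3 = cv::calibrateCamera(objectPoints, imagePoints, imageSize,
                                  cameraMatrix, distCoeffs, rvecs, tvecs, 0, strict);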

So much for the technical details of the calibration function. If you have any insights, you are welcome to discuss them; I am currently researching precision vision measurement!

The reference is as follows:

/** @brief Finds the internal and external parameters of the camera from multiple views of the calibration pattern.

@param objectPoints In the new interface, it is a vector of vectors of calibration pattern points in the
calibration pattern coordinate space (for example std::vector<std::vector<cv::Vec3f>>). The outer
vector contains as many elements as there are pattern views. If the same calibration pattern is
shown in each view and it is fully visible, all the vectors will be the same. Nonetheless,
partially occluded patterns can be used, or even different patterns in different views; then the
vectors will be different. The points are 3D, but since they are given in the pattern coordinate
system, if the rig is planar it makes sense to put the model on the XY coordinate plane so that
the Z coordinate of each input object point is 0.
In the old interface, all the vectors of object points from different views are concatenated
together.
@param imagePoints In the new interface, it is a vector of vectors of the projections of the
calibration pattern points (e.g. std::vector<std::vector<cv::Vec2f>>). imagePoints.size() must equal
objectPoints.size(), and imagePoints[i].size() must equal objectPoints[i].size() for each i.
In the old interface, all the vectors of image points from different views are concatenated
together.
@param imageSize Size of the image, used only to initialize the intrinsic camera matrix.
@param cameraMatrix Output 3x3 floating-point camera matrix
\f$A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$. If CV_CALIB_USE_INTRINSIC_GUESS
and/or CV_CALIB_FIX_ASPECT_RATIO are specified, some or all of fx, fy, cx, cy must be
initialized before calling the function.
@param distCoeffs Output vector of distortion coefficients
\f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6[, s_1, s_2, s_3, s_4[, \tau_x, \tau_y]]]])\f$ of
4, 5, 8, 12 or 14 elements.
@param rvecs Output vector of rotation vectors estimated for each pattern view (see Rodrigues)
(e.g. std::vector<cv::Mat>). That is, each k-th rotation vector together with the corresponding
k-th translation vector (see the next output parameter description) brings the calibration pattern
from the model coordinate space (in which the object points are specified) to the world coordinate
space, that is, the actual position of the calibration pattern in the k-th pattern view (k = 0 .. M-1).
@param tvecs Output vector of translation vectors estimated for each pattern view.
@param stdDeviationsIntrinsics Output vector of standard deviations estimated for the intrinsic parameters.
 Order of deviation values:
\f$(f_x, f_y, c_x, c_y, k_1, k_2, p_1, p_2, k_3, k_4, k_5, k_6, s_1, s_2, s_3,
 s_4, \tau_x, \tau_y)\f$. If one of the parameters is not estimated, its deviation is equal to zero.
@param stdDeviationsExtrinsics Output vector of standard deviations estimated for the extrinsic parameters.
 Order of deviation values: \f$(R_1, T_1, \dotsc, R_M, T_M)\f$ where M is the number of pattern views
 and \f$R_i, T_i\f$ are concatenated 1x3 vectors.
 @param perViewErrors Output vector of the RMS re-projection error estimated for each pattern view.
@param flags Different flags that may be zero or a combination of the following values:
 - **CV_CALIB_USE_INTRINSIC_GUESS** cameraMatrix contains valid initial values of
fx, fy, cx, cy that are further optimized. Otherwise, (cx, cy) is initially set to the image
center (imageSize is used), and the focal lengths are computed in a least-squares fashion.
Note that if the intrinsic parameters are already known, there is no need to use this function just to
estimate the extrinsic parameters; use solvePnP instead.
 - **CV_CALIB_FIX_PRINCIPAL_POINT** The principal point is not changed during the global
optimization. It stays at the center or at the different location specified when
CV_CALIB_USE_INTRINSIC_GUESS is set too.
 - **CV_CALIB_FIX_ASPECT_RATIO** The function considers only fy as a free parameter. The
ratio fx/fy stays the same as in the input cameraMatrix. When
CV_CALIB_USE_INTRINSIC_GUESS is not set, the actual input values of fx and fy are
ignored; only their ratio is computed and used further.
 - **CV_CALIB_ZERO_TANGENT_DIST** The tangential distortion coefficients \f$(p_1, p_2)\f$ are set
to zero and stay zero.
 - **CV_CALIB_FIX_K1, ..., CV_CALIB_FIX_K6** The corresponding radial distortion
coefficient remains unchanged during the optimization. If CV_CALIB_USE_INTRINSIC_GUESS is
set, the coefficient from the supplied distCoeffs matrix is used. Otherwise, it is set to 0.
 - **CV_CALIB_RATIONAL_MODEL** The coefficients k4, k5, and k6 are enabled. To provide
backward compatibility, this extra flag should be explicitly specified to make the
calibration function use the rational model and return 8 coefficients. If the flag is not
set, the function computes and returns only 5 distortion coefficients.
 - **CALIB_THIN_PRISM_MODEL** The coefficients s1, s2, s3 and s4 are enabled. To provide
backward compatibility, this extra flag should be explicitly specified to make the
calibration function use the thin prism model and return 12 coefficients. If the flag is not
set, the function computes and returns only 5 distortion coefficients.
 - **CALIB_FIX_S1_S2_S3_S4** The thin prism distortion coefficients are not changed during
the optimization. If CV_CALIB_USE_INTRINSIC_GUESS is set, the coefficients from the
supplied distCoeffs matrix are used. Otherwise, they are set to 0.
 - **CALIB_TILTED_MODEL** The coefficients tauX and tauY are enabled. To provide
backward compatibility, this extra flag should be explicitly specified to make the
calibration function use the tilted sensor model and return 14 coefficients. If the flag is not
set, the function computes and returns only 5 distortion coefficients.
 - **CALIB_FIX_TAUX_TAUY** The coefficients of the tilted sensor model are not changed during
the optimization. If CV_CALIB_USE_INTRINSIC_GUESS is set, the coefficients from the
supplied distCoeffs matrix are used. Otherwise, they are set to 0.
@param criteria Iterative optimization algorithm termination criteria.

This function estimates the transformation between two cameras making a stereo pair. If you have a stereo
camera, the relative position and orientation of the two cameras are fixed. If you compute
the poses of an object relative to the first camera and to the second camera, (R1, T1) and (R2, T2)
respectively (this can be done with solvePnP), those poses definitely relate to each other.
This means that, given (\f$R_1\f$, \f$T_1\f$), it should be possible to compute (\f$R_2\f$, \f$T_2\f$); you only
need to know the position and orientation of the second camera relative to the first camera. That is
what the described function does. It computes (\f$R\f$, \f$T\f$) such that:

\f[R_2 = R*R_1\f]
\f[T_2 = R*T_1 + T,\f]

Optionally, it computes the essential matrix E:

\f[E = \vecthreethree{0}{-T_2}{T_1}{T_2}{0}{-T_0}{-T_1}{T_0}{0} * R\f]

where \f$T_i\f$ are the components of the translation vector \f$T\f$: \f$T = [T_0, T_1, T_2]^T\f$. The function
can also compute the fundamental matrix F:

\f[F = cameraMatrix2^{-T} E cameraMatrix1^{-1}\f]

Besides the stereo-related information, the function can also perform a full calibration of each of the
two cameras. However, due to the high dimensionality of the parameter space and noise in the
input data, the function can diverge from the correct solution. If the intrinsic parameters can
be estimated with high accuracy for each of the cameras individually (for example, using
calibrateCamera), you are recommended to do so, and then pass the CV_CALIB_FIX_INTRINSIC flag to the
function along with the computed intrinsic parameters. Otherwise, if all the parameters are
estimated at once, it makes sense to restrict some parameters, for example, by passing the
CV_CALIB_SAME_FOCAL_LENGTH and CV_CALIB_ZERO_TANGENT_DIST flags, which are usually
reasonable assumptions.

Similarly to calibrateCamera, the function minimizes the total re-projection error for all the
points in all the available views from both cameras. The value returned by the function is
the final re-projection error.
 */

 


Source: blog.csdn.net/sun19890716/article/details/89372497