Implementation of OpenCV's 3D Reconstruction

Abstract

This paper takes computer-vision 3D reconstruction technology as its research object and analyzes the 3D reconstruction model in the open computer vision library OpenCV. Through six steps, and in particular the use of the epipolar-constraint method in camera calibration and stereo matching, a 3D reconstruction algorithm based on OpenCV is presented. The algorithm makes full use of OpenCV's library functions, improves the accuracy and efficiency of the computation, has good cross-platform portability, and can meet the needs of various computer vision systems.

Keywords: computer vision; three-dimensional reconstruction; OpenCV; camera calibration; epipolar constraint

1 Introduction

Three-dimensional reconstruction technology is a hot and difficult topic in cutting-edge fields such as computer vision, artificial intelligence, and virtual reality, and it is also one of the major challenges facing basic and applied research. Image-based 3D reconstruction is an important research branch of image processing; as the basis of today's popular virtual reality and scientific visualization, it is widely used in detection and observation. A complete 3D reconstruction system can usually be divided into six parts: image acquisition, camera calibration, feature extraction, stereo matching, depth determination, and post-processing [1][3]. Among them, accurate calibration of the camera's internal and external parameters and stereo matching are the most important and most difficult problems in 3D reconstruction.

The open source computer vision library OpenCV (Open Source Computer Vision Library) was developed by Intel's research laboratory in Russia. It is a freely available library composed of C functions and C++ classes that implements many commonly used image processing and computer vision algorithms [2]. OpenCV is compatible with IPL (Image Processing Library), another image processing library developed by Intel. IPL implements low-level processing of digital images, while OpenCV is mainly used for higher-level processing, such as feature detection and tracking, motion analysis, object segmentation and recognition, and 3D reconstruction. The source code of OpenCV is completely open and is written concisely and efficiently; most of its functions have been optimized in assembly so that they take full advantage of the design of Intel processors. On Pentium MMX, Pentium, Pentium III, and Pentium IV processors, OpenCV's code executes very efficiently, so in recent years it has been widely used in image processing and has become a popular image processing library. The camera calibration module in OpenCV provides users with a good interface, supports both Windows and Linux, effectively improves development efficiency, executes quickly, and has good cross-platform portability, so it can be applied well in engineering practice.

2 Basic principles of 3D reconstruction

2.1 Image acquisition
Stereoscopic image acquisition is the basis of stereo vision. Images can be acquired in many ways, depending mainly on the application and its purpose. The influence of factors such as viewpoint differences, lighting conditions, camera performance, and scene characteristics must also be considered to facilitate stereo computation.

2.2 Camera calibration [4]
Camera calibration establishes the imaging model, determines the position and attribute parameters of the camera, and determines the correspondence between an object point in the space coordinate system and its image point. Both cameras must be calibrated. If the cameras are fixed, only one calibration is required when deriving 3-D information from 2-D computer image coordinates. Camera calibration methods fall into two categories: the first directly estimates parameters such as the camera's position, optical-axis direction, and focal length; the second uses least-squares fitting to determine the transformation matrix that maps three-dimensional space points to two-dimensional image points. An effective camera model not only accurately recovers the three-dimensional information of the spatial scene but also helps to solve the stereo matching problem.

2.3 Feature extraction
3-D information is determined from the parallax between multiple viewpoints; the key is to determine the correspondence of the same scene point across different images. One way to solve this problem is to select appropriate image features and match them. Features are pixels, pixel sets, or their abstract expressions. Commonly used matching features are point-like features, line features, and region features. Generally speaking, large-scale features contain relatively rich information and are few in number, so fast matching is easy to obtain, but their extraction and description are relatively complicated and their positioning accuracy is poor. Small-scale features have high positioning accuracy and are simple to express and describe, but they are numerous and each carries little information, so strong constraint criteria and matching strategies are needed during matching. Reasonable selection of matching features is of great significance to stereo matching; various factors should be weighed, and the selection should be made according to the scene characteristics and application requirements. Generally, for scenes containing many irregular shapes and abrupt height changes, it is more suitable to extract point-like features, because line segments and regions are difficult to extract there and their extraction introduces errors; for scenes with a regular structure, where line-segment and region features are relatively easy to extract and describe with small error, line-segment features should be extracted to achieve fast matching.
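As an illustration only (the paper itself extracts features with the Canny operator in Section 3), point-like features can be obtained with the legacy OpenCV C API, for example with cvGoodFeaturesToTrack(). The file name, corner budget, and quality parameters below are assumptions, not values from the paper.

```c
/* Minimal sketch: extract point-like features from one grayscale image
   with the legacy OpenCV 1.x C API. "left.jpg" is a hypothetical file. */
#include <cv.h>
#include <highgui.h>
#include <stdio.h>

int main(void)
{
    IplImage* gray = cvLoadImage("left.jpg", CV_LOAD_IMAGE_GRAYSCALE);
    if (!gray) return -1;

    /* scratch buffers required by cvGoodFeaturesToTrack */
    IplImage* eig  = cvCreateImage(cvGetSize(gray), IPL_DEPTH_32F, 1);
    IplImage* temp = cvCreateImage(cvGetSize(gray), IPL_DEPTH_32F, 1);

    CvPoint2D32f corners[500];
    int corner_count = 500;          /* in: max corners, out: found count */

    /* keep strong corners at least 10 px apart (illustrative parameters) */
    cvGoodFeaturesToTrack(gray, eig, temp, corners, &corner_count,
                          0.01, 10.0, NULL, 3, 0, 0.04);

    printf("detected %d point features\n", corner_count);

    cvReleaseImage(&eig);
    cvReleaseImage(&temp);
    cvReleaseImage(&gray);
    return 0;
}
```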

2.4 Stereo matching [5]
Stereo matching establishes the correspondence between the selected features, pairing the image points of the same spatial point in different images and thereby obtaining the corresponding disparity image. Stereo matching is the most important and most difficult problem in binocular vision. When a three-dimensional scene is projected as a two-dimensional image, the images of the same scene under different viewpoints can differ greatly, and many factors in the scene, such as lighting conditions, scene geometry and physical characteristics, noise and distortion, and camera characteristics, are all folded into a single image gray value. Accurately and unambiguously matching images that contain so many unfavorable factors is therefore very difficult. Stereo matching methods fall mainly into two categories: gray-scale correlation and feature matching. Gray-scale correlation matches pixel gray values directly. Its advantage is that the matching result is not limited by the accuracy and density of feature detection, so it can achieve high positioning accuracy and a dense disparity surface; its disadvantage is that it depends on the statistical characteristics of the image gray levels and is sensitive to the surface structure of the scene and to light reflection, so it has difficulty where the scene surface lacks sufficient texture detail or the imaging distortion is large (for example, when the baseline is too long). The advantage of feature-based matching is that it uses features derived from the intensity image as matching primitives, so it is more stable when the ambient lighting changes; its disadvantages are that feature extraction requires additional computation and that, because the features are discrete, a dense disparity field cannot be obtained directly after matching.
A matching method needs to solve the following problems: selecting the correct matching features; identifying the essential attributes of those features; and establishing a stable algorithm that can correctly match the selected features. A gray-scale correlation sketch is given below.
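As a concrete illustration of gray-scale correlation (not code from the paper), the following sketch scores candidate matches with a sum of absolute differences (SAD) over a small window, assuming rectified images so that the search runs along the same row. The window size, search range, and border handling are illustrative assumptions.

```c
/* Minimal SAD-based gray-scale correlation: for a pixel (x, y) in the
   left image, scan disparities 0..max_disp in the right image and return
   the disparity with the smallest window SAD. Assumes 8-bit grayscale
   images of equal size and that (x, y) is far enough from the border
   for the window and the search range to fit. */
#include <cv.h>
#include <limits.h>
#include <stdlib.h>

int best_disparity(IplImage* left, IplImage* right,
                   int x, int y, int win, int max_disp)
{
    int best_d = 0;
    long best_sad = LONG_MAX;
    for (int d = 0; d <= max_disp; d++) {
        long sad = 0;
        for (int dy = -win; dy <= win; dy++)
            for (int dx = -win; dx <= win; dx++) {
                int l = CV_IMAGE_ELEM(left,  uchar, y + dy, x + dx);
                int r = CV_IMAGE_ELEM(right, uchar, y + dy, x + dx - d);
                sad += abs(l - r);
            }
        if (sad < best_sad) { best_sad = sad; best_d = d; }
    }
    return best_d;
}
```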

2.5 Depth information determination
After the disparity image is obtained through stereo matching, the depth image can be computed and the 3-D information of the scene recovered. The factors that affect the accuracy of distance measurement mainly include camera calibration error, digital quantization effects, and the accuracy of feature detection, matching, and localization. Generally speaking, the distance measurement error grows with the matching localization error and shrinks as the camera baseline lengthens. Increasing the baseline improves depth measurement accuracy, but it also increases the difference between the images and the difficulty of matching. Therefore, to design an accurate stereo vision system, all factors must be considered together so that every stage retains high accuracy.
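For a rectified binocular rig with parallel optical axes, the standard triangulation relation (a textbook result, not derived in the paper) makes both effects explicit. With focal length f, baseline B, and disparity d, differentiating shows how a matching error Δd propagates into depth:

```latex
Z = \frac{f\,B}{d}, \qquad
\Delta Z \approx \left|\frac{\partial Z}{\partial d}\right| \Delta d
             = \frac{Z^{2}}{f\,B}\,\Delta d
```

The depth error grows quadratically with distance and shrinks as the baseline B grows, which is exactly the trade-off described above.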

2.6 Post-processing [6]
Post-processing includes depth interpolation, error correction, and accuracy improvement. The ultimate goal of stereo vision is to recover the complete information of the visible surfaces of the scene. At present, no matching method can recover the disparity of every image point, so for a complete stereo vision system a final surface interpolation and reconstruction step is necessary, as sketched below.
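A minimal sketch of one such post-processing step, assuming an 8-bit disparity map in which the value 0 marks unmatched pixels (an illustrative convention, not the paper's): fill each hole with the average of its known 8-neighbors. Real systems use more elaborate surface interpolation; this only shows the idea.

```c
/* Fill unmatched pixels (value 0) in an 8-bit, single-channel disparity
   map by averaging valid 8-neighbors. In-place and single-pass, so the
   result depends slightly on scan order; acceptable for a sketch. */
#include <cv.h>

void fill_disparity_holes(IplImage* disp)
{
    for (int y = 1; y < disp->height - 1; y++)
        for (int x = 1; x < disp->width - 1; x++) {
            if (CV_IMAGE_ELEM(disp, uchar, y, x) != 0) continue;
            int sum = 0, n = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++) {
                    int v = CV_IMAGE_ELEM(disp, uchar, y + dy, x + dx);
                    if (v != 0) { sum += v; n++; }
                }
            if (n > 0)
                CV_IMAGE_ELEM(disp, uchar, y, x) = (uchar)(sum / n);
        }
}
```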

3 3D reconstruction based on OpenCV

The calibration method used in OpenCV [2] lies between the traditional calibration methods and self-calibration; it was proposed by Zhang Zhengyou [3]. The method does not need to know the specifics of the camera's motion, which makes it more flexible than traditional calibration; at the same time, it still needs a specific calibration object and a set of known feature-primitive coordinates, so it is not as flexible as self-calibration. It computes all the internal and external parameters of the camera from images of the calibration object taken in at least 3 different positions. Because it is more flexible than traditional calibration technology and achieves good calibration accuracy, it was adopted by OpenCV.

Three coordinate systems are used in the calibration process of this model: the image coordinate system, the camera coordinate system, and the world coordinate system. Through the transformations between the coordinate systems, a point \tilde{M} = [X_w, Y_w, Z_w, 1]^T in the world coordinate system is related to its image point \tilde{m} = [u, v, 1]^T by the following formula [7][8]:

s\,\tilde{m} = A\,[R\ \ t]\,\tilde{M}, \qquad
A = \begin{bmatrix} \alpha & \gamma & u_0 \\ 0 & \beta & v_0 \\ 0 & 0 & 1 \end{bmatrix}

where s is a scale factor. Since the matrix A contains the internal parameters of the camera, A is called the camera intrinsic matrix. P_C is the external parameter matrix of the model, given by:

P_C = [R\ \ t]

where R is the rotation matrix and t is the translation vector.

Camera calibration based on OpenCV uses an ordinary checkerboard calibration template. First, the function cvFindChessboardCorners() roughly extracts the corners of the chessboard; then the cvFindCornerSubPix() function refines them to sub-pixel coordinate values; finally, the coordinates are passed to the cvCalibrateCamera2() function to obtain the internal and external parameter values of the camera (the effect is shown in Figure 1; a minimal sketch of this pipeline is given at the end of this section).

Figure 1 The extracted corner points displayed on the image (the chessboard is taken from OpenCV)

OpenCV provides the Sobel, Laplace, Canny, and other operators for edge detection. However, the Canny operator, i.e. the cvCanny() function, is generally used for edge detection and feature extraction (see Figure 2).

Figure 2 Comparison of the image before and after Canny processing

The most difficult and most important part of 3D reconstruction is stereo matching. This paper selects the epipolar constraint method based on feature matching [9]. Assume a point P in space whose projections on the imaging planes of the two cameras are P1 and P2, as shown in Figure 3. C1 and C2 are the centers of the two cameras, i.e. the origins of the two camera coordinate systems. In epipolar geometry, the line connecting C1 and C2 is called the baseline. The intersection points e1 and e2 of the baseline with the two imaging planes are the epipoles of the two cameras; each is the projection of the opposite camera's center onto that imaging plane. The plane through P, C1, and C2 is called the epipolar plane π. The lines l1 and l2 in which π intersects the two imaging planes are called epipolar lines: l1 is the epipolar line corresponding to point P2, l2 is the epipolar line corresponding to point P1, and l1 and l2 correspond to each other.

Figure 3 The epipolar geometry

Now take another point P′ on the epipolar plane π. As the figure shows, its projections on the two camera planes are P1 and P2′, where P2 and P2′ both lie on the epipolar line l2. This is the epipolar constraint: given a point P1, its matching point must lie on the corresponding epipolar line, so the search space is compressed to a one-dimensional line. In OpenCV, one can first use the function cvFindFundamentalMat() to find the fundamental matrix of the image pair, and then substitute the result into the function cvComputeCorrespondEpilines() to find, for a point in one image, the corresponding epipolar line in the other image.
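The sketch below strings together the three calibration calls named above, under stated assumptions: the image file names (board0.jpg, ...), the board size of 7 x 5 inner corners, and the use of a unit square size for the world plane are all illustrative, not values from the paper.

```c
/* Hedged sketch of the chessboard calibration pipeline with the legacy
   OpenCV 1.x C API: rough corners -> sub-pixel refinement -> calibration. */
#include <cv.h>
#include <highgui.h>
#include <stdio.h>

#define N_VIEWS 3                      /* at least 3 views, per the text */
#define BOARD_W 7                      /* inner corners per row (assumed) */
#define BOARD_H 5                      /* inner corners per column        */
#define N_PTS   (BOARD_W * BOARD_H)

int main(void)
{
    CvMat* object_points = cvCreateMat(N_VIEWS * N_PTS, 3, CV_32FC1);
    CvMat* image_points  = cvCreateMat(N_VIEWS * N_PTS, 2, CV_32FC1);
    CvMat* point_counts  = cvCreateMat(N_VIEWS, 1, CV_32SC1);
    CvMat* camera_matrix = cvCreateMat(3, 3, CV_32FC1);
    CvMat* dist_coeffs   = cvCreateMat(1, 4, CV_32FC1);
    CvSize img_size = {0, 0};

    for (int v = 0; v < N_VIEWS; v++) {
        char name[64];
        sprintf(name, "board%d.jpg", v);        /* hypothetical files */
        IplImage* gray = cvLoadImage(name, CV_LOAD_IMAGE_GRAYSCALE);
        if (!gray) return -1;
        img_size = cvGetSize(gray);

        CvPoint2D32f corners[N_PTS];
        int found = 0;
        /* 1. rough corner extraction */
        if (!cvFindChessboardCorners(gray, cvSize(BOARD_W, BOARD_H),
                                     corners, &found,
                                     CV_CALIB_CB_ADAPTIVE_THRESH)
            || found != N_PTS)
            return -1;

        /* 2. refine corners to sub-pixel accuracy */
        cvFindCornerSubPix(gray, corners, found,
                           cvSize(11, 11), cvSize(-1, -1),
                           cvTermCriteria(CV_TERMCRIT_EPS | CV_TERMCRIT_ITER,
                                          30, 0.1));

        /* the board defines the world plane Z = 0, unit square size */
        for (int i = 0; i < N_PTS; i++) {
            CV_MAT_ELEM(*object_points, float, v * N_PTS + i, 0) = (float)(i % BOARD_W);
            CV_MAT_ELEM(*object_points, float, v * N_PTS + i, 1) = (float)(i / BOARD_W);
            CV_MAT_ELEM(*object_points, float, v * N_PTS + i, 2) = 0.0f;
            CV_MAT_ELEM(*image_points,  float, v * N_PTS + i, 0) = corners[i].x;
            CV_MAT_ELEM(*image_points,  float, v * N_PTS + i, 1) = corners[i].y;
        }
        CV_MAT_ELEM(*point_counts, int, v, 0) = N_PTS;
        cvReleaseImage(&gray);
    }

    /* 3. solve for the intrinsic matrix A and distortion coefficients */
    cvCalibrateCamera2(object_points, image_points, point_counts,
                       img_size, camera_matrix, dist_coeffs, NULL, NULL, 0);

    printf("fx = %f\n", CV_MAT_ELEM(*camera_matrix, float, 0, 0));
    return 0;
}
```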

After the epipolar line is obtained, the pixels along the epipolar line in the other image are compared for gray-scale similarity, which quickly yields the matching point of the given point on the corresponding image.
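A hedged sketch of these two calls, assuming n ≥ 8 rough correspondences are already stored in pts1 and pts2 (the names and the RANSAC parameters are assumptions); the gray-scale comparison from Section 2.4 can then be run along each returned line:

```c
/* Estimate the fundamental matrix from matched point pairs, then compute,
   for each point in image 1, its epipolar line in image 2.
   pts1, pts2: n x 2 CV_32FC1 matrices of corresponding points (n >= 8).
   lines2:     n x 3 CV_32FC1 output; each row (a, b, c) describes the
               line a*x + b*y + c = 0 in image 2. */
#include <cv.h>

void epipolar_lines(const CvMat* pts1, const CvMat* pts2, CvMat* lines2)
{
    CvMat* F = cvCreateMat(3, 3, CV_32FC1);

    /* robust estimation; CV_FM_RANSAC tolerates some false matches */
    if (cvFindFundamentalMat(pts1, pts2, F, CV_FM_RANSAC, 1.0, 0.99, NULL))
        cvComputeCorrespondEpilines(pts1, 1, F, lines2);

    cvReleaseMat(&F);
}
```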

4 Experimental results

Based on the above principles and OpenCV functions, a complete 3D reconstruction system was developed with VC 6.0. Through the above six steps, the shape of the object is finally recovered. The program has been rigorously tested and runs stably. When performing camera calibration, note that the more pictures used (at least 3), the more accurate the computed internal and external parameters; moreover, the optical axes of the cameras corresponding to any two images must not be parallel.

Figure 4 The left and right corresponding images used in the experiment
Figure 5 The extracted contours
Figure 6 The matching process (the white points in the figure mark a pair of corresponding points)
Figure 7 The point reconstruction result (rendered with OpenGL)

5 Conclusion and outlook

As an important branch of computer vision, 3D reconstruction has always been one of the focuses and hotspots of computer vision research. It directly simulates the way human vision processes a scene and can flexibly measure the three-dimensional information of a scene under a variety of conditions, so research on it is of great significance both from the perspective of visual physiology and from the perspective of engineering applications. Three-dimensional reconstruction has great advantages in obtaining the depth information of an object from its two-dimensional images. The 3D reconstruction system developed with OpenCV in this paper has the advantages of simple computation, accurate results, and high operating efficiency, and it runs across multiple platforms, so it can be used effectively in a wide range of computer vision applications. The system is suitable for the three-dimensional measurement of space objects with a small measurement range and little occlusion. For more severe occlusion, the number of cameras must be increased so that objects are captured from more directions and reconstructed using the same binocular stereo vision principle.

References

[1] Park J S. Interactive 3D reconstruction from multiple images: a primitive-based approach [J]. Pattern Recognition Letters, 2005, 26(16): 2558-2571
[2] Intel Corporation. Open Source Computer Vision Library Reference Manual [S]. 2001-12
[3] Ma Songde, Zhang Zhengyou. Computer Vision: Computational Theory and Algorithm Foundation [M]. Beijing: Science Press, 2003
[4] Mao Jianfei, Zou Xiyong, Zhu Jing. Improved two-step calibration of a camera with a planar template [J]. Journal of Image and Graphics, 2004, 9(7): 846-852
[5] Xu Yi, Zhou Jun, Zhou Yuanhua. Stereo vision matching technology [J]. Computer Engineering and Applications, 2003, 39(15): 1-5
[6] Pollefeys M, Koch R, Van Gool L. Self-calibration and metric reconstruction in spite of varying and unknown internal camera parameters [C]. Proc. of International Conference on Computer Vision, Bombay, India, 1998: 90
[7] Hartley R I, Zisserman A. Multiple View Geometry in Computer Vision [M]. Cambridge University Press, 2000
[8] Wu Fuchao, Li Hua, Hu Zhanyi. A new camera self-calibration method based on active vision system [J]. Chinese Journal of Computers, 2000, 23(11): 1130-1139
[9] Wen Gongjian, Wang Runsheng. A robust line extraction algorithm [J]. Journal of Software, 2001, 12(11): 1660-1666
