ORB-SLAM2 source code analysis (monocular) - initialization process

1. Introduction to ORB-SLAM2

ORB-SLAM2 features:
[Figure: overview of ORB-SLAM2 features]

ORB-SLAM2 overall framework:
[Figure: ORB-SLAM2 overall framework]

(1) Input:
There are three modes to choose from: monocular, stereo, and RGB-D.
(2) Tracking:
After initialization succeeds, tracking first uses the reference keyframe; afterwards, most of the time, the constant-velocity motion model is used. When tracking is lost, relocalization is started. The tracking above yields an initial pose estimate, which is then refined by tracking the local map. The tracker also judges, according to several conditions, whether the current frame should be created as a new keyframe.
(3) Local mapping:
The input keyframes are the ones newly created by the tracking thread. To increase the number of local map points, feature matching between keyframes in the local map is performed again to triangulate new map points.
(4) Local BA:
Local BA optimizes the poses of the keyframes in the covisibility graph together with their map points, and after optimization it also removes inaccurate map points and redundant keyframes.
(5) Loop closure
The bag-of-words database is queried to detect whether a loop has occurred, and the Sim3 pose between the current keyframe and the loop-candidate keyframe is computed. The scale is only estimated in monocular mode; in stereo or RGB-D mode the scale is fixed at 1. Loop fusion and essential-graph optimization are then performed to make all keyframe poses more accurate.
(6) Global BA
Optimizes all keyframes and their map points.
(7) Location recognition
An offline-trained dictionary, built with the visual bag-of-words model, must be imported. Newly input image frames are first converted into bag-of-words vectors online; these are mainly used for feature matching, relocalization, and loop closing.
(8) Map
The map consists mainly of map points and keyframes. The keyframes form a covisibility graph according to the number of map points they observe in common, and a spanning tree according to their parent-child relationships.

2. SLAM system initialization

Download the ORB-SLAM2 source code and open the monocular camera demo.
(1) Enter the main function:
The main function loads the video file and uses it to run the entire ORB-SLAM2 process.
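A minimal sketch of such a demo (modeled on the stock ORB-SLAM2 examples; the 30 fps timestamping and the command-line layout are assumptions, not the exact demo code):

```cpp
#include <string>
#include <opencv2/opencv.hpp>

#include "System.h"   // ORB_SLAM2::System

int main(int argc, char **argv)
{
    // argv[1]: ORB vocabulary file, argv[2]: settings YAML, argv[3]: video file
    if(argc != 4)
        return 1;

    // Create the SLAM system in monocular mode. The constructor also starts
    // the local mapping, loop closing and viewer threads (see section 2).
    ORB_SLAM2::System SLAM(argv[1], argv[2], ORB_SLAM2::System::MONOCULAR, true);

    cv::VideoCapture cap(argv[3]);
    cv::Mat frame;
    double t = 0;
    while(cap.read(frame))
    {
        SLAM.TrackMonocular(frame, t);  // returns the estimated camera pose Tcw
        t += 1.0 / 30.0;                // timestamps assumed at 30 fps
    }

    SLAM.Shutdown();                    // stop all threads cleanly
    return 0;
}
```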
(2) SLAM System constructor (a sketch of the whole constructor follows after step (7)):

(3) Load ORB dictionary

The ORB vocabulary is loaded for later bag-of-words matching.
(4) Initialize the tracker (it runs in the main thread)

(5) Initialize local mapping and run the local mapping thread

(6) Initialize loop closing and run the loop closing thread

(7) Initialize the viewer
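Putting steps (2) to (7) together, the body of the System constructor looks roughly like this (a condensed sketch of the actual ORB-SLAM2 code; error handling and the construction of the frame/map drawers are omitted):

```cpp
// Inside System::System(strVocFile, strSettingsFile, sensor, bUseViewer)

// (3) Load the offline-trained ORB vocabulary used for bag-of-words matching
mpVocabulary = new ORBVocabulary();
bool bVocLoad = mpVocabulary->loadFromTextFile(strVocFile);
mpKeyFrameDatabase = new KeyFrameDatabase(*mpVocabulary);
mpMap = new Map();

// (4) The tracker is constructed here but runs in the main thread:
// it is driven by each call to TrackMonocular()
mpTracker = new Tracking(this, mpVocabulary, mpFrameDrawer, mpMapDrawer,
                         mpMap, mpKeyFrameDatabase, strSettingsFile, mSensor);

// (5) Local mapping gets its own thread
mpLocalMapper = new LocalMapping(mpMap, mSensor==MONOCULAR);
mptLocalMapping = new std::thread(&ORB_SLAM2::LocalMapping::Run, mpLocalMapper);

// (6) Loop closing gets its own thread; scale is only free in monocular mode
mpLoopCloser = new LoopClosing(mpMap, mpKeyFrameDatabase, mpVocabulary,
                               mSensor!=MONOCULAR);
mptLoopClosing = new std::thread(&ORB_SLAM2::LoopClosing::Run, mpLoopCloser);

// (7) The viewer (Pangolin-based) also runs in its own thread
if(bUseViewer)
{
    mpViewer = new Viewer(this, mpFrameDrawer, mpMapDrawer,
                          mpTracker, strSettingsFile);
    mptViewer = new std::thread(&Viewer::Run, mpViewer);
}

// Finally the threads are given pointers to each other so they can communicate
mpTracker->SetLocalMapper(mpLocalMapper);
mpTracker->SetLoopClosing(mpLoopCloser);
```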

3. Tracking thread initialization


Let's step inside and look at what the tracking thread's initialization actually does.
(1) Read the camera intrinsic parameters from the configuration file (see the sketch after step (2))

(2) Read the image pyramid and feature point parameters
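Both steps read the settings YAML through OpenCV's cv::FileStorage; a sketch close to the Tracking constructor (mK and mDistCoef are Tracking members):

```cpp
// Open the settings file passed to the Tracking constructor
cv::FileStorage fSettings(strSettingPath, cv::FileStorage::READ);

// (1) Camera intrinsics -> 3x3 calibration matrix K and distortion coefficients
float fx = fSettings["Camera.fx"];
float fy = fSettings["Camera.fy"];
float cx = fSettings["Camera.cx"];
float cy = fSettings["Camera.cy"];

cv::Mat K = cv::Mat::eye(3,3,CV_32F);
K.at<float>(0,0) = fx;  K.at<float>(1,1) = fy;
K.at<float>(0,2) = cx;  K.at<float>(1,2) = cy;
K.copyTo(mK);

cv::Mat DistCoef(4,1,CV_32F);
DistCoef.at<float>(0) = fSettings["Camera.k1"];
DistCoef.at<float>(1) = fSettings["Camera.k2"];
DistCoef.at<float>(2) = fSettings["Camera.p1"];
DistCoef.at<float>(3) = fSettings["Camera.p2"];
DistCoef.copyTo(mDistCoef);

// (2) Pyramid and feature-extraction parameters
int nFeatures = fSettings["ORBextractor.nFeatures"];        // total points per image
float fScaleFactor = fSettings["ORBextractor.scaleFactor"]; // scale between levels
int nLevels = fSettings["ORBextractor.nLevels"];            // number of pyramid levels
int fIniThFAST = fSettings["ORBextractor.iniThFAST"];       // initial FAST threshold
int fMinThFAST = fSettings["ORBextractor.minThFAST"];       // fallback FAST threshold
```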

(3) Initialize the ORB feature point extractor
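In the Tracking constructor these parameters are handed to the ORBextractor; a sketch of the relevant lines (in monocular mode a second extractor with twice the features is created just for map initialization):

```cpp
// Extractor used for normal frame-by-frame tracking
mpORBextractorLeft = new ORBextractor(nFeatures, fScaleFactor, nLevels,
                                      fIniThFAST, fMinThFAST);

// Monocular initialization is harder, so until the map exists a second
// extractor collecting twice as many features is used
if(sensor==System::MONOCULAR)
    mpIniORBextractor = new ORBextractor(2*nFeatures, fScaleFactor, nLevels,
                                         fIniThFAST, fMinThFAST);
```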

Let's see how the ORBextractor itself is initialized.

1) Initialize the per-level image scales.
From the number of pyramid levels and the scale factor we pass in, compute the ratio of each pyramid level's image to the original image (see the sketch below).
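A sketch of this part of the ORBextractor constructor (variable names follow the original source; mvLevelSigma2 caches the squared scales, which are used later as uncertainty weights):

```cpp
// nlevels and scaleFactor are the constructor arguments read from the YAML
mvScaleFactor.resize(nlevels);
mvLevelSigma2.resize(nlevels);
mvScaleFactor[0] = 1.0f;              // level 0 is the original image
mvLevelSigma2[0] = 1.0f;
for(int i = 1; i < nlevels; i++)
{
    // cumulative scale of level i relative to the original image
    mvScaleFactor[i] = mvScaleFactor[i-1] * scaleFactor;
    mvLevelSigma2[i] = mvScaleFactor[i] * mvScaleFactor[i];
}

// the inverses are cached as well, since both directions are needed later
mvInvScaleFactor.resize(nlevels);
mvInvLevelSigma2.resize(nlevels);
for(int i = 0; i < nlevels; i++)
{
    mvInvScaleFactor[i] = 1.0f / mvScaleFactor[i];
    mvInvLevelSigma2[i] = 1.0f / mvLevelSigma2[i];
}
```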

2) Initialize the number of feature points extracted on each pyramid level.
The total number of requested features is distributed over the levels: the larger a level's image, the more feature points it receives, and vice versa (see the sketch below).
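The distribution is a geometric series over the levels that sums, after rounding, to the requested total; a sketch of the original code:

```cpp
mnFeaturesPerLevel.resize(nlevels);
float factor = 1.0f / scaleFactor;
// first term of the geometric series n, n*factor, n*factor^2, ...
// chosen so that the whole series sums to the requested total nfeatures
float nDesiredFeaturesPerScale =
        nfeatures * (1 - factor) / (1 - (float)pow((double)factor, (double)nlevels));

int sumFeatures = 0;
for(int level = 0; level < nlevels-1; level++)
{
    mnFeaturesPerLevel[level] = cvRound(nDesiredFeaturesPerScale);
    sumFeatures += mnFeaturesPerLevel[level];
    nDesiredFeaturesPerScale *= factor;   // fewer features on smaller levels
}
// whatever rounding left over goes to the coarsest level
mnFeaturesPerLevel[nlevels-1] = std::max(nfeatures - sumFeatures, 0);
```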

3) Initialize the parameters related to the grayscale centroid of the ORB feature points.
The original FAST keypoint carries no orientation information, so when the image rotates, the BRIEF descriptor changes with it and the feature is not robust to rotation. The grayscale centroid (intensity centroid) method is therefore used to assign an orientation to each feature point, giving it rotation invariance.
How to calculate the grayscale centroid?
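In the standard notation of the ORB paper, the patch moments, the centroid, and the resulting orientation are:

```latex
% Image moments over the circular patch B centred on the keypoint
m_{pq} = \sum_{(x,y) \in B} x^{p} y^{q} \, I(x,y)

% Grayscale centroid of the patch
C = \left( \frac{m_{10}}{m_{00}}, \; \frac{m_{01}}{m_{00}} \right)

% Orientation: the angle of the vector from the patch centre O to C
\theta = \operatorname{atan2}(m_{01}, m_{10})
```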

Calculate grayscale centroid within a circle
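In code this is done by walking the circular patch row by row; a sketch along the lines of ORB-SLAM2's IC_Angle function (HALF_PATCH_SIZE is 15 in the original source, and u_max is the circle boundary computed in the sketch further below):

```cpp
const int HALF_PATCH_SIZE = 15;   // patch radius used by ORB-SLAM2

static float IC_Angle(const cv::Mat& image, cv::Point2f pt,
                      const std::vector<int>& u_max)
{
    int m_01 = 0, m_10 = 0;
    const uchar* center = &image.at<uchar>(cvRound(pt.y), cvRound(pt.x));

    // central row (v = 0): only contributes to m_10
    for (int u = -HALF_PATCH_SIZE; u <= HALF_PATCH_SIZE; ++u)
        m_10 += u * center[u];

    // remaining rows, taken in symmetric pairs +v / -v
    int step = (int)image.step1();
    for (int v = 1; v <= HALF_PATCH_SIZE; ++v)
    {
        int v_sum = 0;
        int d = u_max[v];              // half-width of the circle at row v
        for (int u = -d; u <= d; ++u)
        {
            int val_plus  = center[u + v*step];
            int val_minus = center[u - v*step];
            v_sum += (val_plus - val_minus);    // y-weighted part of m_01
            m_10  += u * (val_plus + val_minus);
        }
        m_01 += v * v_sum;
    }

    // atan2(m_01, m_10) is the angle of the centroid vector
    return cv::fastAtan2((float)m_01, (float)m_10);
}
```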

Why use a circle to compute the grayscale centroid instead of another shape such as a square?
Because in ORB-SLAM the sampling coordinates are rotated first and the pixels are then read from the unrotated image; it is not the case that a patch is cropped first and then rotated. With a square patch, the rotated and unrotated samplings would cover different pixels (the green and yellow regions in the figure below), whereas a circle covers the same pixels under any rotation.
[Figure: a rotated square patch samples different pixels, while a rotated circular patch samples the same ones]

ORB-SLAM2 initializes the centroid-related parameters as follows.

To compute the grayscale centroid of a feature point we need, according to the formula, the gray values of all pixels inside the circular patch. So we must first determine the boundary of the circle, i.e. which pixels take part in the computation and which do not. The array umax stores the boundary of one quarter of the circle (u, v ≥ 0): the rows below 45 degrees are computed directly from the circle equation, and the rest are obtained through symmetry, which keeps the boundary exactly symmetric when the full circle is reconstructed.
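A sketch of the corresponding part of the ORBextractor constructor; umax[v] stores, for each row v, the largest column u still inside the circle of radius HALF_PATCH_SIZE:

```cpp
umax.resize(HALF_PATCH_SIZE + 1);

// rows below 45 degrees: compute the boundary directly from the circle equation
int vmax = cvFloor(HALF_PATCH_SIZE * sqrt(2.f) / 2 + 1);
int vmin = cvCeil(HALF_PATCH_SIZE * sqrt(2.f) / 2);
const double hp2 = HALF_PATCH_SIZE * HALF_PATCH_SIZE;
for (int v = 0; v <= vmax; ++v)
    umax[v] = cvRound(sqrt(hp2 - v * v));

// rows above 45 degrees: mirror the values already computed, so that the
// boundary is exactly symmetric (a plain sqrt would break symmetry by rounding)
for (int v = HALF_PATCH_SIZE, v0 = 0; v >= vmin; --v)
{
    while (umax[v0] == umax[v0 + 1])
        ++v0;
    umax[v] = v0;
    ++v0;
}
```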

4. Local mapping initialization

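The LocalMapping constructor itself only stores the map pointer and initializes the state flags used to coordinate with the other threads; a sketch of the original code:

```cpp
LocalMapping::LocalMapping(Map *pMap, const float bMonocular):
    mbMonocular(bMonocular),        // monocular mode changes some heuristics
    mbResetRequested(false),
    mbFinishRequested(false),
    mbFinished(true),
    mpMap(pMap),
    mbAbortBA(false),               // set by the tracker to interrupt local BA
    mbStopped(false),
    mbStopRequested(false),
    mbNotStop(false),
    mbAcceptKeyFrames(true)         // whether new keyframes may be inserted
{
}
```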

5. Loop closing initialization


The constructor sets how many consecutive keyframes must detect the same loop candidate before it is counted as a real loop (the covisibility consistency threshold).
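A sketch of the LoopClosing constructor; mnCovisibilityConsistencyTh is the consecutive-detection threshold mentioned above (3 in the original source):

```cpp
LoopClosing::LoopClosing(Map *pMap, KeyFrameDatabase *pDB,
                         ORBVocabulary *pVoc, const bool bFixScale):
    mbResetRequested(false),
    mbFinishRequested(false),
    mbFinished(true),
    mpMap(pMap),
    mpKeyFrameDB(pDB),             // bag-of-words database for loop candidates
    mpORBVocabulary(pVoc),
    mpMatchedKF(NULL),
    mLastLoopKFid(0),
    mbRunningGBA(false),           // state of the global BA thread
    mbFinishedGBA(true),
    mbStopGBA(false),
    mpThreadGBA(NULL),
    mbFixScale(bFixScale),         // scale fixed to 1 for stereo/RGB-D
    mnFullBAIdx(0)
{
    // a loop candidate must be consistent over 3 consecutive keyframes
    mnCovisibilityConsistencyTh = 3;
}
```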

Source: blog.csdn.net/weixin_43391596/article/details/129692085