14 Lectures on Visual SLAM (Second Edition): Notes on Lecture 12

Chapter 12: Mapping

Mapping is one of the two major goals of SLAM. The preceding chapters covered localization: the feature-point method, the direct method, and back-end optimization. In the classical SLAM model, the so-called map is simply the set of all landmark points. Once the landmark positions are determined, the map is complete. In that sense, visual odometry and Bundle Adjustment already accomplish the mapping task, since they optimize the landmark positions along the way.

The demands placed on a map vary, however. SLAM is an underlying technology that often supplies information to higher-level applications. A robot vacuum cleaner, for example, needs to compute a path that covers the entire map while it cleans. If the upper layer is an augmented reality device, the developer may want to superimpose virtual objects on real ones. From the viewpoint of visual SLAM, mapping serves localization; from the viewpoint of the application layer, mapping clearly carries many other requirements:

Localization: the basic function of a map, which goes without saying. The visual odometry chapters discussed how to localize against a local map. In loop closure detection, as long as global descriptor information is available, the robot's position can also be determined from a detected loop.

Navigation: planning a route on the map, finding a path between any two points on it, and then controlling the robot to move to the target. For this we need to know at least which parts of the map are impassable and which are passable. This is beyond the capability of a sparse feature-point map; at least a dense map is required.

Obstacle avoidance: a problem robots often encounter. It resembles navigation but focuses more on local, dynamic obstacles. Likewise, feature points alone cannot tell us whether a given point is an obstacle, so a dense map is needed.

Reconstruction: using SLAM to reconstruct the surrounding environment, mainly to show it to other people, for example in three-dimensional video calls or online shopping. Such a map is also dense, and we may want to build textured surfaces, like the three-dimensional scenes in video games.

Interaction: mainly the interaction between people and the map. For example, in augmented reality we place virtual objects in a room and interact with them, as in science-fiction movies: clicking a virtual web browser to watch a video, or throwing objects at a wall.

1. Monocular dense reconstruction

a. Approaches to dense reconstruction:

With a monocular camera, estimate each pixel's distance by triangulation after the camera has moved.

With a stereo camera, compute each pixel's distance from the disparity between the left and right images (multi-camera setups work on the same principle).

With an RGB-D camera, obtain each pixel's distance directly.

But obtaining depth from monocular or stereo cameras is often "thankless": a great deal of computation is spent, and the resulting depth estimates are still not very reliable. RGB-D, of course, has its own limits on measurement range, application range, and lighting.
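To make the stereo option concrete, here is a minimal sketch of the standard rectified-stereo relation z = f * b / d (depth from focal length, baseline, and disparity). The function name and the numbers are illustrative assumptions, not taken from the book.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a pixel in a rectified stereo pair: z = f * b / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Illustrative values (assumed): 700 px focal length, 12 cm baseline.
z_near = depth_from_disparity(focal_px=700.0, baseline_m=0.12, disparity_px=42.0)
z_far = depth_from_disparity(focal_px=700.0, baseline_m=0.12, disparity_px=4.2)
print(z_near, z_far)
```

Because depth is inversely proportional to disparity, a small disparity error matters much more for distant points than for near ones, which is one reason these estimates can be unreliable.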

b. Depth filter technique: as measurements accumulate, the depth estimate gradually converges from a very uncertain quantity to a stable value.
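The convergence described above can be sketched with a Gaussian depth filter, assuming both the current estimate and each new observation are Gaussian. Fusing two Gaussians is a product of densities; the function name and the sample measurements below are hypothetical.

```python
def fuse_gaussian(mu, sigma2, mu_obs, sigma2_obs):
    """Fuse the current depth estimate N(mu, sigma2) with one observation."""
    fused_mu = (sigma2_obs * mu + sigma2 * mu_obs) / (sigma2 + sigma2_obs)
    fused_sigma2 = (sigma2 * sigma2_obs) / (sigma2 + sigma2_obs)
    return fused_mu, fused_sigma2


# Start from a very uncertain guess and feed in noisy depth observations.
mu, sigma2 = 3.0, 3.0
for z in [2.1, 1.9, 2.05, 1.95, 2.0]:  # made-up measurements near 2 m
    mu, sigma2 = fuse_gaussian(mu, sigma2, z, 0.04)
    print(f"depth = {mu:.3f}, variance = {sigma2:.4f}")
```

The variance shrinks with every observation, so a convergence test can simply compare it against a threshold.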

2. Epipolar search and block matching

The figure referred to above mainly illustrates the epipolar line search. We walk along the epipolar line in the other image from one end to the other, comparing each pixel's similarity with p1 one by one. As with the direct method, however, the brightness of a single pixel is not distinctive enough, so we compare blocks of pixels instead: take a small block of size w × w around p1, take many blocks of the same size along the epipolar line, and compare them. This improves discrimination to some extent and is called block matching.
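A minimal sketch of this search, under the simplifying assumption that the epipolar line coincides with an image row (in general it is an arbitrary line and sub-pixel interpolation is needed). Images are plain nested lists, the score is a sum of squared differences, and all names are hypothetical.

```python
def patch(img, row, col, half):
    """Flatten the (2*half+1) x (2*half+1) block centred on (row, col)."""
    return [img[r][c]
            for r in range(row - half, row + half + 1)
            for c in range(col - half, col + half + 1)]


def ssd(a, b):
    """Sum of squared differences between two flattened blocks."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_along_row(ref_img, cur_img, row, col, half=1):
    """Slide a block along one row of cur_img and keep the best match."""
    ref = patch(ref_img, row, col, half)
    best_col, best_cost = None, float("inf")
    for c in range(half, len(cur_img[0]) - half):
        cost = ssd(ref, patch(cur_img, row, c, half))
        if cost < best_cost:
            best_col, best_cost = c, cost
    return best_col, best_cost


# Synthetic test images: cur_img is ref_img shifted two columns to the right.
ref_img = [[(3 * r + c * c) % 17 for c in range(8)] for r in range(5)]
cur_img = [[ref_img[r][c - 2] if c >= 2 else 0 for c in range(8)]
           for r in range(5)]
best_col, best_cost = search_along_row(ref_img, cur_img, row=2, col=3)
print(best_col, best_cost)
```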

a. Computing the difference between the reference block and each candidate block

SAD (Sum of Absolute Differences): the sum of the absolute differences between the two blocks.

SSD (Sum of Squared Distances): the sum of the squared differences between the two blocks.

NCC (Normalized Cross Correlation): the normalized correlation between the two blocks.
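The three scores can be written in a few lines. This is a sketch over flattened blocks (plain lists of pixel values); the NCC here is the plain form without mean removal, which is a choice made for brevity rather than something the notes specify.

```python
import math


def sad(a, b):
    """Sum of Absolute Differences: 0 means identical blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))


def ssd(a, b):
    """Sum of Squared Distances: 0 means identical blocks."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ncc(a, b):
    """Normalized Cross Correlation: values near 1 mean similar blocks."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0


block_a = [1, 2, 3, 4]
block_b = [2, 4, 6, 8]  # same pattern, twice as bright
print(sad(block_a, block_b), ssd(block_a, block_b), ncc(block_a, block_b))
```

For SAD and SSD a smaller value means a better match, while for NCC a larger value does. The example pair differs only by a brightness scale, which SAD and SSD penalize but NCC does not.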

3. Parallelization: the efficiency problem

During the epipolar search, the computation for one pixel is not necessarily related to that of its neighbors. We can therefore use multiple threads to compute the estimate for each pixel separately and then merge the results.
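Since the pixels are independent, the work maps directly onto a thread pool. The sketch below uses Python's standard `concurrent.futures`; `estimate_pixel` is a hypothetical stand-in for the real per-pixel epipolar search and depth update.

```python
from concurrent.futures import ThreadPoolExecutor


def estimate_pixel(pixel):
    """Hypothetical per-pixel job; returns (row, col, depth)."""
    row, col = pixel
    return row, col, 1.0 + 0.1 * row + 0.01 * col  # placeholder depth


def estimate_all(pixels, workers=4):
    """Run every pixel independently, then merge into one depth map."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(estimate_pixel, pixels))
    return {(r, c): d for r, c, d in results}


pixels = [(r, c) for r in range(3) for c in range(4)]
depth_map = estimate_all(pixels)
print(len(depth_map))
```

In standard CPython the global interpreter lock limits the speed-up for pure-Python arithmetic, so this only shows the structure; a C++ or GPU implementation would follow the same pattern with real parallelism.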

4. RGB-D dense mapping

An RGB-D camera obtains depth by measurement in the sensor hardware, without spending large amounts of computing resources on estimation. Furthermore, the structured-light or time-of-flight principle behind RGB-D cameras ensures that the depth data is independent of texture. Dense mapping with RGB-D is therefore relatively easy.

a. Ways of building the map

The most intuitive and simple of the mainstream dense mapping approaches: using the estimated camera poses, convert the RGB-D data into point clouds (Point Cloud), stitch them together, and obtain a point cloud map (Point Cloud Map) made up of discrete points.
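A minimal sketch of the conversion step, assuming a pinhole camera with intrinsics fx, fy, cx, cy and a camera-to-world pose given as a rotation matrix R and translation t. Color is omitted for brevity, and all names and values are illustrative.

```python
def back_project(u, v, depth, fx, fy, cx, cy):
    """Pixel (u, v) with measured depth -> 3D point in the camera frame."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def transform(point, R, t):
    """Apply the camera-to-world pose: p_world = R * p_camera + t."""
    return tuple(sum(R[i][j] * point[j] for j in range(3)) + t[i]
                 for i in range(3))


def frame_to_cloud(depth_img, R, t, fx, fy, cx, cy):
    """Convert one depth image into world-frame points."""
    cloud = []
    for v, row in enumerate(depth_img):
        for u, d in enumerate(row):
            if d <= 0:  # zero depth means no measurement
                continue
            cloud.append(transform(back_project(u, v, d, fx, fy, cx, cy), R, t))
    return cloud


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
depth_img = [[1.0, 0.0], [2.0, 2.0]]  # tiny made-up depth image, in metres
cloud = frame_to_cloud(depth_img, identity, (0.0, 0.0, 1.0),
                       fx=2.0, fy=2.0, cx=0.5, cy=0.5)
print(cloud)
```

Stitching the map then amounts to concatenating the clouds from all frames, each transformed with its own pose.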

b. Octree map

Point cloud maps have several significant drawbacks. First, they are usually very large, yet in many cases this size is unnecessary: much of it is detail nobody needs. Second, a point cloud map cannot handle moving objects, because the approach only ever adds points and has no way to remove a point once it has disappeared. In a real, changing environment this makes point cloud maps less practical.
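One way to address both drawbacks is to store an occupancy probability per voxel and update it in log-odds form, so that repeated "free" observations can cancel earlier "occupied" ones. The sketch below is an assumption-laden illustration: it uses a flat dictionary of voxels rather than a real octree, and the class name, resolution, and hit/miss probabilities are illustrative values.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


class OccupancyVoxels:
    """Flat voxel hash standing in for the leaves of an octree."""

    def __init__(self, resolution=0.05, p_hit=0.7, p_miss=0.4):
        self.resolution = resolution
        self.l_hit = logit(p_hit)    # added when a voxel is seen occupied
        self.l_miss = logit(p_miss)  # added (negative) when seen free
        self.log_odds = {}

    def key(self, point):
        """Index of the voxel that contains the point."""
        return tuple(math.floor(c / self.resolution) for c in point)

    def update(self, point, occupied):
        k = self.key(point)
        delta = self.l_hit if occupied else self.l_miss
        self.log_odds[k] = self.log_odds.get(k, 0.0) + delta

    def occupancy(self, point):
        l = self.log_odds.get(self.key(point), 0.0)
        return 1.0 / (1.0 + math.exp(-l))


grid = OccupancyVoxels()
a, b = (1.01, 2.02, 0.53), (1.02, 2.03, 0.54)  # two points in one voxel
for _ in range(3):
    grid.update(a, occupied=True)
p_after_hits = grid.occupancy(b)
for _ in range(8):
    grid.update(a, occupied=False)  # the object has moved away
p_after_misses = grid.occupancy(b)
print(p_after_hits, p_after_misses)
```

Nearby points collapse into one voxel, which bounds the map size, and the probability falls again once the voxel is repeatedly observed as free.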

c. * TSDF maps and the Fusion family

There is a research direction similar to SLAM but slightly different from it: real-time three-dimensional reconstruction. In the map models above, localization is the main task, and map stitching is placed in the SLAM framework as a subsequent processing step.

Real-time three-dimensional reconstruction takes an accurately reconstructed map as its main goal, so it basically requires GPU acceleration and often heavy computing hardware. In contrast, SLAM is developing toward lightweight, miniaturized systems, and some approaches even drop the mapping and loop closure detection parts, keeping only visual odometry.

TSDF (Truncated Signed Distance Function): a TSDF map is a grid map stored in GPU memory rather than main memory. Exploiting the parallel nature of the GPU, each voxel can be computed and updated in parallel.
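The per-voxel update can be sketched as a weighted running average of truncated signed distances, a common choice in TSDF fusion systems. Here voxels lie along a single camera ray and the loop runs on the CPU; on a GPU each voxel would be handled by its own thread. The function name, truncation distance, and depth values are assumptions.

```python
def update_voxel(tsdf, weight, sdf, truncation, obs_weight=1.0):
    """Fuse one signed-distance observation into a voxel."""
    if sdf < -truncation:
        return tsdf, weight  # far behind the surface: occluded, skip
    d = min(1.0, sdf / truncation)  # truncate and normalise to [-1, 1]
    new_weight = weight + obs_weight
    return (tsdf * weight + d * obs_weight) / new_weight, new_weight


voxel_depths = [0.8, 0.9, 1.0, 1.1, 1.2, 1.5]  # voxel centres along one ray
tsdf = [1.0] * len(voxel_depths)               # initial value, weight zero
weights = [0.0] * len(voxel_depths)

for measured_depth in [1.0, 1.04]:  # two made-up depth measurements
    for i, z in enumerate(voxel_depths):
        # Signed distance: positive in front of the surface, negative behind.
        tsdf[i], weights[i] = update_voxel(
            tsdf[i], weights[i], measured_depth - z, truncation=0.2)

print([round(v, 3) for v in tsdf])
```

The surface lies where the stored value changes sign, and no voxel reads another voxel's state during the update, which is what makes the parallel GPU implementation straightforward.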