"Fourteen Lectures on Visual SLAM" Ch.5 Camera and Image study notes (2): practical part, stereo vision code walkthrough and Eigen path setup

    This article analyzes the stereoVision code from the practical part of Ch.5 of "Fourteen Lectures on Visual SLAM" as a study record.

1. Expected result

    First, here is the final output of this program.

Two windows appear: the upper one shows the normalized disparity map, and the lower one shows the point cloud.

1.1 Problems encountered

    (1) The CMakeLists.txt provided with Gao Xiang's code is incomplete; you can fix it as follows:

cmake_minimum_required(VERSION 2.8)
set(CMAKE_BUILD_TYPE "Release")
# enable C++11 support
set(CMAKE_CXX_FLAGS "-std=c++11 -O2")
find_package(Pangolin REQUIRED)
find_package(OpenCV REQUIRED)
# must come before add_executable so the target sees it; adjust to your own Eigen install location
include_directories("/usr/local/include/eigen3")
add_executable(stereoVision stereoVision.cpp)

target_link_libraries(stereoVision ${OpenCV_LIBS} ${Pangolin_LIBRARIES})
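With the CMakeLists.txt in place, a typical out-of-source build looks like the sketch below (directory names are assumptions; adjust them to your own checkout):

```shell
# run from the directory containing CMakeLists.txt and stereoVision.cpp
mkdir -p build && cd build
cmake ..        # locates Pangolin, OpenCV, and the Eigen include path
make            # produces the stereoVision executable
./stereoVision  # run it; the image paths in the source must resolve
```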

(2) Besides this, I hit the following problem (the program crashed with a core dump):

terminate called after throwing an instance of 'cv::Exception'

what(): OpenCV(3.4.15) /home/rxz/opencv3/modules/imgproc/src/median_blur.dispatch.cpp:283: error: (-215:Assertion failed) !_src0.empty() in function 'medianBlur'

Cause: the image files at the paths hard-coded in the source cannot be found.

Solution: modify the image paths (you can find an image's absolute path by viewing its file properties)


// image file paths
string left_file = "/home/rxz/slambook2/ch5/stereo/left.png";
string right_file = "/home/rxz/slambook2/ch5/stereo/right.png";

2. Code Analysis

#include <opencv2/opencv.hpp>
#include <vector>
#include <string>
#include <Eigen/Core>
#include <pangolin/pangolin.h>
#include <unistd.h>

using namespace std;
using namespace Eigen;

// image file paths
string left_file = "/home/rxz/slambook2/ch5/stereo/left.png";
string right_file = "/home/rxz/slambook2/ch5/stereo/right.png";

// draws the point cloud in Pangolin; already implemented, no changes needed
void showPointCloud(
    const vector<Vector4d, Eigen::aligned_allocator<Vector4d>> &pointcloud);

int main(int argc, char **argv) {

    // camera intrinsics
    double fx = 718.856, fy = 718.856, cx = 607.1928, cy = 185.2157;
    // stereo baseline
    double b = 0.573;

    // load both images as grayscale (flag 0)
    cv::Mat left = cv::imread(left_file, 0);
    cv::Mat right = cv::imread(right_file, 0);
    cv::Ptr<cv::StereoSGBM> sgbm = cv::StereoSGBM::create(
        0, 96, 9, 8 * 9 * 9, 32 * 9 * 9, 1, 63, 10, 100, 32);    // classic SGBM parameter configuration
    cv::Mat disparity_sgbm, disparity;
    sgbm->compute(left, right, disparity_sgbm);  // inputs: the two images; the third argument is the output
    // convert the raw 16-bit fixed-point result into a CV_32F disparity map (scale 1/16);
    // this CV_32F disparity is the actual disparity map. When a Mat's depth does not meet the
    // requirements, convertTo() performs the conversion between different data types
    disparity_sgbm.convertTo(disparity, CV_32F, 1.0 / 16.0f);

    // generate the point cloud
    vector<Vector4d, Eigen::aligned_allocator<Vector4d>> pointcloud;

    // if your machine is slow, change v++ and u++ below to v += 2, u += 2
    // by default the computed result is the left disparity map
    for (int v = 0; v < left.rows; v++)
        for (int u = 0; u < left.cols; u++) {
            // skip pixels whose disparity falls outside the search range (96 here)
            if (disparity.at<float>(v, u) <= 0.0 || disparity.at<float>(v, u) >= 96.0) continue;

            // first three components are xyz; the fourth is the gray value normalized to [0, 1]
            Vector4d point(0, 0, 0, left.at<uchar>(v, u) / 255.0);

            // recover the pixel's 3D position from the stereo model
            double x = (u - cx) / fx;
            double y = (v - cy) / fy;
            double depth = fx * b / (disparity.at<float>(v, u));   // disparity to depth
            // write the 3D position into the first three components
            point[0] = x * depth;
            point[1] = y * depth;
            point[2] = depth;

            pointcloud.push_back(point);
        }

    cv::imshow("disparity", disparity / 96.0);  // normalized disparity map
    cv::waitKey(0);
    // draw the point cloud
    showPointCloud(pointcloud);
    //  cv::waitKey(0);
    //    cv::destroyAllWindows();
    return 0;
}

void showPointCloud(const vector<Vector4d, Eigen::aligned_allocator<Vector4d>> &pointcloud) {

    if (pointcloud.empty()) {
        cerr << "Point cloud is empty!" << endl;
        return;
    }
    // create a Pangolin window
    pangolin::CreateWindowAndBind("Point Cloud Viewer", 1024, 768);
    glEnable(GL_DEPTH_TEST);  // depth test: nearer points occlude farther ones
    glEnable(GL_BLEND);       // blending: allows translucent rendering
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

    pangolin::OpenGlRenderState s_cam(
        pangolin::ProjectionMatrix(1024, 768, 500, 500, 512, 389, 0.1, 1000),
        pangolin::ModelViewLookAt(0, -0.1, -1.8, 0, 0, 0, 0.0, -1.0, 0.0)
    );
    // ProjectionMatrix() arguments, in order: image width = 1024, height = 768, fx = 500, fy = 500,
    // cx = 512, cy = 389, near plane = 0.1, far plane = 1000
    // ModelViewLookAt() arguments: camera position, look-at point, and which camera axis points up
    // e.g. ModelViewLookAt(0, -0.1, -1.8, 0, 0, 0, 0.0, -1.0, 0.0) places the camera at
    // (0, -0.1, -1.8) looking at the origin, with the camera's up direction along (0, -1, 0)
    // create an interactive view for displaying the rendered content
    pangolin::View &d_cam = pangolin::CreateDisplay()
        .SetBounds(0.0, 1.0, pangolin::Attach::Pix(175), 1.0, -1024.0f / 768.0f)
        .SetHandler(new pangolin::Handler3D(s_cam));
    // the first four SetBounds() arguments give the view's extent within the window as relative values in [0.0, 1.0]:
    // 1st: bottom, the view's lower edge; 2nd: top, the upper edge
    // 3rd: left, the left edge (here given in pixels via pangolin::Attach::Pix(175)); 4th: right, the right edge
    // 5th: aspect, the width-to-height ratio

    while (pangolin::ShouldQuit() == false) {  // run until the Pangolin window is closed
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);  // clear color and depth buffers so successive frames do not interfere

        d_cam.Activate(s_cam);  // activate the view with the given camera state
        glClearColor(1.0f, 1.0f, 1.0f, 1.0f);  // white background

        glPointSize(2);
        glBegin(GL_POINTS);  // draw the point cloud
        for (auto &p: pointcloud) {
            glColor3f(p[3], p[3], p[3]);  // gray level stored in the fourth component
            glVertex3d(p[0], p[1], p[2]);  // 3D position
        }
        glEnd();
        pangolin::FinishFrame();  // render the frame with the settings above
        usleep(5000);   // sleep 5 ms
    }
    }
    return;
}


3. Principle analysis and important function analysis

3.1 SGBM parameter explanation:

    cv::Ptr<cv::StereoSGBM> sgbm = cv::StereoSGBM::create(
        0, 96, 9, 8 * 9 * 9, 32 * 9 * 9, 1, 63, 10, 100, 32);    // classic SGBM parameter configuration

static Ptr<StereoSGBM> cv::StereoSGBM::create(
    int minDisparity = 0,        // minimum possible disparity value; normally 0, but rectification may shift the images, in which case this needs adjusting
    int numDisparities = 96,     // disparity search range; always greater than zero and, in the current implementation, divisible by 16
    int blockSize = 9,           // matched block size; must be an odd number >= 1, normally in the range 3..11
    int P1 = 8 * 9 * 9,          // first parameter controlling disparity smoothness: penalty on disparity changes of +/-1 between neighboring pixels
    int P2 = 32 * 9 * 9,         // second smoothness parameter: penalty on disparity changes of more than 1; the larger the values, the smoother the disparity; the algorithm requires P2 > P1
    int disp12MaxDiff = 1,       // maximum allowed difference (in integer pixels) in the left-right disparity check; non-positive disables the check
    int preFilterCap = 63,       // truncation value for the prefiltered image pixels
    int uniquenessRatio = 10,    // margin in percent by which the best (minimum) cost must beat the second-best candidate
    int speckleWindowSize = 100, // maximum size of smooth disparity regions whose noise speckles and holes are invalidated; 0 disables speckle filtering, otherwise use 50-200
    int speckleRange = 32        // maximum disparity variation within each connected component; if speckle filtering is on, set a positive value (implicitly multiplied by 16); 1 or 2 is usually good enough
)

3.2 The convertTo function

The disparity map used to compute distance (CV_32F) and the disparity map shown for visual inspection (CV_8U) use different formats, and the computational one needs no cropping or normalization; those steps only serve display readability and appearance. After SGBM runs, disparity_sgbm is divided by 16 to obtain the disparity map used for computation: each value is stored as a 16-bit integer whose lower 4 bits hold the fractional part of the disparity, so the true disparity is the stored value divided by 16.

Usage:

src.convertTo(dst, type, scale, shift)

dst: destination matrix;
type: type of the output matrix, or more precisely, its depth. If negative (usually -1), the output has the same type as the input. Converting bit depth is essentially a linear transformation of the data so that the minimum and maximum values at the original depth map to the minimum and maximum at the target depth, respectively.
scale: scale factor; e.g. to convert an 8-bit grayscale image (0..255) to float data normalized to 0..1, use scale = 1.0/255.
shift: value added to the scaled elements.

3.3 Normalization

cv::imshow("disparity", disparity / 96.0);  // normalized disparity map

Dividing by the search range 96 maps the disparities into [0, 1] for display. Without normalization, the result looks like this:
Origin blog.csdn.net/weixin_70026476/article/details/127351340