Binocular Stereo Vision in Practice: Ranging (Obtaining 3D Coordinates and Depth Information)

Table of contents

1. Introduction

2. Module explanation

2.1 Stereo rectification

2.1.1 Purpose of rectification

2.1.2 Rectification method

2.2 Stereo matching and disparity calculation

2.3 Depth calculation

3. Complete code


1. Introduction

Binocular (stereo) vision is a technique that uses two cameras (or two lenses) to capture the same scene simultaneously and then recovers the depth and three-dimensional structure of the scene with computer algorithms. The two cameras may be mounted in fixed positions, or moved to capture images from different angles to better understand the shape and structure of objects.

The principle is parallax: because the two cameras are separated by a fixed distance (called the baseline), the same object appears at slightly different positions in the two images. By measuring this disparity, a computer can infer the object's depth and three-dimensional position. This mirrors human binocular vision, in which the two eyes cooperate through parallax to let us perceive depth and distance.

Binocular vision is used in many fields, such as robotics, autonomous driving, and virtual reality. It provides high-precision depth information for tasks such as obstacle detection, target recognition, and environment modeling. It also poses challenges, including camera calibration, lens distortion, and image matching, which must be addressed algorithmically.

Rough steps:

Stereo calibration > stereo rectification (including distortion removal) > stereo matching > disparity calculation > depth calculation (3D coordinates)

  1. Stereo calibration: Stereo calibration determines the intrinsic parameters and the relative pose of the two cameras (or lenses). A set of images of a dedicated calibration board or marker points with known world coordinates is captured; using the camera's intrinsic and extrinsic parameter models and the correspondences between image coordinates and world coordinates, the distortion coefficients, intrinsic matrices, and extrinsic parameters are solved for (a minimal calibration sketch follows this list).

  2. Stereo rectification (including distortion removal): Stereo rectification makes the optical axes of the left and right cameras parallel and aligns the images row by row. Using the calibration results (rotation and translation), the two views are warped so that corresponding pixels lie on the same horizontal line.

Rectification also removes the distortion of the camera lenses, using the distortion coefficients obtained during calibration, which improves the accuracy of subsequent stereo matching and depth calculation.

  3. Stereo matching: Stereo matching finds pixel correspondences between the rectified left and right images. Because the two viewpoints differ slightly, corresponding pixels are displaced by a certain disparity (a pixel offset). Matching can be achieved with feature-based methods (SIFT, SURF, etc.) or region-based methods (such as block matching).

  4. Disparity calculation: From the correspondences found by stereo matching, a disparity value is computed for each pixel. The disparity is the pixel displacement between the left and right views and encodes the distance of the target object from the cameras. Common matching costs include SSD (Sum of Squared Differences), SAD (Sum of Absolute Differences), and NCC (Normalized Cross-Correlation).

  5. Depth calculation (3D coordinates): Depth calculation converts disparity values into actual distances in three-dimensional space. With the known baseline of the stereo rig (the distance between the two cameras) and the camera intrinsics, disparity is converted into true depth by triangulation. Each pixel then obtains a 3D coordinate, from which a 3D model of the entire scene can be built.
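The calibration step itself is not covered by the code in this post. As referenced in step 1 above, a minimal sketch of stereo calibration with OpenCV might look like the following; the chessboard pattern size, square size, and the left/*.png and right/*.png image paths are all assumptions, not values from the original experiment:

import glob

import cv2
import numpy as np

# Sketch of stereo calibration from paired chessboard images (all values are placeholders)
pattern = (9, 6)   # inner corners per chessboard row and column
square = 25.0      # chessboard square size in mm

objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

obj_pts, left_pts, right_pts = [], [], []
for lf, rf in zip(sorted(glob.glob('left/*.png')), sorted(glob.glob('right/*.png'))):
    gl = cv2.imread(lf, cv2.IMREAD_GRAYSCALE)
    gr = cv2.imread(rf, cv2.IMREAD_GRAYSCALE)
    okl, corners_l = cv2.findChessboardCorners(gl, pattern)
    okr, corners_r = cv2.findChessboardCorners(gr, pattern)
    if okl and okr:  # keep only pairs where the board is detected in both views
        obj_pts.append(objp)
        left_pts.append(corners_l)
        right_pts.append(corners_r)

size = gl.shape[::-1]  # (width, height)
_, K1, D1, _, _ = cv2.calibrateCamera(obj_pts, left_pts, size, None, None)
_, K2, D2, _, _ = cv2.calibrateCamera(obj_pts, right_pts, size, None, None)

# Solve for the rotation R and translation T of the right camera relative to the left
ret, K1, D1, K2, D2, R, T, E, F = cv2.stereoCalibrate(
    obj_pts, left_pts, right_pts, K1, D1, K2, D2, size,
    flags=cv2.CALIB_FIX_INTRINSIC)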

Throughout this pipeline, the accuracy and stability of each step strongly affect the final depth result. In practice, appropriate algorithms and parameter settings must be chosen and tuned through experimentation to obtain good results.

2. Module explanation

2.1 Stereo rectification

2.1.1 Purpose of rectification

Stereo rectification uses the intrinsic parameters obtained from stereo calibration (focal length, principal point, distortion coefficients) and the relative pose of the two cameras (rotation matrix and translation vector) to undistort and row-align the left and right views, so that their imaging origins coincide, the optical axes of the two cameras are parallel, the left and right imaging planes are coplanar, and the epipolar lines are aligned.

Before rectification, the optical axes of the left and right cameras are not parallel. The line connecting the two optical centers is called the baseline; the intersection of an image plane with the baseline is an epipole; the line through an image point and the epipole is an epipolar line; and the plane formed by the two epipolar lines and the baseline is the epipolar plane of the corresponding space point.
After rectification, the epipoles are at infinity and the optical axes of the two cameras are parallel, so a scene point is imaged at the same height in the left and right images. This is the goal of epipolar rectification: subsequent stereo matching only needs to search along the same row of the two image planes, which greatly improves efficiency.

2.1.2 Rectification method

The experiment uses OpenCV's stereoRectify() function, which internally implements Bouguet's epipolar rectification algorithm. Its steps are:
1. Decompose the rotation matrix of the right image plane relative to the left image plane into two matrices, Rl and Rr, called the synthetic rotation matrices of the left and right cameras.
2. Rotate each camera by half of the relative rotation, so that the optical axes of the two cameras become parallel. The two imaging planes are now parallel to each other, but not yet parallel to the baseline.
3. Construct a transformation matrix Rrect that makes the baseline parallel to the imaging planes. It is built from the translation vector T of the right camera relative to the left camera.
4. Multiply each synthetic rotation matrix by Rrect to obtain the overall rotation matrix of each camera. Applying these matrices to the left and right camera coordinate systems makes the principal optical axes parallel and the image planes parallel to the baseline.
5. The two overall rotation matrices yield an ideal parallel binocular configuration. After rectification, crop the images as needed: choose a new image center and extent that maximize the overlapping area of the left and right views.
(Figure omitted: the rectified left and right images, with corresponding rows aligned.)

 Related code

def getRectifyTransform(height, width, config):
    # Read the intrinsic and extrinsic parameters
    left_K = config.cam_matrix_left
    right_K = config.cam_matrix_right
    left_distortion = config.distortion_l
    right_distortion = config.distortion_r
    R = config.R
    T = config.T

    # Compute the rectification transforms
    R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(left_K, left_distortion, right_K, right_distortion,
                                                      (width, height), R, T, alpha=0)
    map1x, map1y = cv2.initUndistortRectifyMap(left_K, left_distortion, R1, P1, (width, height), cv2.CV_32FC1)
    map2x, map2y = cv2.initUndistortRectifyMap(right_K, right_distortion, R2, P2, (width, height), cv2.CV_32FC1)

    return map1x, map1y, map2x, map2y, Q


# Undistortion and stereo rectification
def rectifyImage(image1, image2, map1x, map1y, map2x, map2y):
    # Note: cv2.remap does not support INTER_AREA interpolation, so INTER_LINEAR is used here
    rectifyed_img1 = cv2.remap(image1, map1x, map1y, cv2.INTER_LINEAR)
    rectifyed_img2 = cv2.remap(image2, map2x, map2y, cv2.INTER_LINEAR)

    return rectifyed_img1, rectifyed_img2
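A minimal usage sketch (assuming, as in the complete code of section 3, that iml and imr are the loaded left and right images and that stereoconfig.stereoCamera holds the calibration results):

config = stereoconfig.stereoCamera()  # calibration parameters, filled in beforehand
height, width = iml.shape[0:2]
map1x, map1y, map2x, map2y, Q = getRectifyTransform(height, width, config)
iml_rectified, imr_rectified = rectifyImage(iml, imr, map1x, map1y, map2x, map2y)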

Note: distortion removal should be performed before (or, as here, together with) stereo rectification; initUndistortRectifyMap() folds both into a single remap.

2.2 Stereo matching and disparity calculation

Stereo matching is also called disparity estimation. It can be divided into four steps: matching cost computation, cost aggregation, disparity computation, and disparity refinement. Once the rectified left and right images are obtained, corresponding points lie on the same row, and the disparity map can be computed with OpenCV's BM or SGBM algorithm. Since SGBM performs far better than BM, the SGBM algorithm is used here to obtain the disparity map. After stereo matching produces the disparity map, it can be post-processed (filtering, hole filling, etc.) to improve its quality; a filtering sketch follows the code below.

Related code
 

def stereoMatchSGBM(left_image, right_image, down_scale=False):
    # SGBM matching parameters
    if left_image.ndim == 2:
        img_channels = 1
    else:
        img_channels = 3
    blockSize = 3
    paraml = {'minDisparity': 0,
              'numDisparities': 64,
              'blockSize': blockSize,
              'P1': 8 * img_channels * blockSize ** 2,
              'P2': 32 * img_channels * blockSize ** 2,
              'disp12MaxDiff': 1,
              'preFilterCap': 63,
              'uniquenessRatio': 15,
              'speckleWindowSize': 100,
              'speckleRange': 1,
              'mode': cv2.STEREO_SGBM_MODE_SGBM_3WAY
              }

    # Build the SGBM matchers (copy the dict so the left matcher's parameters are not mutated)
    left_matcher = cv2.StereoSGBM_create(**paraml)
    paramr = paraml.copy()
    paramr['minDisparity'] = -paraml['numDisparities']
    right_matcher = cv2.StereoSGBM_create(**paramr)

    # Compute the disparity maps
    size = (left_image.shape[1], left_image.shape[0])
    if not down_scale:
        disparity_left = left_matcher.compute(left_image, right_image)
        disparity_right = right_matcher.compute(right_image, left_image)
    else:
        left_image_down = cv2.pyrDown(left_image)
        right_image_down = cv2.pyrDown(right_image)
        factor = left_image.shape[1] / left_image_down.shape[1]

        disparity_left_half = left_matcher.compute(left_image_down, right_image_down)
        disparity_right_half = right_matcher.compute(right_image_down, left_image_down)
        disparity_left = cv2.resize(disparity_left_half, size, interpolation=cv2.INTER_AREA)
        disparity_right = cv2.resize(disparity_right_half, size, interpolation=cv2.INTER_AREA)
        disparity_left = factor * disparity_left
        disparity_right = factor * disparity_right

    # True disparity (SGBM returns fixed-point disparity scaled by 16)
    trueDisp_left = disparity_left.astype(np.float32) / 16.
    trueDisp_right = disparity_right.astype(np.float32) / 16.

    return trueDisp_left, trueDisp_right
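As one post-processing option mentioned above, the disparity map can be smoothed with the WLS disparity filter from the opencv-contrib package (cv2.ximgproc). This is a sketch, not part of the original code: the lambda and sigma values are common defaults rather than tuned settings, and iml_rectified/imr_rectified are assumed to be the rectified images from section 2.1:

# WLS disparity filtering (requires opencv-contrib-python)
left_matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=3)
right_matcher = cv2.ximgproc.createRightMatcher(left_matcher)

disp_left = left_matcher.compute(iml_rectified, imr_rectified)
disp_right = right_matcher.compute(imr_rectified, iml_rectified)

wls_filter = cv2.ximgproc.createDisparityWLSFilter(matcher_left=left_matcher)
wls_filter.setLambda(8000.0)   # regularization strength
wls_filter.setSigmaColor(1.5)  # edge sensitivity
disp_filtered = wls_filter.filter(disp_left, iml_rectified, disparity_map_right=disp_right)
disp_filtered = disp_filtered.astype(np.float32) / 16.  # back to true disparity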

2.3 Depth calculation

After obtaining the disparity map, the depth of each pixel is computed with the formula:

depth = (f * baseline) / disp

where depth is the depth value; f is the focal length in pixels (fx in the intrinsic matrix); baseline is the distance between the optical centers of the two cameras, known as the baseline distance; and disp is the disparity value. For example, with fx = 800 px, a 60 mm baseline, and a disparity of 32 px, depth = 800 * 60 / 32 = 1500 mm.
The experiment instead uses OpenCV's cv2.reprojectImageTo3D() function to compute the depth map directly. The code is as follows:

def getDepthMapWithQ(disparityMap: np.ndarray, Q: np.ndarray) -> np.ndarray:
    points_3d = cv2.reprojectImageTo3D(disparityMap, Q)  # reproject the disparity map to 3D points
    depthMap = points_3d[:, :, 2]  # the z channel is the depth
    # Zero out invalid depths (negative or implausibly large values)
    reset_index = np.where(np.logical_or(depthMap < 0.0, depthMap > 65535.0))
    depthMap[reset_index] = 0
    return depthMap.astype(np.float32)
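Equivalently, the depth map can be computed directly from the formula above, without the Q matrix; a sketch, where the fx and baseline defaults are placeholders to be replaced with your own calibration values:

def getDepthMapWithConfig(disparityMap: np.ndarray, fx: float = 800.0, baseline: float = 60.0) -> np.ndarray:
    # depth = f * baseline / disp, in the same length unit as baseline (here mm)
    depthMap = np.zeros_like(disparityMap, dtype=np.float32)
    valid = disparityMap > 0  # avoid division by zero where matching failed
    depthMap[valid] = fx * baseline / disparityMap[valid]
    return depthMap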

3. Complete code

import sys
import cv2
import numpy as np
import stereoconfig


# Preprocessing
def preprocess(img1, img2):
    # Color image -> grayscale
    if (img1.ndim == 3):  # check for a 3-channel (color) image
        img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)  # OpenCV loads images in BGR channel order
    if (img2.ndim == 3):
        img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

    # Histogram equalization
    img1 = cv2.equalizeHist(img1)
    img2 = cv2.equalizeHist(img2)

    return img1, img2


# Remove lens distortion
def undistortion(image, camera_matrix, dist_coeff):
    undistortion_image = cv2.undistort(image, camera_matrix, dist_coeff)

    return undistortion_image


# Get the mapping transforms for undistortion and stereo rectification, and the reprojection matrix
# @param: config is a class holding the stereo calibration parameters: config = stereoconfig.stereoCamera()
def getRectifyTransform(height, width, config):
    # Read the intrinsic and extrinsic parameters
    left_K = config.cam_matrix_left
    right_K = config.cam_matrix_right
    left_distortion = config.distortion_l
    right_distortion = config.distortion_r
    R = config.R
    T = config.T

    # Compute the rectification transforms
    R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(left_K, left_distortion, right_K, right_distortion,
                                                      (width, height), R, T, alpha=0)

    map1x, map1y = cv2.initUndistortRectifyMap(left_K, left_distortion, R1, P1, (width, height), cv2.CV_32FC1)
    map2x, map2y = cv2.initUndistortRectifyMap(right_K, right_distortion, R2, P2, (width, height), cv2.CV_32FC1)

    return map1x, map1y, map2x, map2y, Q


# Undistortion and stereo rectification
def rectifyImage(image1, image2, map1x, map1y, map2x, map2y):
    # Note: cv2.remap does not support INTER_AREA interpolation, so INTER_LINEAR is used here
    rectifyed_img1 = cv2.remap(image1, map1x, map1y, cv2.INTER_LINEAR)
    rectifyed_img2 = cv2.remap(image2, map2x, map2y, cv2.INTER_LINEAR)

    return rectifyed_img1, rectifyed_img2


# Check stereo rectification by drawing horizontal lines
def draw_line(image1, image2):
    # Build the output image (left and right views side by side)
    height = max(image1.shape[0], image2.shape[0])
    width = image1.shape[1] + image2.shape[1]

    output = np.zeros((height, width, 3), dtype=np.uint8)
    output[0:image1.shape[0], 0:image1.shape[1]] = image1
    output[0:image2.shape[0], image1.shape[1]:] = image2

    # Draw equally spaced horizontal lines
    line_interval = 50  # line spacing: 50 px
    for k in range(height // line_interval):
        cv2.line(output, (0, line_interval * (k + 1)), (2 * width, line_interval * (k + 1)), (0, 255, 0), thickness=2,
                 lineType=cv2.LINE_AA)

    return output


# Disparity computation
def stereoMatchSGBM(left_image, right_image, down_scale=False):
    # SGBM matching parameters
    if left_image.ndim == 2:
        img_channels = 1
    else:
        img_channels = 3
    blockSize = 3
    paraml = {'minDisparity': 0,
              'numDisparities': 64,
              'blockSize': blockSize,
              'P1': 8 * img_channels * blockSize ** 2,
              'P2': 32 * img_channels * blockSize ** 2,
              'disp12MaxDiff': 1,
              'preFilterCap': 63,
              'uniquenessRatio': 15,
              'speckleWindowSize': 100,
              'speckleRange': 1,
              'mode': cv2.STEREO_SGBM_MODE_SGBM_3WAY
              }

    # Build the SGBM matchers (copy the dict so the left matcher's parameters are not mutated)
    left_matcher = cv2.StereoSGBM_create(**paraml)
    paramr = paraml.copy()
    paramr['minDisparity'] = -paraml['numDisparities']
    right_matcher = cv2.StereoSGBM_create(**paramr)

    # Compute the disparity maps
    size = (left_image.shape[1], left_image.shape[0])
    if not down_scale:
        disparity_left = left_matcher.compute(left_image, right_image)
        disparity_right = right_matcher.compute(right_image, left_image)
    else:
        left_image_down = cv2.pyrDown(left_image)
        right_image_down = cv2.pyrDown(right_image)
        factor = left_image.shape[1] / left_image_down.shape[1]

        disparity_left_half = left_matcher.compute(left_image_down, right_image_down)
        disparity_right_half = right_matcher.compute(right_image_down, left_image_down)
        disparity_left = cv2.resize(disparity_left_half, size, interpolation=cv2.INTER_AREA)
        disparity_right = cv2.resize(disparity_right_half, size, interpolation=cv2.INTER_AREA)
        disparity_left = factor * disparity_left
        disparity_right = factor * disparity_right

    # True disparity (SGBM returns fixed-point disparity scaled by 16)
    trueDisp_left = disparity_left.astype(np.float32) / 16.
    trueDisp_right = disparity_right.astype(np.float32) / 16.

    return trueDisp_left, trueDisp_right


def getDepthMapWithQ(disparityMap: np.ndarray, Q: np.ndarray) -> np.ndarray:
    points_3d = cv2.reprojectImageTo3D(disparityMap, Q)  # H x W x 3 array: the last axis holds the (x, y, z) coordinate of each pixel

    depthMap = points_3d[:, :, 2]  # the z channel of the 3D points is the depth
    # Zero out invalid depths (negative or implausibly large values)
    reset_index = np.where(np.logical_or(depthMap < 0.0, depthMap > 65535.0))
    depthMap[reset_index] = 0

    return depthMap.astype(np.float32)
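
# Hypothetical helper (not in the original post): a mouse callback that prints the 3D
# coordinate under the cursor. Enable it with
# cv2.setMouseCallback("disparity", onMouse, points_3d) inside the main loop below.
def onMouse(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        point = param[y, x]  # (x, y, z) in the left camera coordinate system, unit mm
        print('pixel (%d, %d) -> 3D coordinate %s' % (x, y, point))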


if __name__ == '__main__':
    # Open the camera and set the frame size
    # (a single device that outputs the left and right views side by side)
    cap = cv2.VideoCapture(0)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

    while True:
        ret, frame = cap.read()
        if not ret or frame is None:
            print("Error: failed to read a frame, please check your camera!")
            sys.exit(0)
        iml = frame[0:480, 0:640]
        imr = frame[0:480, 640:1280]  # split the side-by-side frame into the left and right images

        height, width = iml.shape[0:2]  # the first two elements of shape are height and width

        iml_, imr_ = preprocess(iml, imr)  # preprocessing; typically reduces uneven illumination, optional

        # Read the camera intrinsic and extrinsic parameters
        # Before use, fill the calibration results into the stereoCamera class in stereoconfig.py
        config = stereoconfig.stereoCamera()
        # print(config.cam_matrix_left)

        # Stereo rectification
        map1x, map1y, map2x, map2y, Q = getRectifyTransform(height, width, config)  # mapping matrices for undistortion/rectification, plus the reprojection matrix Q
        iml_rectified, imr_rectified = rectifyImage(iml, imr, map1x, map1y, map2x, map2y)
        # print(Q)

        # Draw equally spaced horizontal lines to check the rectification result
        line = draw_line(iml_rectified, imr_rectified)
        cv2.imwrite('check_rectification.png', line)

        # Stereo matching
        disp, _ = stereoMatchSGBM(iml_rectified, imr_rectified, False)  # pass in the rectified images

        cv2.imwrite('disparity.png', disp * 4)

        # fx = config.cam_matrix_left[0, 0]
        # fy = fx
        # cx = config.cam_matrix_left[0, 2]
        # cy = config.cam_matrix_left[1, 2]

        # print(fx, fy, cx, cy)

        # Compute the 3D coordinates of each pixel (in the left camera coordinate system)
        points_3d = cv2.reprojectImageTo3D(disp, Q)  # Q is the reprojection matrix returned by getRectifyTransform()



        # Pixel coordinates (x, y) of the point to inspect
        x = 120
        y = 360
        cv2.circle(iml, (x, y), 5, (0, 0, 255), -1)

        # x1 = points_3d[y, x]   # indexing with (y, x) yields the 3D coordinate (x1, y1, z1)
        # print(x1)

        print('x:', points_3d[int(y), int(x), 0], 'y:', points_3d[int(y), int(x), 1], 'z:',
              points_3d[int(y), int(x), 2])  # 3D coordinates of the pixel, in mm
        print('distance:', (points_3d[int(y), int(x), 0] ** 2 + points_3d[int(y), int(x), 1] ** 2 + points_3d[
            int(y), int(x), 2] ** 2) ** 0.5)  # Euclidean distance from the camera, in mm


        cv2.namedWindow("disparity", 0)
        cv2.imshow("disparity", iml)
        # cv2.setMouseCallback("disparity", onMouse, points_3d)  # optional: uses the hypothetical onMouse helper above


        # Wait for a key press; exit the loop on 'q' or Esc
        c = cv2.waitKey(1) & 0xFF
        if c == 27 or c == ord('q'):
            break

    # Release the capture device and close all windows
    cap.release()
    cv2.destroyAllWindows()
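The stereoconfig module imported at the top is not included in the original post. A minimal sketch of what it might contain is shown below; every numeric value is a placeholder and must be replaced with your own stereo calibration results:

# stereoconfig.py -- hypothetical sketch; fill in your own calibration results
import numpy as np


class stereoCamera(object):
    def __init__(self):
        # Intrinsic matrices of the left and right cameras (placeholder values)
        self.cam_matrix_left = np.array([[800., 0., 320.],
                                         [0., 800., 240.],
                                         [0., 0., 1.]])
        self.cam_matrix_right = np.array([[800., 0., 320.],
                                          [0., 800., 240.],
                                          [0., 0., 1.]])
        # Distortion coefficients [k1, k2, p1, p2, k3] (placeholders)
        self.distortion_l = np.zeros((1, 5))
        self.distortion_r = np.zeros((1, 5))
        # Rotation and translation of the right camera relative to the left (placeholders)
        self.R = np.eye(3)
        self.T = np.array([[-60.], [0.], [0.]])  # e.g. a 60 mm baseline along x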

Source: blog.csdn.net/weixin_45303602/article/details/133869770