Stereo correspondence algorithm

SGM (Semi-Global Matching) principle:

The principles of SGM are explained in more detail on the wiki encyclopedia and the matlab official website:
wiki matlab
If you want to fully understand the principles, it is recommended to read the original paper (I will not read it, I am lazy.) A
high-quality paper interpretation and code implementation The SGM algorithm github
implemented by the master himself in C++ first introduces two important parameters: Note: This part refers to the explanation of matlab, and the following part refers to the implementation of opencv. The details may be slightly different, but they are generally consistent. Disparity Levels and Number of Directions


Disparity Levels

Disparity Levels: Disparity levels is a parameter used to define the search space for matching. As shown in figure below, the algorithm searches for each pixel in the Left Image from among D pixels in the Right Image. The D values generated are D disparity levels for a pixel in Left Image. The first D columns of Left Image are unused because the corresponding pixels in Right Image are not available for comparison. In the figure, w represents the width of the image and h is the height of the image. For a given image resolution, increasing the disparity level reduces the minimum distance to detect depth. Increasing the disparity level also increases the computation load of the algorithm. At a given disparity level, increasing the image resolution increases the minimum distance to detect depth. Increasing the image resolution also increases the accuracy of depth estimation. The number of disparity levels are proportional to the input image resolution for detection of objects at the same depth. This example supports disparity levels from 8 to 128 (both values inclusive). The explanation of the algorithm refers to 64 disparity levels. The models provided in this example can accept input images of any resolution.——matlab

There are too many words to understand, so I asked gpt to explain:

# gpt生成,仅供本人理解SSD原理
import numpy as np

def compute_disparity(left_img, right_img, block_size=5, num_disparities=64):
    # 图像尺寸
    height, width = left_img.shape

    # 初始化视差图
    disparity_map = np.zeros_like(left_img)

    # 遍历每个像素
    for y in range(height):
        for x in range(width):
            # 定义搜索范围
            min_x = max(0, x - num_disparities // 2)
            max_x = min(width, x + num_disparities // 2)

            # 提取左图像块
            left_block = left_img[y:y+block_size, x:x+block_size]

            # 初始化最小 SSD 和对应的视差
            min_ssd = float('inf')
            best_disparity = 0

            # 在搜索范围内寻找最佳视差
            for d in range(min_x, max_x):
                # 提取右图像块
                right_block = right_img[y:y+block_size, d:d+block_size]

                # 计算 SSD
                ssd = np.sum((left_block - right_block)**2)

                # 更新最小 SSD 和对应的视差
                if ssd < min_ssd:
                    min_ssd = ssd
                    best_disparity = abs(x - d)

            # 将最佳视差保存到视差图中
            disparity_map[y, x] = best_disparity

    return disparity_map

# 示例用法
left_img = np.random.randint(0, 255, size=(100, 100), dtype=np.uint8)
right_img = np.roll(left_img, shift=5, axis=1)  # 创建右图,右移了5个像素

disparity_map = compute_disparity(left_img, right_img, block_size=5, num_disparities=64)

# 可视化结果(这里简化为将视差图缩放以便可视化)
import matplotlib.pyplot as plt
plt.imshow(disparity_map, cmap='gray')
plt.title('Disparity Map')
plt.show()

This makes it clear that Disparity Levels is the range for calculating disparity (disparity search range).

Number of Directions

Number of Directions:

Number of Directions: In the SGBM algorithm, to optimize the cost function, the input image is considered from multiple directions. In general, accuracy of disparity result improves with increase in number of directions. This example analyzes five directions: left-to-right (A1), top-left-to-bottom-right (A2), top-to-bottom (A3), top-right-to-bottom-left (A4), and right-to-left (A5).
Insert image description here

Matching pixels according to a single path is not robust enough, and the time complexity of global matching for two-dimensional optimal according to images is too high (NP-complete problem), so the author of SGM uses one-dimensional path aggregation to approximate the two-dimensional optimal.
Insert image description here
pic reference

SAD and SSD

Use SAD or SSD to calculate image similarity for matching.
Formula:
> Here is the quote
Although the formula and code are generated by gpt, the formula looks correct. The code can help with understanding and is for reference only.
The num_disparities in the code are Disparity Levels

SGBM in opencv

I use opencv a lot, so here I only focus on the implementation of the code in opencv .

opencv StereoSGBM_create example:

# gpt生成,仅作为参考,具体请查看opencv官方文档https://docs.opencv.org/4.x/d2/d85/classcv_1_1StereoSGBM.html
import cv2
import numpy as np

# 读取左右视图
left_image = cv2.imread('left_image.png', cv2.IMREAD_GRAYSCALE)
right_image = cv2.imread('right_image.png', cv2.IMREAD_GRAYSCALE)

# 创建SGBM对象
sgbm = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=16,  # 视差范围,一般为16的整数倍
    blockSize=5,        # 匹配块的大小,一般为奇数
    P1=8 * 3 * 5 ** 2,   # SGBM算法参数
    P2=32 * 3 * 5 ** 2,  # SGBM算法参数
    disp12MaxDiff=1,    # 左右视差图的最大差异
    uniquenessRatio=10,  # 匹配唯一性百分比
    speckleWindowSize=100,  # 过滤小连通区域的窗口大小
    speckleRange=32      # 连通区域内的差异阈值
)

# 计算视差图
disparity_map = sgbm.compute(left_image, right_image)

# 将视差图进行归一化处理
disparity_map = cv2.normalize(disparity_map, None, 0, 255, cv2.NORM_MINMAX)

# 显示左图、右图和视差图
cv2.imshow('Left Image', left_image)
cv2.imshow('Right Image', right_image)
cv2.imshow('Disparity Map', disparity_map.astype(np.uint8))

cv2.waitKey(0)
cv2.destroyAllWindows()

Difference between SGBM and SGM

what is the difference between opencv sgbm and sgm
opencv官方的解释:
The class implements the modified H. Hirschmuller algorithm [82] that differs from the original one as follows:

  1. By default, the algorithm is single-pass, which means that you consider only 5 directions instead of 8. Set mode=StereoSGBM::MODE_HH in createStereoSGBM to run the full variant of the algorithm but beware that it may consume a lot of memory.
  2. The algorithm matches blocks, not individual pixels. Though, setting blockSize=1 reduces the blocks to single pixels.
  3. Mutual information cost function is not implemented. Instead, a simpler Birchfield-Tomasi sub-pixel metric from [15] is used. Though, the color images are supported as well.
    Some pre- and post- processing steps from K. Konolige algorithm StereoBM are included, for example: pre-filtering (StereoBM::PREFILTER_XSOBEL type) and post-filtering (uniqueness check, quadratic interpolation and speckle filtering).

The general meaning is that the difference from SGM is that the smallest unit when matching the SGBM algorithm is blocks, not pixels, but when blockSize=1 is set, it becomes SGM. Mutual information is not implemented, but the simpler Birchfield-Tomasi sub-pixel metric is used. In addition, there are some pre- and post-processing operations.
Insert image description here
That's probably it, I don't know if it's right.

Deep stereo matching algorithm

Make a hole first

Guess you like

Origin blog.csdn.net/onlyyoujojo/article/details/135244558