SGM (Semi-Global Matching) principle:
The principles of SGM are explained in more detail on Wikipedia and the official MATLAB website:
wiki matlab
If you want to fully understand the principles, it is recommended to read the original paper (I have not read it, I am lazy).
A high-quality paper interpretation and code implementation
The SGM algorithm implemented in C++ by the same expert: github
Two important parameters are introduced first: Disparity Levels and Number of Directions.
Note: This part refers to the MATLAB explanation, and the following part refers to the OpenCV implementation. The details may be slightly different, but they are generally consistent.
Disparity Levels
Disparity Levels: Disparity levels is a parameter used to define the search space for matching. As shown in figure below, the algorithm searches for each pixel in the Left Image from among D pixels in the Right Image. The D values generated are D disparity levels for a pixel in Left Image. The first D columns of Left Image are unused because the corresponding pixels in Right Image are not available for comparison. In the figure, w represents the width of the image and h is the height of the image.
For a given image resolution, increasing the disparity level reduces the minimum distance to detect depth. Increasing the disparity level also increases the computation load of the algorithm. At a given disparity level, increasing the image resolution increases the minimum distance to detect depth. Increasing the image resolution also increases the accuracy of depth estimation. The number of disparity levels are proportional to the input image resolution for detection of objects at the same depth.
This example supports disparity levels from 8 to 128 (both values inclusive). The explanation of the algorithm refers to 64 disparity levels. The models provided in this example can accept input images of any resolution. (Source: MATLAB)
That is a lot of text to digest, so I asked GPT to explain it with code:
# Generated by GPT, only meant to help me understand the SSD principle
import numpy as np
import matplotlib.pyplot as plt

def compute_disparity(left_img, right_img, block_size=5, num_disparities=64):
    # Image size
    height, width = left_img.shape
    # Work in a signed type so the uint8 subtraction below cannot wrap around
    left = left_img.astype(np.int32)
    right = right_img.astype(np.int32)
    # Initialize the disparity map
    disparity_map = np.zeros((height, width), dtype=np.int32)
    # Visit every pixel whose block fits entirely inside the image
    for y in range(height - block_size + 1):
        for x in range(width - block_size + 1):
            # Extract the left image block
            left_block = left[y:y+block_size, x:x+block_size]
            # Initialize the minimum SSD and the corresponding disparity
            min_ssd = float('inf')
            best_disparity = 0
            # Search range: disparity d means the match sits at x - d in the right image
            for d in range(min(num_disparities, x + 1)):
                # Extract the right image block
                right_block = right[y:y+block_size, x-d:x-d+block_size]
                # Compute the SSD
                ssd = np.sum((left_block - right_block)**2)
                # Update the minimum SSD and the corresponding disparity
                if ssd < min_ssd:
                    min_ssd = ssd
                    best_disparity = d
            # Save the best disparity into the disparity map
            disparity_map[y, x] = best_disparity
    return disparity_map

# Example usage
left_img = np.random.randint(0, 255, size=(100, 100), dtype=np.uint8)
right_img = np.roll(left_img, shift=-5, axis=1)  # right view: content shifted left by 5 pixels
disparity_map = compute_disparity(left_img, right_img, block_size=5, num_disparities=64)
# Visualize the result (the disparity map is simply shown as a grayscale image)
plt.imshow(disparity_map, cmap='gray')
plt.title('Disparity Map')
plt.show()
This makes it clear that Disparity Levels is the range for calculating disparity (disparity search range).
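The MATLAB remark that a larger disparity level reduces the minimum detectable distance follows from the usual rectified-stereo relation Z = f * B / d (depth equals focal length times baseline divided by disparity). A minimal sketch of that arithmetic; the function name and the camera values (700 px focal length, 0.12 m baseline) are made up for illustration and do not come from the MATLAB example:

```python
def min_detectable_depth(focal_px, baseline_m, num_disparities):
    """Closest depth in meters that a given disparity search range can represent.

    Uses Z = f * B / d with the largest searchable disparity d = num_disparities - 1.
    """
    max_disparity = num_disparities - 1
    return focal_px * baseline_m / max_disparity

# Hypothetical camera: 700 px focal length, 0.12 m baseline
for levels in (16, 64, 128):
    print(levels, round(min_detectable_depth(700.0, 0.12, levels), 3))
```

The printed depth shrinks as the number of levels grows, at the price of more candidates to test per pixel.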
Number of Directions
Number of Directions: In the SGBM algorithm, to optimize the cost function, the input image is considered from multiple directions. In general, accuracy of disparity result improves with increase in number of directions. This example analyzes five directions: left-to-right (A1), top-left-to-bottom-right (A2), top-to-bottom (A3), top-right-to-bottom-left (A4), and right-to-left (A5).
Matching along a single path is not robust enough, while a true two-dimensional global optimization over the whole image is too expensive (it is an NP-complete problem), so the author of SGM aggregates costs along several one-dimensional paths to approximate the two-dimensional optimum.
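To make the one-dimensional aggregation concrete, here is a rough sketch of a single path (left to right) using the standard SGM recurrence: the aggregated cost at a pixel is its own matching cost plus the cheapest way to arrive from the previous pixel, where a disparity change of 1 costs P1 and a larger jump costs P2. The function name and the default values of P1 and P2 are assumptions for illustration; a full SGM would repeat this for every direction and sum the results.

```python
import numpy as np

def aggregate_left_to_right(cost, P1=10, P2=120):
    """Aggregate a matching-cost volume of shape (H, W, D) along the left-to-right path."""
    H, W, D = cost.shape
    cost = cost.astype(np.float64)
    L = np.empty_like(cost)
    L[:, 0, :] = cost[:, 0, :]  # the first column has no predecessor
    for x in range(1, W):
        prev = L[:, x - 1, :]                       # shape (H, D)
        prev_min = prev.min(axis=1, keepdims=True)  # shape (H, 1)
        # Arriving from disparity d-1 or d+1 costs an extra P1
        from_minus = np.full_like(prev, np.inf)
        from_minus[:, 1:] = prev[:, :-1] + P1
        from_plus = np.full_like(prev, np.inf)
        from_plus[:, :-1] = prev[:, 1:] + P1
        # Arriving from any other disparity costs an extra P2
        best = np.minimum(np.minimum(prev, from_minus),
                          np.minimum(from_plus, prev_min + P2))
        # Subtracting prev_min keeps the values from growing along the path
        L[:, x, :] = cost[:, x, :] + best - prev_min
    return L

# Example: random cost volume, winner-takes-all disparity from this single path
rng = np.random.default_rng(0)
cost = rng.integers(0, 50, size=(4, 6, 8))
disparity = aggregate_left_to_right(cost).argmin(axis=2)
print(disparity.shape)
```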
SAD and SSD
SAD (sum of absolute differences) or SSD (sum of squared differences) is used to measure how similar two image blocks are when matching.
Formula:
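For reference, these are the standard definitions of the two costs for a window W around pixel (x, y) and a candidate disparity d. The notation (I_L and I_R for the left and right images, W for the window offsets) is chosen here for illustration; lower values mean more similar blocks.

```latex
\mathrm{SAD}(x, y, d) = \sum_{(i, j) \in W} \left| I_L(x + i,\, y + j) - I_R(x + i - d,\, y + j) \right|

\mathrm{SSD}(x, y, d) = \sum_{(i, j) \in W} \left( I_L(x + i,\, y + j) - I_R(x + i - d,\, y + j) \right)^2
```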
Although the formula and code were generated by GPT, the formula looks correct. The code helps with understanding but is for reference only.
The num_disparities parameter in the code corresponds to Disparity Levels.
SGBM in OpenCV
I use OpenCV a lot, so here I only focus on the OpenCV implementation.
OpenCV StereoSGBM_create example:
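# Generated by GPT and for reference only; see the official OpenCV documentation for details: https://docs.opencv.org/4.x/d2/d85/classcv_1_1StereoSGBM.html
import cv2
import numpy as np

# Read the left and right views
left_image = cv2.imread('left_image.png', cv2.IMREAD_GRAYSCALE)
right_image = cv2.imread('right_image.png', cv2.IMREAD_GRAYSCALE)
if left_image is None or right_image is None:
    raise FileNotFoundError('could not read left_image.png or right_image.png')

# Create the SGBM object
block_size = 5
channels = 1  # the images are read as grayscale
sgbm = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=16,                   # disparity search range, must be divisible by 16
    blockSize=block_size,                # size of the matched block, must be odd
    P1=8 * channels * block_size ** 2,   # penalty for a disparity change of 1 between neighbors
    P2=32 * channels * block_size ** 2,  # penalty for larger disparity changes, must exceed P1
    disp12MaxDiff=1,                     # maximum allowed difference in the left-right check
    uniquenessRatio=10,                  # margin in percent by which the best match must win
    speckleWindowSize=100,               # maximum size of small connected regions to filter out
    speckleRange=32                      # maximum disparity variation within a connected region
)

# Compute the disparity map (16-bit fixed point, i.e. disparity * 16)
disparity_map = sgbm.compute(left_image, right_image)
# Normalize the disparity map to 0..255 for display
disparity_map = cv2.normalize(disparity_map, None, 0, 255, cv2.NORM_MINMAX)

# Show the left image, the right image, and the disparity map
cv2.imshow('Left Image', left_image)
cv2.imshow('Right Image', right_image)
cv2.imshow('Disparity Map', disparity_map.astype(np.uint8))
cv2.waitKey(0)
cv2.destroyAllWindows()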
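One detail that matters when the disparity values are used for more than display: the map returned by sgbm.compute is 16-bit fixed point with 4 fractional bits, so each value is the disparity in pixels multiplied by 16, and pixels without a valid match come out below minDisparity. A minimal NumPy-only sketch of the conversion; the helper name to_float_disparity is made up for illustration, and the small array stands in for a real result:

```python
import numpy as np

def to_float_disparity(raw, min_disparity=0):
    """Convert a 16-bit fixed-point disparity map (4 fractional bits) to float pixels.

    Pixels whose disparity falls below min_disparity are treated as invalid and set to NaN.
    """
    disparity = raw.astype(np.float32) / 16.0
    disparity[disparity < min_disparity] = np.nan
    return disparity

# Stand-in for the output of sgbm.compute(): 5.0 px, 5.5 px, and one invalid pixel
raw = np.array([[80, 88, -16]], dtype=np.int16)
print(to_float_disparity(raw))
```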
Difference between SGBM and SGM
what is the difference between opencv sgbm and sgm
The official OpenCV explanation:
The class implements the modified H. Hirschmuller algorithm [82] that differs from the original one as follows:
- By default, the algorithm is single-pass, which means that you consider only 5 directions instead of 8. Set mode=StereoSGBM::MODE_HH in createStereoSGBM to run the full variant of the algorithm but beware that it may consume a lot of memory.
- The algorithm matches blocks, not individual pixels. Though, setting blockSize=1 reduces the blocks to single pixels.
- Mutual information cost function is not implemented. Instead, a simpler Birchfield-Tomasi sub-pixel metric from [15] is used. Though, the color images are supported as well.
- Some pre- and post- processing steps from K. Konolige algorithm StereoBM are included, for example: pre-filtering (StereoBM::PREFILTER_XSOBEL type) and post-filtering (uniqueness check, quadratic interpolation and speckle filtering).
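In short, SGBM differs from SGM in these ways: by default it aggregates along 5 directions instead of 8 (mode=StereoSGBM::MODE_HH enables the full variant); it matches blocks rather than single pixels, although blockSize=1 reduces the blocks to single pixels; mutual information is not implemented, and the simpler Birchfield-Tomasi sub-pixel metric is used instead; and some pre- and post-processing steps are added. Setting blockSize=1 alone therefore does not turn it into the original SGM, because the cost function and the default number of directions still differ.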
That's probably it, I don't know if it's right.
Deep stereo matching algorithm
This is a placeholder for now; to be filled in later.