Detecting a rectangular ROI with the Hough transform in Python

Introduction

The Hough transform was first proposed by Paul Hough in 1962 and popularized by Richard Duda and Peter Hart in 1972. It is one of the basic image-processing methods for detecting geometric shapes in images. The classical Hough transform detects straight lines; it was later extended to other shapes, such as circles and ellipses, and generalized to objects of arbitrary shape.
The Hough transform maps between two coordinate spaces: all points lying on a curve or line of a given shape in one space map to a single point in the other, where they pile up as a peak. Detecting a shape thus becomes a problem of finding peaks in a vote count.

How the Hough transform detects straight lines

In the Cartesian coordinate system, the straight line through two points is y = kx + q; given two known points A and B, k and q are uniquely determined. If k is taken as the independent variable and q as the dependent variable, they span the Hough coordinate system, and a straight line in the Cartesian system corresponds to a single point (k, q) in the Hough system.

Conversely, consider a point (x1, y1) in the Cartesian coordinate system. Every line through this point satisfies q = -x1k + y1, which is the equation of a straight line in Hough space.
If three points are collinear in Cartesian coordinates, their three lines in Hough space intersect at a single point, because the line through them in the Cartesian system corresponds to exactly one point (k, q) in Hough space.
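
As a small numerical check of this duality, the sketch below takes three collinear points (illustrative values, not taken from the article's image), treats each one as a line q = -x·k + y in (k, q) space, and intersects those lines pairwise. The helper name intersect_in_kq is made up for this example.

```python
import numpy as np

# Three points on the line y = 2x + 1 (illustrative values).
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

def intersect_in_kq(p_a, p_b):
    """Intersect the Hough-space lines q = -x*k + y of two image points."""
    (xa, ya), (xb, yb) = p_a, p_b
    k = (yb - ya) / (xb - xa)   # breaks down when xa == xb (vertical line)
    q = ya - xa * k
    return k, q

pairs = [(0, 1), (0, 2), (1, 2)]
intersections = [intersect_in_kq(points[i], points[j]) for i, j in pairs]
print(intersections)
```

All three pairs meet at the same (k, q), which is the slope and intercept of the line through the points. The division by xb - xa also shows why this parameterization fails for vertical lines.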

A vertical line x = c (perpendicular to the x-axis, with infinite slope) in the image space cannot be represented in the (k, q) parameter space. In practice, therefore, the Hough space is parameterized in polar form: a line is described by ρ = x cos θ + y sin θ, where ρ is its distance from the origin and θ is the angle of its normal.
With this parameterization, each image point maps to a sinusoidal curve in Hough space rather than a straight line. A straight line in the image consists of many points, and the curves of all those points intersect at one point in Hough space. The parameters of the line can therefore be found by locating the maximum in Hough space.
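
To illustrate the polar form, the following sketch evaluates ρ = x cos θ + y sin θ for three collinear points over θ = 0°..179° and looks for the angle at which the three curves coincide. The points and the 1° sampling are assumptions chosen for the example.

```python
import numpy as np

points = [(0, 100), (50, 50), (100, 0)]   # all satisfy x + y = 100
theta_deg = np.arange(180)
theta = np.deg2rad(theta_deg)

# One sinusoid rho(theta) per image point, stacked as rows.
curves = np.array([x * np.cos(theta) + y * np.sin(theta) for x, y in points])

# The curves cross where their spread over the three points vanishes.
spread = curves.max(axis=0) - curves.min(axis=0)
best = int(np.argmin(spread))
theta0 = int(theta_deg[best])
rho0 = float(curves[:, best].mean())
print(theta0, round(rho0, 2))
```

The crossing lies at θ = 45° with ρ = 100/√2, which is the normal form of the line x + y = 100.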
As described above, Hough line detection maps a line in image space to a point in parameter space and solves the detection problem statistically. If a set of pixels lies on a straight line, the parameter-space curves of their coordinates (x, y) must all pass through one point. It is therefore enough to convert every edge pixel into its curve in parameter space and then look for points where many curves intersect.

In theory, infinitely many lines (one for every direction) pass through a single point; in practice, the number of directions must be limited to make the computation feasible.

The direction θ is therefore discretized into a finite number of equally spaced values, and ρ is discretized likewise, so the parameter space is no longer continuous but quantized into equal-sized grid cells. Transforming a pixel from image space into parameter space yields, for each θ, a value that falls into one cell, and that cell's counter is incremented by 1. After all pixels have been transformed, the cells are examined: the coordinates (ρ0, θ0) of the cell with the largest count correspond to the line sought in image space.
In summary, the Hough line detection algorithm accumulates, for every edge pixel, the curve it traces in parameter space. If the number of curves passing through a cell exceeds a threshold, that cell's (ρ, θ) is taken to correspond to a straight line in image space.
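
The voting scheme can be sketched in a few lines of NumPy. This is a minimal illustration on a synthetic edge map, not the article's implementation: the function name hough_votes, the 1° and 1-pixel bin sizes, and the threshold of 50 are all assumptions.

```python
import numpy as np

def hough_votes(edge, n_theta=180):
    """Vote every non-zero pixel of an edge map into a (rho, theta) grid."""
    h, w = edge.shape
    diag = int(np.ceil(np.hypot(h, w)))        # |rho| never exceeds the diagonal
    thetas = np.deg2rad(np.arange(n_theta))    # 1-degree steps over [0, 180)
    acc = np.zeros((2 * diag + 1, n_theta), dtype=np.int64)
    cols = np.arange(n_theta)
    for y, x in zip(*np.nonzero(edge)):
        rho = x * np.cos(thetas) + y * np.sin(thetas)
        # Shift by diag so that negative rho values get a valid row index.
        acc[np.round(rho).astype(int) + diag, cols] += 1
    return acc, diag

# Synthetic edge map: one horizontal line at y = 20.
edge = np.zeros((50, 80), dtype=np.uint8)
edge[20, 10:70] = 1

acc, diag = hough_votes(edge)
threshold = 50
detected = [(int(r) - diag, int(c)) for r, c in np.argwhere(acc > threshold)]
print(detected)   # list of (rho in pixels, theta in degrees)
```

Only one cell collects more than 50 votes here, at ρ = 20 and θ = 90°, the normal form of the line y = 20.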

Implementation based on OpenCV

OpenCV provides two functions for Hough line detection, HoughLines and HoughLinesP. Their parameters are described below:

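cv2.HoughLines(
	image, 				# Source image (8-bit, single-channel binary image).
	rho, 				# Distance resolution of the accumulator, in pixels.
	theta, 				# Angle resolution of the accumulator, in radians.
	threshold[, 		# Accumulator threshold. Only lines with enough votes (> threshold) are returned.
	lines[, 			# Output lines, in the form [[rho_1, theta_1], [rho_2, theta_2], ...].
	srn[, 				# For the multi-scale Hough transform, a divisor of the distance resolution rho.
						# The coarse accumulator distance resolution is rho; the fine one is rho/srn.
						# If srn = 0 and stn = 0, the classical Hough transform is used;
						# otherwise both parameters must be positive.
	stn[, 				# For the multi-scale Hough transform, a divisor of the angle resolution theta.
	min_theta[,			# For the standard and multi-scale transforms, the minimum angle to check. Must lie between 0 and max_theta.
    max_theta			# For the standard and multi-scale transforms, the maximum angle to check. Must lie between min_theta and pi.
    ]]]]]) -> lines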

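cv2.HoughLinesP(
	image, 				# Source image (8-bit, single-channel binary image).
	rho, 				# Distance resolution of the accumulator, in pixels.
	theta, 				# Angle resolution of the accumulator, in radians.
	threshold[, 		# Accumulator threshold. Only lines with enough votes (> threshold) are returned.
	lines[, 			# Output segments, in the form [[x1, y1, x2, y2], ...] (the two endpoints of each segment).
	minLineLength[, 	# Minimum segment length; shorter segments are rejected.
	maxLineGap]]]		# Maximum gap between points on the same line for them to be linked.
	) -> lines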

The code is implemented as follows:

edge = cv2.Canny(np.array(img, dtype='uint8'), 1743, 3400, apertureSize=7, L2gradient=True)
plt.imshow(edge)
plt.show()
# ============================OpenCV Hough transform=============================
lines = cv2.HoughLines(edge, rho=1, theta=np.pi/180, threshold=round(800*scale_factor))
if lines is not None:
    for line in lines:
        rho, theta = line[0]
        a = np.cos(theta)
        b = np.sin(theta)
        x0 = a * rho
        y0 = b * rho
        x1 = int(x0 + 1000 * (-b))
        y1 = int(y0 + 1000 * (a))
        x2 = int(x0 - 1000 * (-b))
        y2 = int(y0 - 1000 * (a))
        cv2.line(img, (x1, y1), (x2, y2), (250, 20, 250), 2)

cv2.imshow("img", img)
cv2.imshow("edge", edge)
cv2.waitKey(0)
cv2.destroyAllWindows()

The detected lines are drawn over the input image. For this example, OpenCV's built-in function detects the straight lines well. For more complex images, however, the threshold must be tuned carefully; otherwise many spurious lines are returned.

Principle of detecting ROI based on Hough transform

As shown above, the rectangular ROI of the target image can be found by detecting straight lines with the Hough transform. OpenCV's built-in line detection functions, however, are not very convenient for this purpose. We know that:

Long straight lines in the image produce strong, bright peaks in the Hough domain.
Parallel lines have the same angle, so their peaks appear at the same value of θ in the Hough domain.
Perpendicular lines are separated by Δθ = π/2.
Since the ROI is bounded by two pairs of collimation lines that are nearly parallel within each pair and nearly perpendicular between pairs, we look in the Hough domain for two pairs of bright peaks, each pair sharing almost the same θ, with the two pairs about π/2 apart in θ.
These peaks give the four straight lines of the rectangular ROI, which completes its detection.

Detecting a rectangular ROI with the Hough transform in Python

Algorithm steps:
(1) Initialize the accumulator H to all zeros;
(2) Traverse every edge point in the image and, for each θ, increment the cell H(ρ, θ) with ρ = x cos θ + y sin θ;
(3) Find the local maxima in H, together with the maxima offset from them by π/2 in θ;
(4) Convert each selected point (θ, ρ) back into a straight line in the image.
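
Before the article's own code, here is a compact sketch of the four steps on a synthetic rectangle outline. It is one possible reading of the steps, not the author's implementation: the function names, the vectorized voting with np.add.at, and the min_sep parameter (which keeps the second ρ peak away from the bins next to the first) are assumptions made for this example.

```python
import numpy as np

def hough_accumulator(edge, n_theta=180):
    """Steps (1)-(2): vote all edge pixels into a (rho, theta) accumulator."""
    h, w = edge.shape
    diag = int(np.ceil(np.hypot(h, w)))            # bound on |rho|
    thetas = np.deg2rad(np.arange(n_theta))        # 0..179 degrees
    acc = np.zeros((2 * diag + 1, n_theta), dtype=np.int64)
    ys, xs = np.nonzero(edge)
    rhos = np.outer(xs, np.cos(thetas)) + np.outer(ys, np.sin(thetas))
    rows = np.round(rhos).astype(int) + diag       # row 'diag' holds rho = 0
    cols = np.broadcast_to(np.arange(n_theta), rows.shape)
    np.add.at(acc, (rows, cols), 1)                # unbuffered, so repeats count
    return acc, diag

def two_peaks(column, min_sep):
    """Two strongest bins of one theta column, more than min_sep rows apart."""
    first = int(np.argmax(column))
    masked = column.copy()
    masked[max(first - min_sep, 0):first + min_sep + 1] = -1
    return first, int(np.argmax(masked))

def detect_rectangle(edge, min_sep=10):
    """Steps (3)-(4): dominant theta, its perpendicular, two rho peaks each."""
    acc, diag = hough_accumulator(edge)
    theta1 = int(np.argmax(acc.max(axis=0)))
    theta2 = (theta1 + 90) % 180
    lines = []
    for theta in (theta1, theta2):
        for row in two_peaks(acc[:, theta], min_sep):
            lines.append((row - diag, theta))      # (rho in pixels, theta in degrees)
    return lines

# Synthetic edge map: outline of an axis-aligned rectangle.
edge = np.zeros((130, 160), dtype=np.uint8)
edge[40, 30:121] = edge[90, 30:121] = 1            # horizontal sides, y = 40 and 90
edge[40:91, 30] = edge[40:91, 120] = 1             # vertical sides, x = 30 and 120

rect_lines = detect_rectangle(edge)
print(rect_lines)
```

On this input the four recovered lines are ρ = 40 and ρ = 90 at θ = 90°, and ρ = 30 and ρ = 120 at θ = 0°, matching the sides that were drawn. Real edge maps are noisier, so min_sep and the bin sizes would need tuning.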

# Test: map a point from Cartesian coordinates to the Hough domain
rho = []
rad = []
for theta in range(max_theta):
    # print(theta)
    theta = np.pi*theta/180
    rad.append(theta)
    rho1 = 100 * np.cos(theta) + 100 * np.sin(theta)
    rho.append(rho1)
plt.plot([0, x_data[-1]], [0, 0])
plt.plot(x_data, rho), plt.xlabel('theta'), plt.ylabel('rho')
plt.xlim([0, 180])
plt.show()

Convert the Cartesian coordinate system (image coordinates) to the Hough domain (parameter coordinates):

H = np.zeros([round(edge.shape[0]*3.2), max_theta])
points = []  # set of edge points
for y in range(edge.shape[0]):
    for x in range(edge.shape[1]):
        if edge[y, x] > 0:
            p1 = [x, y]
            points.append(p1)

for p2 in points:
    # print(p2)
    for theta in range(max_theta):
        theta_rad = np.pi * theta / 180
        rho = p2[0] * np.cos(theta_rad) + p2[1] * np.sin(theta_rad)
        H[round(rho+H.shape[0]/2), round(theta)] += 1  # shift rho so that the middle row of H corresponds to rho = 0
# H[:, 0] = H[:, 0] + H[:, -1]

H = np.flipud(H)

hough_max = np.max(H, 0)
# plt.subplot(211), plt.imshow(H)
# plt.subplot(212), plt.plot(x_data, hough_max)
plt.imshow(H)
plt.show()

Find the parallel and perpendicular lines in the Hough domain (four points in total) and mark them in the Hough image:

# Find the parallel and perpendicular lines in the Hough domain: four points
theta_rho_max1 = np.argmax(hough_max)
theta_rho_max2 = theta_rho_max1 + 90
theta_rho_max2 = theta_rho_max2 % 180
print(theta_rho_max1)
print(theta_rho_max2)
rho_max1 = round(hough_max0[theta_rho_max1])  # peak vote count in the column of the dominant theta
temp = np.where(H[:, theta_rho_max1] == rho_max1)
rho_max1_index = temp[0][0]
rho_max11_index = round(hough_max2(H[:, theta_rho_max1], rho_max1_index))
rho_max2 = round(hough_max0[theta_rho_max2])
temp = np.where(H[:, theta_rho_max2] == rho_max2)
rho_max2_index = temp[0][0]
rho_max22_index = round(hough_max2(H[:, theta_rho_max2], rho_max2_index))

# Mark the four detected points in the Hough image
H[rho_max1_index, theta_rho_max1] = 100
H[rho_max11_index, theta_rho_max1] = 100
H[rho_max2_index, theta_rho_max2] = 100
H[rho_max22_index, theta_rho_max2] = 100
plt.imshow(H), plt.xlim([0, 180])
plt.show()

The complete code is as follows:

# -*- coding: UTF-8 -*-
"""
Implement the Hough transform and use it to detect an ROI
"""
import numpy as np
import matplotlib.pyplot as plt
import cv2

img = cv2.imread('../images/hough_test2.jpg', -1)
scale_factor = 1.0/2.0
img = cv2.resize(img, None, fx=scale_factor, fy=scale_factor)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(np.shape(gray))

# ret, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU)
edge = cv2.Canny(gray, 17430*scale_factor, 34000*scale_factor, apertureSize=7, L2gradient=True)

# cv2.imshow("", bw)
# cv2.waitKey()
# edge = cv2.Canny(gray, 17, 34, apertureSize=7, L2gradient=True)
# ============================OpenCV Hough transform=============================
# lines = cv2.HoughLines(edge, rho=1, theta=np.pi/180, threshold=round(170*scale_factor))
# if lines is not None:
#     for line in lines:
#         rho, theta = line[0]
#         a = np.cos(theta)
#         b = np.sin(theta)
#         x0 = a * rho
#         y0 = b * rho
#         x1 = int(x0 + 1000 * (-b))
#         y1 = int(y0 + 1000 * (a))
#         x2 = int(x0 - 1000 * (-b))
#         y2 = int(y0 - 1000 * (a))
#         cv2.line(img, (x1, y1), (x2, y2), (0, 20, 250), 2)
# cv2.imshow("img", img)
# cv2.imshow("edge", edge)
# cv2.waitKey()
# cv2.destroyAllWindows()
# ============================Custom Hough transform=============================


def mean_filter1(data, fsize):
    """Smooth a 1-D array with a moving average of roughly fsize samples."""
    n = np.shape(data)[0]
    dst = np.zeros(n)  # 1-D output, same shape as the input
    hsize = round(fsize / 2)
    for i in range(n):
        # Clamp the window [left, right) to the array bounds
        left = max(i - hsize, 0)
        right = min(i + hsize, n - 1)
        dst[i] = np.mean(data[left:right])
    return dst


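def hough_max2(arr, m1=0):
    """Return the index of the largest element of arr, ignoring index m1.

    Used to find the second peak in a Hough column once the first is known.
    """
    n = len(arr)
    m = 0
    max_index = 0
    for i in range(n):
        if arr[i] > m and i != m1:
            m = arr[i]
            max_index = i
    return max_index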


max_theta = 181
x_data = np.arange(max_theta)

# Test: map a point from Cartesian coordinates to the Hough domain
rho = []
rad = []
for theta in range(max_theta):
    # print(theta)
    theta = np.pi*theta/180
    rad.append(theta)
    rho1 = 100 * np.cos(theta) + 100 * np.sin(theta)
    rho.append(rho1)
plt.plot([0, x_data[-1]], [0, 0])
plt.plot(x_data, rho), plt.xlabel('theta'), plt.ylabel('rho')
plt.xlim([0, 180])
plt.show()

H = np.zeros([round(edge.shape[0]*3.2), max_theta])
points = []  # set of edge points
for y in range(edge.shape[0]):
    for x in range(edge.shape[1]):
        if edge[y, x] > 0:
            p1 = [x, y]
            points.append(p1)


for p2 in points:
    # print(p2)
    for theta in range(max_theta):
        theta_rad = np.pi * theta / 180
        rho = p2[0] * np.cos(theta_rad) + p2[1] * np.sin(theta_rad)
        H[round(rho+H.shape[0]/2), round(theta)] += 1  # shift rho so that the middle row of H corresponds to rho = 0
# H[:, 0] = H[:, 0] + H[:, -1]

H = np.flipud(H)

hough_max = np.max(H, 0)
# plt.subplot(211), plt.imshow(H)
# plt.subplot(212), plt.plot(x_data, hough_max)
plt.imshow(H)
plt.show()

hough_max0 = hough_max
hough_max = mean_filter1(hough_max, 10)
kernelSize = 10
hough_max_nms = np.zeros([hough_max.shape[0]])
for i in range(hough_max.shape[0]):
    temp = hough_max[i:i+kernelSize]
    max1 = np.max(temp)
    hough_max_nms[i] = max1

for i in range(hough_max.shape[0]):
    if hough_max[i] == hough_max_nms[i]:
        hough_max_nms[i] = hough_max[i]
    else:
        hough_max_nms[i] = 0

plt.plot(x_data, hough_max)
for i in range(hough_max_nms.shape[0]):
    if hough_max_nms[i] > 0:
        plt.scatter(x_data[i], hough_max_nms[i], c='r')
plt.xlabel('theta'), plt.ylabel('rho')
plt.show()

# Find the parallel and perpendicular lines in the Hough domain: four points
theta_rho_max1 = np.argmax(hough_max)
theta_rho_max2 = theta_rho_max1 + 90
theta_rho_max2 = theta_rho_max2 % 180
print(theta_rho_max1)
print(theta_rho_max2)
rho_max1 = round(hough_max0[theta_rho_max1])  # peak vote count in the column of the dominant theta
temp = np.where(H[:, theta_rho_max1] == rho_max1)
rho_max1_index = temp[0][0]
rho_max11_index = round(hough_max2(H[:, theta_rho_max1], rho_max1_index))
rho_max2 = round(hough_max0[theta_rho_max2])
temp = np.where(H[:, theta_rho_max2] == rho_max2)
rho_max2_index = temp[0][0]
rho_max22_index = round(hough_max2(H[:, theta_rho_max2], rho_max2_index))

# Mark the four detected points in the Hough image
H[rho_max1_index, theta_rho_max1] = 100
H[rho_max11_index, theta_rho_max1] = 100
H[rho_max2_index, theta_rho_max2] = 100
H[rho_max22_index, theta_rho_max2] = 100
plt.imshow(H), plt.xlim([0, 180])
plt.show()

hp1 = [rho_max1_index, theta_rho_max1]
hp2 = [rho_max11_index, theta_rho_max1]
hp3 = [rho_max2_index, theta_rho_max2]
hp4 = [rho_max22_index, theta_rho_max2]
hough_points = [hp1, hp2, hp3, hp4]
print(hough_points)

for line in hough_points:
    rho = -(line[0] - H.shape[0]/2 + 1)
    theta = line[1]
    theta = np.pi * theta / 180
    a = np.cos(theta)
    b = np.sin(theta)
    x0 = a*rho  # convert back to Cartesian coordinates
    y0 = b*rho
    x1 = int(x0 + 1000*(-b))
    y1 = int(y0 + 1000*(a))
    x2 = int(x0 - 1000*(-b))
    y2 = int(y0 - 1000*(a))

    cv2.line(img, (x1, y1), (x2, y2), (250, 0, 0), 2)
cv2.imshow("edge", edge)
cv2.imshow("", img)
cv2.waitKey()
cv2.destroyAllWindows()
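
The script above ends by drawing the four lines. If the ROI is needed as a region rather than as lines, its corners can be obtained by intersecting each line of one pair with each line of the perpendicular pair. The sketch below does this by solving the 2x2 system formed by two normal-form equations ρ = x cos θ + y sin θ; the function name intersect and the four (ρ, θ) values are hypothetical and chosen only for illustration.

```python
import numpy as np

def intersect(line_a, line_b):
    """Intersection (x, y) of two lines given as (rho, theta) in normal form."""
    (rho_a, th_a), (rho_b, th_b) = line_a, line_b
    A = np.array([[np.cos(th_a), np.sin(th_a)],
                  [np.cos(th_b), np.sin(th_b)]])
    # np.linalg.solve raises LinAlgError if the two lines are parallel.
    return np.linalg.solve(A, np.array([rho_a, rho_b]))

# Hypothetical peaks: two vertical sides (theta = 0), two horizontal (theta = pi/2).
vertical = [(30.0, 0.0), (120.0, 0.0)]
horizontal = [(40.0, np.pi / 2), (90.0, np.pi / 2)]

corners = [intersect(v, h) for v in vertical for h in horizontal]
print(np.round(corners, 3))
```

The four corners can then be used to crop or mask the ROI, for instance via their bounding box when the rectangle is axis-aligned.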
