1. Introduction
Today we will focus on the Hough transform, a classic algorithm for line detection that maps each point in an image to a curve in parameter space. The Hough transform can detect lines in any orientation and tolerates a fair amount of noise and small gaps in the edges.
Without further ado, let's get started!
2. Basic knowledge
In order to understand how the Hough transform works, we first need to understand how a straight line is defined in polar coordinates. A straight line is described by ρ (the perpendicular distance from the origin to the line) and θ (the angle between that perpendicular and the x-axis), as shown in the figure below:
Therefore, the equation of the straight line is:
$$y = -\frac{\cos\theta}{\sin\theta}\,x + \frac{\rho}{\sin\theta}$$
We can rearrange it to obtain the following formula:
$$\rho = x\cos\theta + y\sin\theta$$
From the above equation, we can see that all points (x, y) that satisfy it for the same values of ρ and θ lie on one straight line. The basis of our algorithm is to compute the value of ρ for each edge point in the image for all possible values of θ.
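As a quick check of the formula ρ = x cos θ + y sin θ, the sketch below takes a few points on the line y = -x + 4 (an example chosen here for illustration, not taken from the original figure) and evaluates ρ for each of them. At θ = 45°, the direction of this line's normal, every point should give the same ρ; at another angle the values spread out.

```python
import math

def rho(x, y, theta_deg):
    """Evaluate rho = x*cos(theta) + y*sin(theta) for one point."""
    t = math.radians(theta_deg)
    return x * math.cos(t) + y * math.sin(t)

# Four points on the line y = -x + 4 (illustrative values).
points = [(0, 4), (1, 3), (2, 2), (4, 0)]

rhos_45 = [rho(x, y, 45) for x, y in points]  # theta along the line's normal
rhos_30 = [rho(x, y, 30) for x, y in points]  # some other angle

print([round(r, 3) for r in rhos_45])
print([round(r, 3) for r in rhos_30])
```

This agreement at one particular (ρ, θ) is what the voting step later exploits.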
3. Algorithm principle
The processing steps of Hough transform are as follows:
1) First we create a parameter space (also called Hough space). The parameter space is a two-dimensional matrix of ρ and θ, where θ ranges from 0° to 179° in 1° steps.
2) Before running the algorithm, we detect the edges of the image using an edge detection algorithm such as Canny. Pixels with a value of 255 are considered edges.
3) Next we scan the image pixel by pixel to find these edge pixels and calculate ρ for each one, using θ values from 0° to 179°. Pixels on the same line produce the same (ρ, θ) pair at that line's angle. Each computed pair receives a vote with a weight of 1 in Hough space.
4) Finally, the (ρ, θ) pairs whose votes exceed a certain threshold are considered straight lines.
The code is as follows:
import numpy as np

def hough(img):
    # Create the parameter space (Hough space).
    # A dictionary maps each (rho, theta) pair to its vote count.
    H = dict()
    # Largest possible rho: the length of the image diagonal
    max_rho = int(np.ceil(np.sqrt(np.square(img.shape[0]) + np.square(img.shape[1]))))
    # Collect the (row, col) coordinates of all pixels with a value above 0 (not black)
    co = np.where(img > 0)
    co = np.array(co).T
    for point in co:
        for t in range(180):
            # Compute rho for theta = 0..179 degrees (row is y, col is x)
            d = point[0] * np.sin(np.deg2rad(t)) + point[1] * np.cos(np.deg2rad(t))
            d = int(d)
            # Compare with the extreme case for the image (its diagonal)
            if d < max_rho:
                # Add one vote, creating the bin on first use
                H[(d, t)] = H.get((d, t), 0) + 1
    return H
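To see the voting and the final thresholding step together, here is a small self-contained sketch that uses only the standard library. It is not the implementation above: the names `hough_votes` and `lines_above` are made up for illustration, the edge pixels are passed as (row, col) pairs instead of an image, and ρ is rounded to the nearest integer rather than truncated.

```python
import math
from collections import Counter

def hough_votes(points, n_theta=180):
    """Accumulate votes for (rho, theta_deg) bins from (row, col) edge points."""
    votes = Counter()
    for row, col in points:
        for t in range(n_theta):
            rad = math.radians(t)
            # rho = x*cos(theta) + y*sin(theta), with x = col and y = row
            r = round(col * math.cos(rad) + row * math.sin(rad))
            votes[(r, t)] += 1
    return votes

def lines_above(votes, threshold):
    """Keep only the bins whose vote count reaches the threshold."""
    return {key: count for key, count in votes.items() if count >= threshold}

# Synthetic edge: 100 pixels on the horizontal line row = 5.
edge_points = [(5, col) for col in range(100)]
votes = hough_votes(edge_points)
best = max(votes, key=votes.get)

print(best, votes[best])
print(lines_above(votes, 60))
```

For this synthetic edge the strongest bin is ρ = 5 at θ = 90°, and the neighboring angles collect fewer votes because the points spread over several ρ bins there.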
4. Algorithm application
In this article, we will detect the corners of an object (a book) in an image. This may seem like a simple task, but it gives us insight into the process of detecting straight lines using the Hough transform.
4.1 Converting to HSV color space
Since it is difficult to isolate the target directly in the RGB image, we convert the image to HSV color space, where the target can be selected with a simple range of HSV values.
The core code is as follows:
import cv2

img = cv2.imread("book.jpeg")
scale_percent = 30  # percent of original size
width = int(img.shape[1] * scale_percent / 100)
height = int(img.shape[0] * scale_percent / 100)
dim = (width, height)
# Resize the image
img = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)
# Convert from BGR (OpenCV's default channel order) to HSV
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
The results obtained are as follows:
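For intuition about what the conversion does to a single pixel, the following sketch reproduces OpenCV's 8-bit HSV scaling (H in 0 to 179, S and V in 0 to 255) with Python's standard `colorsys` module. The helper name `bgr_to_hsv8` is hypothetical, and its rounding may differ from `cv2.cvtColor` by one unit on some inputs.

```python
import colorsys

def bgr_to_hsv8(b, g, r):
    """Convert one 8-bit BGR pixel to OpenCV-style 8-bit HSV ranges."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # colorsys returns h, s, v in [0, 1]; rescale to H: 0-179, S and V: 0-255
    return (round(h * 180) % 180, round(s * 255), round(v * 255))

dark_gray = bgr_to_hsv8(50, 50, 50)
pure_red = bgr_to_hsv8(0, 0, 255)    # BGR order: blue=0, green=0, red=255
pure_blue = bgr_to_hsv8(255, 0, 0)

print(dark_gray, pure_red, pure_blue)
```

Brightness ends up in the V channel alone, so a dark object can be selected by limiting V regardless of its hue.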
4.2 Gaussian Blur
We apply a Gaussian blur to smooth out the rough edges caused by noise, which makes the target stand out more clearly in the image. The code is as follows:
# Apply Gaussian blur to the HSV image
blur = cv2.GaussianBlur(hsv, (9, 9), 3)
The result looks like this:
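The two parameters in the call are the kernel size (9 × 9) and the standard deviation σ = 3. The sketch below builds the corresponding normalized 1-D Gaussian weights in plain Python; a 2-D Gaussian blur applies such weights along the rows and then along the columns. The function name `gaussian_kernel_1d` is an illustrative assumption, not an OpenCV API.

```python
import math

def gaussian_kernel_1d(size, sigma):
    """Return `size` Gaussian weights centered on the middle tap, summing to 1."""
    center = (size - 1) / 2
    weights = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(size)]
    total = sum(weights)
    return [w / total for w in weights]

kernel = gaussian_kernel_1d(9, 3)
print([round(w, 4) for w in kernel])
```

A larger σ flattens the weights and blurs more strongly; a larger kernel size lets the blur reach farther neighbors.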
4.3 Binarization and dilation operations
Then we use the inRange function to get the binarized image. This allows us to get rid of other surrounding objects in the image. The code is as follows:
# Define the color range for the book (in HSV format)
lower_color = np.array([0, 0, 24], np.uint8)
upper_color = np.array([179, 255, 73], np.uint8)
# Define the kernel size for the morphological operations
kernel_size = 7
# Create a mask for the book color using cv2.inRange()
mask = cv2.inRange(blur, lower_color, upper_color)
The results are as follows:
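Conceptually, `cv2.inRange` sets a mask pixel to 255 when every channel lies within its inclusive lower and upper bound, and to 0 otherwise. Below is a per-pixel sketch in plain Python; `in_range` is a hypothetical stand-in for the OpenCV call, and the sample HSV values are made up.

```python
def in_range(pixel, lower, upper):
    """Return 255 if every channel is within [lower, upper], else 0."""
    inside = all(lo <= c <= hi for c, lo, hi in zip(pixel, lower, upper))
    return 255 if inside else 0

lower_color = (0, 0, 24)
upper_color = (179, 255, 73)

# Hypothetical HSV pixels (H, S, V) for illustration.
samples = {
    "dark cover": (105, 90, 50),
    "bright background": (20, 40, 200),
    "near black": (0, 0, 10),
}
mask_values = {name: in_range(hsv, lower_color, upper_color) for name, hsv in samples.items()}
print(mask_values)
```

With these bounds, hue and saturation are unrestricted, so the mask effectively selects pixels by brightness (V between 24 and 73).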
Looking at the mask, we can see that it contains some gaps. We can use a dilation operation to fill these gaps. The code is as follows:
# Apply a morphological dilation to the mask to fill in gaps
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (kernel_size, kernel_size))
mask = cv2.dilate(mask, kernel, iterations=1)
The result is as follows:
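To illustrate why dilation closes gaps, here is a minimal plain-Python sketch of binary dilation. It assumes a square kernel for simplicity, whereas the code above uses a 7 × 7 elliptical one, and the name `dilate` and the tiny mask are illustrative only.

```python
def dilate(mask, ksize):
    """Binary dilation of a 0/255 mask with a ksize x ksize square kernel."""
    h, w = len(mask), len(mask[0])
    r = ksize // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # A pixel becomes white if any pixel under the kernel is white.
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w and mask[yy][xx]:
                        out[y][x] = 255
    return out

mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 255, 255, 0, 255, 255, 0],
    [0, 255, 255, 0, 255, 255, 0],  # column 3 is a one-pixel gap
    [0, 0, 0, 0, 0, 0, 0],
]
filled = dilate(mask, 3)
for row in filled:
    print(row)
```

The gap in column 3 is filled, but the white region also grows outward by the kernel radius, which slightly shifts the detected edges.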
4.4 Edge detection
We use the Canny edge detector to find the edges. It works well here because of the high contrast between the target and the surrounding background. The code is as follows:
# Use Canny to get the edges of the image mask
edges = cv2.Canny(mask, 200, 240, apertureSize=3)
The result is shown in the figure below:
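On a clean binary mask, the detected edges are essentially the outline of the white region. The sketch below is not Canny (it has no gradient computation, non-maximum suppression, or hysteresis); it is a deliberately simplified stand-in that marks every white pixel with a black 4-neighbor, just to show what kind of pixels the Hough step receives. The name `mask_boundary` is hypothetical.

```python
def mask_boundary(mask):
    """Mark white pixels that touch at least one black 4-neighbor."""
    h, w = len(mask), len(mask[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w and not mask[yy][xx]:
                    edges[y][x] = 255
                    break
    return edges

# A 3 x 3 white square inside a 5 x 5 black image.
mask = [[255 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)] for y in range(5)]
edges = mask_boundary(mask)
edge_count = sum(v == 255 for row in edges for v in row)
print(edge_count)
```

Only the outline of the square remains; the interior pixel is dropped.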
4.5 Hough Transform
When we run Canny edge detection, we get a lot of edge pixels, so when we run the Hough algorithm, each physical edge contributes many candidate lines. To solve this problem, we cluster neighboring values of ρ and θ in Hough space, average their values, and sum their votes. This merges the lines that depict the same edge. The code is as follows:
# Get the Hough space, sort by votes, and keep the top 20 bins
hough_space = dict(sorted(hough(edges).items(), key=lambda item: item[1], reverse=True)[:20])
# Sort the Hough space by rho and theta
sorted_hough_space_unfiltered = dict(sorted(hough_space.items()))
# Merge neighboring rho and theta values (unique is a helper whose definition is not shown here)
unique_ = unique(sorted_hough_space_unfiltered)
# Sort by votes and keep the top 4 lines
unique_ = dict(sorted(unique_.items(), key=lambda item: item[1], reverse=True)[:4])
The result is as follows:
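The `unique` helper is not shown in this article. As a guess at what such a merging step could look like, here is a sketch under stated assumptions: the name `merge_close_lines` and the tolerances `rho_tol` and `theta_tol` are invented for illustration, and the wrap-around between θ near 0° and θ near 179° is not handled.

```python
def merge_close_lines(space, rho_tol=10, theta_tol=5):
    """Merge (rho, theta) bins that lie close together; average them, sum votes."""
    clusters = []  # each cluster: [list of rhos, list of thetas, total votes]
    for (rho, theta), votes in sorted(space.items()):
        for cluster in clusters:
            mean_rho = sum(cluster[0]) / len(cluster[0])
            mean_theta = sum(cluster[1]) / len(cluster[1])
            if abs(rho - mean_rho) <= rho_tol and abs(theta - mean_theta) <= theta_tol:
                cluster[0].append(rho)
                cluster[1].append(theta)
                cluster[2] += votes
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters.append([[rho], [theta], votes])
    return {
        (round(sum(rhos) / len(rhos)), round(sum(thetas) / len(thetas))): total
        for rhos, thetas, total in clusters
    }

# Made-up Hough bins: three near (101, 45) and two near (301, 90).
space = {(100, 45): 30, (101, 45): 28, (102, 46): 25, (300, 90): 40, (302, 90): 35}
merged = merge_close_lines(space)
print(merged)
```

Five candidate bins collapse into two lines, each carrying the combined votes of its cluster.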
4.6 Calculating corner points
From the straight lines obtained in Hough space, we can solve for the corners using linear algebra: the intersection point of each pair of lines is a candidate corner of the book. The code is as follows:
from itertools import combinations

# Create all pairs of lines
line_combinations = list(combinations(unique_.items(), 2))
intersection = []
filter_int = []
for (key1, value1), (key2, value2) in line_combinations:
    try:
        # Solve for the point of intersection of the two lines
        # (intersection_point is a helper whose definition is not shown here)
        intersection.append(intersection_point(key1[0], np.deg2rad(key1[1]), key2[0], np.deg2rad(key2[1])))
    except np.linalg.LinAlgError:
        # Parallel lines give a singular system; this assumes the helper solves it with NumPy
        print("Singular Matrix")
for x, y in intersection:
    if x > 0 and y > 0:
        # Keep and draw only the valid Cartesian coordinates
        cv2.circle(img, (x, y), 5, (0, 0, 0), -1)
        cv2.putText(img, '{},{}'.format(x, y), (x - 10, y), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (255, 255, 255), 1)
        filter_int.append([x, y])
The final output is shown in the figure below:
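The intersection helper itself is not shown in this article. One way such a solver could be written is sketched below with Cramer's rule in plain Python. Each line satisfies x cos θ + y sin θ = ρ, so two lines give a 2 × 2 linear system. The name `intersect_polar_lines` is hypothetical, θ is taken in radians, and parallel lines raise `ValueError` here rather than a NumPy error.

```python
import math

def intersect_polar_lines(rho1, theta1, rho2, theta2):
    """Intersect two lines given as x*cos(theta) + y*sin(theta) = rho."""
    a1, b1 = math.cos(theta1), math.sin(theta1)
    a2, b2 = math.cos(theta2), math.sin(theta2)
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-9:
        raise ValueError("lines are parallel; no unique intersection")
    # Cramer's rule for the system [[a1, b1], [a2, b2]] @ [x, y] = [rho1, rho2]
    x = (rho1 * b2 - rho2 * b1) / det
    y = (a1 * rho2 - a2 * rho1) / det
    return round(x), round(y)

# The vertical line x = 3 (theta = 0) meets the horizontal line y = 4 (theta = 90 degrees).
corner = intersect_polar_lines(3, 0.0, 4, math.pi / 2)
print(corner)
```

Rounding to integers keeps the result usable as pixel coordinates for drawing functions.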
5. Summary
Although this algorithm is now integrated in a wide variety of image processing libraries, by implementing it ourselves, this article provides insight into the challenges and limitations faced when creating such a complex algorithm.
So, have you got the hang of it?