[Lane Line Detection] Hough Transform (HoughLines) to detect straight lines in detail

Summary

The Hough transform is an idea for detecting any shape that can be expressed mathematically, even if the shape is broken or slightly distorted.

The Hough transform maps the points of a shape in the image into a parameter space, accumulates votes there, and takes the parameters with the largest vote count as the parameters of the shape being sought. For a straight line these are the slope k and the intercept b; for a circle, the center and the radius.

A natural first choice is to use (k, b) as the parameter space. Points in the Cartesian coordinate system then become lines in the new space, and straight lines in the Cartesian coordinate system become points.
Finding the straight line that passes through the most points in the Cartesian coordinate system is therefore equivalent to finding the point in the new space where the most lines intersect. The k and b values of that point are the k and b of the line we are looking for.

In practice, however, the transform does not use the (k, b) space but a polar parameter space described by ρ and θ. The reason is explained in detail below.

Detailed explanation

Using k, b as the parameter space (not workable)

In the Cartesian plane (x and y axes), a line is defined by the formula y=kx+b, where x and y correspond to a particular point on the line, and k and b correspond to the slope and y-intercept, respectively.
The line is plotted as a function of x and y, that is, as the set of all (x, y) pairs that satisfy the equation. There are infinitely many such pairs, which is why the line extends to infinity.

However, a straight line can also be described by its k and b values alone. The space spanned by these parameters is called Hough space. To understand the Hough transform algorithm, we need to understand how Hough space works.

Hough Space Interpretation

In our use case, we can summarize the Hough space in two lines:

  • Points on the Cartesian plane become straight lines in Hough space
  • Lines on the Cartesian plane become points in Hough space

Think about what a line is: an infinite set of points arranged one after another. On the Cartesian plane we draw a line as a function of x and y, and it is infinitely long because infinitely many (x, y) pairs make up the line.

Now in Hough space, we plot lines by their k and b values. Because each Cartesian line has exactly one k and one b, the line is represented as a single point.

For example, the equation y=2x+1 represents a straight line on the Cartesian plane. Its k and b values are 2 and 1 respectively, which are the only possible k and b values for this equation. On the other hand, infinitely many values of x and y make the equation hold (left side = right side).
If we draw the equation by its k and b values, we use only the point (2,1); if we draw it by its x and y values, we have infinitely many choices because there are infinitely many (x, y) pairs.

Now consider a point on the Cartesian plane. Only one (x, y) pair represents it, so it is a point and not an infinitely long line. At the same time, infinitely many lines can pass through this point. In other words, the point satisfies infinitely many (k, b) pairs.
In the Cartesian plane we plot this point by its x and y values. In Hough space we plot it by the k and b values of the lines through it, and since there are infinitely many such lines, we get an infinitely long line in Hough space.
Take the point (3,4) as an example. Lines that pass through it include y = -4x + 16, y = -8/3x + 12 and y = -4/3x + 8 (there are infinitely many such lines, but for simplicity we use 3).

If you plot each of these lines as a point in Hough space, (-4, 16), (-8/3, 12) and (-4/3, 8), the points fall on a single straight line in Hough space. That line corresponds to the point (3,4).
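The example can be checked numerically. A line y = kx + b passes through (x0, y0) exactly when b = -x0·k + y0, so every line through (3,4) should land on the Hough-space line b = -3k + 4. The following is a minimal sketch; the helper name hough_line_of_point is illustrative and not part of any library.

```python
from fractions import Fraction

def hough_line_of_point(x0, y0):
    """Return a function k -> b describing the Hough-space line of the point (x0, y0)."""
    return lambda k: -x0 * k + y0

# The three Cartesian lines through (3, 4) used in the text, as (k, b) pairs
lines_through_point = [
    (Fraction(-4), Fraction(16)),
    (Fraction(-8, 3), Fraction(12)),
    (Fraction(-4, 3), Fraction(8)),
]

b_of_k = hough_line_of_point(3, 4)
for k, b in lines_through_point:
    # Each (k, b) lies on b = -3k + 4, so the three points are collinear in Hough space
    print(k, b, b_of_k(k) == b)
```

Exact fractions are used so that the comparison does not depend on floating-point rounding.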
Now what if we place another point on the Cartesian plane? What will be the result of this in Hough space? With Hough space, we can find the line on the Cartesian plane that best fits these two points.

We can do this by drawing straight lines in Hough space corresponding to two points in Cartesian space, and finding the point (POI, point of intersection) where these two lines intersect in Hough space.

To summarize the above:

  • Lines on the Cartesian plane are represented as points in Hough space
  • A point on the Cartesian plane is represented as a line in Hough space
  • By finding the k and b coordinates of the POI of the two Hough-space lines that correspond to two points, we obtain the straight line through those two points in Cartesian space; the line is then drawn from these k and b values.
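The intersection described in the summary can be computed directly. Each point (x, y) contributes the Hough-space line b = -x·k + y, and solving two such equations gives the POI. The sketch below is illustrative (line_through is a hypothetical helper name); it also shows where the (k, b) representation stops working.

```python
def line_through(p1, p2):
    """Intersect the Hough-space lines b = -x*k + y of two points; returns (k, b)."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        # The two Hough-space lines are parallel: no finite (k, b) exists
        raise ValueError("vertical line: slope k is undefined")
    k = (y2 - y1) / (x2 - x1)
    b = y1 - k * x1
    return k, b

# Two points taken from the line y = 2x + 1
print(line_through((0, 1), (1, 3)))   # (2.0, 1.0)
```

Two points with the same x coordinate have parallel Hough-space lines, so the function raises an error instead of returning a POI.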

But for a vertical line the slope is infinite, and infinity cannot be represented in a bounded Hough space, so a (k, b) accumulator cannot handle such lines. So instead of describing a line by y=kx+b, we use ρ and θ to define it, which is the line's representation in the polar coordinate system.

Convert to ρ, θ as a polar coordinate system representation of the parameter space


Taking straight line detection as an example, suppose there is a straight line L. The perpendicular distance from the origin to the line is ρ, and the angle between that perpendicular and the x-axis is θ. The pair (ρ, θ) determines the line uniquely, and the equation of the line is ρ = x cos θ + y sin θ.
Similarly, a straight line corresponds to exactly one (ρ, θ) in the polar coordinate system (like k and b, but expressed differently). If either ρ or θ changes, the corresponding straight line in the spatial domain changes as well.
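To make the (ρ, θ) form concrete, here is a small sketch that computes it for the line through two points. It assumes the convention ρ ≥ 0 with θ in [0, 2π), matching the 0 to 360 degree range used in this article; normal_form is a hypothetical helper name, not a library function.

```python
import math

def normal_form(p1, p2):
    """Return (rho, theta) with rho >= 0 and theta in [0, 2*pi) for the line through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("the two points must be distinct")
    # Unit normal of the line: the direction (dx, dy) rotated by 90 degrees
    nx, ny = -dy / length, dx / length
    rho = nx * x1 + ny * y1
    if rho < 0:
        # Flip the normal so that rho is a non-negative distance
        nx, ny, rho = -nx, -ny, -rho
    theta = math.atan2(ny, nx) % (2 * math.pi)
    return rho, theta

# The vertical line x = 5 has no finite slope, but a perfectly ordinary (rho, theta)
rho, theta = normal_form((5, 0), (5, 10))
print(rho, theta)
```

For the vertical line x = 5 the result is ρ = 5 and θ = 0, so the case that breaks the (k, b) representation needs no special handling here.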

All points on the line L lie on the line represented by (ρ, θ), but each of them also lies on other lines (ρ1, θ1), (ρ2, θ2), and so on. For example, a point (x, y) on L lies on every line that passes through it. θ simply ranges from 0 to 360 degrees (0 to 2π). If we take one line per degree and require that (x, y) lies on it, then (x, y) appears on 360 lines. What does this look like when plotted in the (ρ, θ) coordinate system?
Plotted this way, a single point (x, y) in the spatial domain yields one plotted point per possible line through it: each plotted point is a (ρ, θ) pair that represents one straight line passing through (x, y). The number of plotted points depends on the step size of θ. A step of 1 degree gives 360 points, and a step of 10 degrees gives 36.
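The counts above can be reproduced with a few lines of code. This sketch evaluates ρ = x cos θ + y sin θ once per θ step for a single point; the function name and the sample point (3, 4) are illustrative choices.

```python
import math

def rho_theta_samples(x, y, step_deg):
    """List the (rho, theta_deg) pairs of the lines through (x, y), one per theta step."""
    samples = []
    for theta_deg in range(0, 360, step_deg):
        theta = math.radians(theta_deg)
        rho = x * math.cos(theta) + y * math.sin(theta)
        samples.append((rho, theta_deg))
    return samples

print(len(rho_theta_samples(3, 4, 1)))    # 360
print(len(rho_theta_samples(3, 4, 10)))   # 36
```

Note that ρ comes out negative for half of the angles. A pair (-ρ, θ) describes the same line as (ρ, θ + 180°), so implementations either keep only ρ ≥ 0 over 360 degrees or allow signed ρ over 180 degrees.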

What if we do this for every point in the spatial domain, sweeping θ through a full circle and plotting the results in the polar coordinate system? Each point then corresponds to a series of (ρ, θ) pairs in the parameter space. What happens when all of them are drawn in the same coordinate system?

For convenience, assume that 3 points are taken on this straight line.
In polar coordinates, each point in the spatial domain is represented by a periodic curve that describes all straight lines passing through it. The three curves pass through a common point (ρ', θ'). Each point in the polar coordinate system corresponds to a straight line in the spatial coordinate system, so there is a straight line that passes through point 1, point 2, and point 3. In other words, the three points are collinear, and the line is (ρ', θ'). Conversely, given the curves in the polar coordinate system, we only need to find the point where the most curves intersect and map it back to the spatial domain to obtain the line we are looking for. The intersection of the curves drawn from all points on one straight line is bound to be the point where the most curves intersect.
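The common intersection can be found numerically. The sketch below scans θ and looks for the angle at which the ρ values of all points coincide. As an assumption that differs from the 0 to 360 degree range in the text, θ is limited to [0°, 180°) with signed ρ so that the intersection is unique; the function name and the sample points are illustrative.

```python
import math

def common_line(points, step_deg=1):
    """Find the theta (in degrees, 0..179) at which the rho values of all points agree best."""
    best = None
    for theta_deg in range(0, 180, step_deg):
        theta = math.radians(theta_deg)
        rhos = [x * math.cos(theta) + y * math.sin(theta) for x, y in points]
        spread = max(rhos) - min(rhos)      # 0 when all curves meet at this theta
        if best is None or spread < best[0]:
            best = (spread, theta_deg, sum(rhos) / len(rhos))
    return best

# Three points on the line x + y = 4
spread, theta_deg, rho = common_line([(0, 4), (2, 2), (4, 0)])
print(theta_deg, rho, spread)
```

For these three points the curves meet at θ' = 45 degrees with ρ' = 4/√2 ≈ 2.83, which is the normal form of x + y = 4.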

As we can see, the Hough transform is a parameter mapping: each point is mapped, and mapped more than once. Both ρ and θ have a step size. Take the step size of θ as an example: the larger the step, the fewer (ρ, θ) pairs each point maps to, and vice versa. The mapped pairs are then searched for intersections. The curves drawn above are continuous, but because of the θ step size the real data is discrete (one line is taken every few degrees of rotation around the point). When the θ step is large, few intersections can be found, so the step must not be too large. In theory, the smaller the step the better, because the samples come closer to a continuous curve and intersections are easier to find.

The smaller the step, however, the greater the amount of computation. A 100 × 100 image (very small) has 10,000 points. If each point maps to 36 pairs (θ step of 10 degrees), a total of 360,000 mappings are required. Considering the cost of each mapping, the Hough transform is clearly expensive, so it has to be improved.

The first improvement concerns the image itself: is it necessary to process all 10,000 points of a 100 × 100 image? It is not. We first extract the edges from the image, generally with the Canny operator, which produces a black-and-white binary image in which white marks the edges. Only the edge points then need to be transformed into the parameter space. Why extract edges? The straight lines and circles to be detected in an image must have clear outlines. The number of points to transform may drop from 10,000 to about 1,000, which is why many Hough transform pipelines first extract the edges of the image and turn it into a binary image.
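The cost estimate is simple arithmetic and can be written down as a one-line function. This is only a sketch of the counting argument above (mapping_count is a hypothetical name); it counts (ρ, θ) evaluations and says nothing about the actual running time.

```python
def mapping_count(num_points, theta_step_deg):
    """Number of (rho, theta) evaluations when every point votes once per theta step."""
    return num_points * (360 // theta_step_deg)

print(mapping_count(100 * 100, 10))   # 360000: every pixel of a 100 x 100 image
print(mapping_count(1000, 10))        # 36000: only about 1,000 edge pixels
print(mapping_count(1000, 1))         # 360000: edge pixels at a 1-degree step
```

Restricting the transform to edge pixels buys a factor of ten here, which can be spent on a finer θ step at the same total cost.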

The Hough transform can then be implemented in the following steps:

(1) Quantize the parameter space (ρ, θ) and initialize a two-dimensional matrix M, where M(ρ, θ) is an accumulator. The rows of this array represent different values of ρ and the columns represent θ; initially all values are 0. The size of the array depends on the required precision. If the angle should be accurate to 1 degree, 360 columns are required. For ρ, the largest possible distance is the diagonal length of the image, so for one-pixel accuracy the number of rows is the length of the image diagonal in pixels.
(2) For each point on the edges of the image (the image after edge extraction described above), determine which (ρ, θ) cells it belongs to and increase the value of each of those cells by 1.
(3) When all points are processed, analyze M(ρ, θ) against a threshold T. Wherever M(ρ, θ) > T, a straight line is considered to exist, and the corresponding (ρ, θ) are its parameters. The value of T is chosen by experiment until the result is suitable.
(4) From (ρ, θ) and the points (x, y), the straight line in the image can be computed.
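Steps (1) to (3) can be sketched in plain Python as follows. This is an illustrative, unoptimized version and not OpenCV's implementation: it assumes the edge pixels are given as a list of (x, y) tuples, uses 360 θ columns with ρ ≥ 0 as described above, and the name hough_lines is hypothetical.

```python
import math

def hough_lines(edge_points, width, height, threshold, theta_step_deg=1):
    """Return (rho, theta_deg, votes) for every accumulator cell with more than `threshold` votes."""
    # Step 1: quantize (rho, theta) and zero the accumulator M
    max_rho = math.ceil(math.hypot(width, height))      # image diagonal
    theta_degs = list(range(0, 360, theta_step_deg))
    trig = [(math.cos(math.radians(t)), math.sin(math.radians(t))) for t in theta_degs]
    M = [[0] * len(theta_degs) for _ in range(max_rho + 1)]

    # Step 2: every edge point votes once per theta column
    for x, y in edge_points:
        for col, (c, s) in enumerate(trig):
            rho = round(x * c + y * s)
            if 0 <= rho <= max_rho:                      # keep rho as a non-negative distance
                M[rho][col] += 1

    # Step 3: cells above the threshold T are reported as lines
    return [(rho, theta_degs[col], votes)
            for rho, row in enumerate(M)
            for col, votes in enumerate(row)
            if votes > threshold]

# Synthetic edge image (made-up test data): a horizontal line y = 20 in a 50 x 50 image
points = [(x, 20) for x in range(50)]
detected = hough_lines(points, 50, 50, threshold=40)
print(detected)   # [(20, 90, 50)]
```

All 50 pixels vote for the cell ρ = 20, θ = 90 degrees, which is the horizontal line y = 20. Neighbouring cells collect fewer votes because rounding ρ spreads them over two rows, so the threshold of 40 isolates the single peak.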

Code example

The OpenCV functions that detect straight lines with the Hough transform are cv2.HoughLines() and cv2.HoughLinesP().
The cv2.HoughLines() function takes four inputs. The first is a binary image, typically the output of Canny edge detection. The second and third are the resolutions of ρ and θ, that is, their step sizes, which determine the size of the two-dimensional accumulator array. The fourth is the threshold T: an accumulator cell whose value is higher than T is considered a straight line. The function returns a numpy array of shape (N, 1, 2), meaning that N (ρ, θ) pairs were found in the binary image. ρ is measured in pixels (the distance from the line to the image origin (0,0)), and θ is in radians.

Here is a code example:

import cv2
import numpy as np
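# Detect straight lines in an image with the standard Hough transform
img = cv2.imread('img/computer.jpg')
if img is None:
    raise FileNotFoundError('img/computer.jpg could not be read')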
 
gray_img = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray_img,50,150,apertureSize=3)
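# rho step: 1 pixel; theta step: 1 degree (in radians); accumulator threshold: 150
lines = cv2.HoughLines(edges,1,np.pi/180,150)
# HoughLines returns None when no line reaches the threshold
for line in (lines if lines is not None else []):
    # Each entry has shape (1, 2) and holds (rho, theta)
    r,theta = line[0]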
    # Stores the value of cos(theta) in a
    a = np.cos(theta)
    # Stores the value of sin(theta) in b
    b = np.sin(theta)
    # x0 stores the value rcos(theta)
    x0 = a*r
    # y0 stores the value rsin(theta)
    y0 = b*r
    # Walk 1000 pixels from (x0, y0) along the line direction (-sin(theta), cos(theta));
    # int() truncates the coordinates to whole pixels
    x1 = int(x0 + 1000*(-b))
    y1 = int(y0 + 1000*(a))
    # and 1000 pixels in the opposite direction
    x2 = int(x0 - 1000*(-b))
    y2 = int(y0 - 1000*(a))
    # cv2.line draws a line in img from the point(x1,y1) to (x2,y2).
    # (0,0,255) denotes the colour of the line to be 
    #drawn. In this case, it is red. 
    cv2.line(img,(x1,y1), (x2,y2), (0,0,255),1)
 
 
cv2.imshow('edges',edges)
cv2.imshow('lines',img)
cv2.waitKey(-1)
cv2.destroyAllWindows()

You can modify the fourth parameter of cv2.HoughLines, the threshold T, to change how many straight lines are detected.

The function cv2.HoughLinesP() performs probabilistic line detection. The Hough transform is computationally expensive because every point must be mapped, and even after Canny edge detection the number of points can still be large. The probabilistic variant therefore does not process all points but a random subset of them, which is equivalent to downsampling. Because fewer points vote, the threshold should be lowered accordingly. The function takes two additional parameters: minLineLength (the minimum length of a line; shorter segments are ignored) and maxLineGap (the maximum gap between two segments on the same line for them to still be treated as one line). The output also changes: instead of line parameters, the function directly returns the endpoint coordinates of each line segment, which removes the need for the parameter-space-to-image conversion done in the for loop above.
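The roles of the two extra parameters can be pictured in one dimension. The sketch below is not OpenCV's implementation; it only illustrates, under the assumption that edge pixels have already been projected onto one line and reduced to integer positions, how a gap limit joins pixels into segments and a length limit discards short ones. The function name and the sample positions are made up for illustration.

```python
def segments_along_line(positions, min_line_length, max_line_gap):
    """Group sorted pixel positions along one line into segments (start, end)."""
    segments = []
    start = prev = None
    for pos in sorted(positions):
        if start is None:
            start = prev = pos
        elif pos - prev <= max_line_gap:
            prev = pos                       # gap small enough: extend the current segment
        else:
            if prev - start >= min_line_length:
                segments.append((start, prev))
            start = prev = pos               # gap too large: start a new segment
    if start is not None and prev - start >= min_line_length:
        segments.append((start, prev))
    return segments

pixels = [0, 1, 2, 3, 10, 11, 12, 13, 14, 15, 40]
print(segments_along_line(pixels, min_line_length=4, max_line_gap=5))   # [(10, 15)]
print(segments_along_line(pixels, min_line_length=4, max_line_gap=7))   # [(0, 15)]
```

Raising the gap limit from 5 to 7 bridges the hole between positions 3 and 10, so two short pieces become one segment that passes the length test.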

import cv2
import numpy as np
img = cv2.imread('img/computer.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray,50,150,apertureSize =3)
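lines = cv2.HoughLinesP(edges,1,np.pi/180,160,minLineLength=100,maxLineGap=10)
# HoughLinesP returns None when no segment is found
for line in (lines if lines is not None else []):
    # Each entry holds the two endpoints of one segment
    x1,y1,x2,y2 = line[0]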
    cv2.line(img,(x1,y1),(x2,y2),(0,0,255),1)
    
cv2.imshow('edges',edges)
cv2.imshow('lines',img)
cv2.waitKey(-1)
cv2.destroyAllWindows()


Reference for this article:
https://blog.csdn.net/piglite/article/details/118312270

Origin blog.csdn.net/All_In_gzx_cc/article/details/125545025