Interpretation of the paper: Robust Lane Detection and Tracking with RANSAC and Kalman Filter

Robust lane tracking using random sample consensus and Kalman filtering

Summary

        In a previous paper, we described a simple lane detection method using the Hough transform and iterated matched filters [1]. This paper extends that work by using inverse perspective mapping to create a bird's-eye view of the road, applying random sample consensus to help eliminate outliers caused by noise and artifacts on the road, and using a Kalman filter to help smooth the output of the lane tracker.

1. Introduction

        Driver safety on highways has been an area of concern for many years. With the development of fast, cheap, low-power, and sophisticated electronics, cars equipped with sensors, electronics, and early-warning systems have begun to appear on the market.

        One area of interest for research and development is collision avoidance, of which lane detection is an important component. The ability to detect sudden or unexpected lane changes can help drivers avoid a collision when there is a vehicle in the lane they are drifting into. Effectively monitoring a car's position within a lane can help avoid collisions caused by driver distraction, fatigue, or driving under the influence of a controlled substance. There are obvious difficulties and challenges in the design of collision avoidance systems, some of which extend beyond the realm of engineering and involve complex issues of law and liability.

        In this article, we address the design of a lane detection system. The paper is organized as follows. After a brief review of previous research, we describe the various components of the system: image preprocessing using temporal blurring, inverse perspective mapping to create a bird's-eye view of the road, a Hough transform to detect candidate lane markings, a random sample consensus algorithm to handle outliers in the image, and a Kalman filter to track the lane parameters. We then briefly describe the hardware used to collect the data and demonstrate the performance of the lane tracking system. The results show a considerable improvement in performance compared to the system in [1], which uses only the Hough transform and matched filtering.

2. Previous research

        Many vision-based lane detection techniques have been developed in an attempt to detect lanes robustly. Among feature extraction methods for lane detection, one of the most common is to apply an edge detector to the data [2, 3]. In this approach, a Canny edge detector is often used to generate a binary edge map, from which the classic Hough transform extracts a set of candidate lines for the lane markings. Although this method generally shows good results, the detected lanes are often skewed by surface irregularities or navigational text markings on the road. Extracting lane markings through color segmentation is another commonly used method [4, 5]. Unfortunately, color segmentation is sensitive to ambient light and requires additional processing to avoid undesirable effects.

        Most methods for lane detection directly manipulate the image captured by the camera without any geometric correction or changing the camera perspective [1, 2, 4, 6]. While processing images from the camera's perspective allows access to raw data values, defining the properties of features of interest can be complex. For example, a forward-looking camera will capture an image without parallel lane markings, with line widths that vary with distance from the camera. These changes often require processing each row of the captured image differently.

        Many of the systems described above perform well under certain driving conditions, often requiring a specific assumption to be valid. Some of these assumptions include the presence of strong lane marking contrast and the absence of artifacts on the road, such as cracks, arrows, or similar markings. Unfortunately, these assumptions do not apply to many high-traffic urban streets and highways.

3. Method

        This paper extends the hierarchical lane detection method in [1] by (1) using inverse perspective mapping, (2) applying random sample consensus to help eliminate outliers, and (3) using the Kalman filter for prediction and smoothing. In the following sections, the individual components of the lane detection system will be described.

3.1. Image enhancement

        The captured color images are converted to grayscale and temporally blurred by averaging N = 3 consecutive frames. This smoothing helps connect dashed lane markings into nearly continuous lines [1].
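As a minimal illustration, the N-frame temporal average can be sketched in Python (the paper's implementation is in Matlab; the nested-list image representation and the function name here are assumptions made for the sketch):

```python
def temporal_blur(frames, n=3):
    """Average the last n grayscale frames pixel-wise.

    frames: list of 2-D lists (rows of pixel intensities).
    Returns a 2-D list of averaged intensities, smoothing dashed
    lane markings into nearly continuous lines.
    """
    recent = frames[-n:]
    rows, cols = len(recent[0]), len(recent[0][0])
    return [[sum(f[r][c] for f in recent) / len(recent)
             for c in range(cols)]
            for r in range(rows)]
```

In a real pipeline the averaging would run over a sliding window of incoming video frames, with N chosen to bridge the gaps between dashes at typical highway speeds.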

3.2. Inverse perspective mapping

        The next step is to perform inverse perspective mapping (IPM) on the image. This transformation changes the captured image from the camera's perspective to a bird's-eye view, as shown in Figure 1 [7, 8, 9]. With this transformation, lane detection becomes a problem of detecting a pair of parallel lines that are usually separated by a given, fixed distance. Furthermore, the transformation provides a mapping between pixels in the image plane and world coordinates (feet), as shown in Figure 1b. To ensure an accurate conversion, the camera's intrinsic and extrinsic parameters are required.

Figure 1: Inverse perspective mapping converts a camera perspective image into a bird's-eye view image.
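For a flat road, the pixel-to-world mapping underlying IPM is a planar homography determined by the camera's intrinsic and extrinsic parameters. A minimal sketch of applying such a 3x3 homography H to one pixel (the matrix entries themselves are not given in the text and would come from camera calibration):

```python
def warp_point(H, u, v):
    """Map image pixel (u, v) to ground-plane coordinates via a
    3x3 homography H (nested lists), using homogeneous coordinates."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return x / w, y / w
```

Warping the whole image amounts to evaluating this mapping (or its inverse, with interpolation) at every output pixel, which is what produces the bird's-eye view in Figure 1.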

3.3. Lane candidate location detection

        Next, an adaptive threshold is applied to the IPM image to generate a binary image [1]. Each binary image is then split into two halves, each of which may contain one lane marking. A low-resolution Hough transform is computed on the binary image to find the 10 highest-scoring lines in each half of the image [1]. Samples are then taken along the length of each line, as shown by the red plus signs in Figure 2. To find the approximate center of each line, a 1-D matched filter is applied at each sample point of each line. As stated in [1], the matched filter is a Gaussian whose variance is a function of the line width. Since the bird's-eye view created by IPM produces lines of approximately constant width, a fixed-variance Gaussian kernel can be used for the matched filter. After matched filtering, the pixel with the largest correlation coefficient exceeding a predetermined threshold at each sample point is selected as the best estimate of the lane marking center, as shown by the green plus signs in Figure 3. The minimum threshold helps reject false positives caused by cracks, tar patches, or the absence of lane markings.

Figure 2: The green line indicates the high-scoring line obtained from the Hough transform, and the plus sign indicates the point at which each line will be sampled.
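The 1-D Gaussian matched filter described above can be sketched as follows; the kernel width, threshold, and function name are illustrative assumptions, not values from the paper:

```python
import math

def matched_filter_center(profile, sigma, threshold):
    """Find the index in a 1-D intensity profile (a row of pixels
    across a candidate line) that best matches a Gaussian of standard
    deviation sigma, approximating the lane-marking width.
    Returns the index of the peak response, or None if no response
    exceeds the threshold (e.g. a crack or missing marking)."""
    half = int(3 * sigma)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-half, half + 1)]
    total = sum(kernel)
    kernel = [k / total for k in kernel]          # normalize to unit sum
    best_idx, best_val = None, threshold
    for i in range(half, len(profile) - half):
        resp = sum(kernel[j + half] * profile[i + j]
                   for j in range(-half, half + 1))
        if resp > best_val:
            best_idx, best_val = i, resp
    return best_idx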

3.4. Elimination of outliers and data modeling

        Once the center of each candidate line at each sample point has been estimated, Random Sample Consensus (RANSAC) is applied to the data points. The general RANSAC algorithm robustly fits a model to the most probable set of inliers while rejecting outliers [10, 11]. Linear least-squares estimation (LSE) is then used to fit a straight line through the inliers. Figure 3 shows the parameterization of the fitted line using ρ and θ, where ρ is the distance from the origin (the upper-left corner pixel) to the line and θ is the angle shown in Figure 3 (generally close to 90°).

Figure 3: Line fitted through a set of candidate points, parameterized by ρ and θ
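A compact sketch of the RANSAC-then-least-squares step is given below. For simplicity it uses a slope/intercept parameterization rather than the paper's ρ–θ form, and the tolerance, iteration count, and seed are assumptions:

```python
import math
import random

def ransac_line(points, n_iter=100, tol=1.0, seed=0):
    """Robustly fit a line to 2-D points: repeatedly sample a pair of
    points, keep the candidate line with the most inliers (points
    within perpendicular distance tol), then refit by least squares
    on that inlier set.  Returns (slope, intercept, inliers).
    Assumes at least one sampled pair is non-vertical and that the
    final inlier set is not degenerate."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue                      # skip vertical candidates
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        norm = math.hypot(a, 1.0)
        inliers = [(x, y) for x, y in points
                   if abs(a * x - y + b) / norm <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # least-squares refit over the consensus (inlier) set
    n = len(best_inliers)
    sx = sum(x for x, _ in best_inliers)
    sy = sum(y for _, y in best_inliers)
    sxx = sum(x * x for x, _ in best_inliers)
    sxy = sum(x * y for x, y in best_inliers)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b, best_inliers
```

The refit step is what the section calls LSE: once the outliers (road cracks, tar patches) are excluded, an ordinary least-squares line through the surviving candidate centers is reliable.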

3.5. Tracking

        The parameters of each line are predicted using a Kalman filter. The state vector x(n) and the observation vector y(n) are defined as

    x(n) = [ρ(n), θ(n), ρ̇(n), θ̇(n)]ᵀ,   y(n) = [ρ(n), θ(n), ρ̇(n), θ̇(n)]ᵀ   (1)

        where ρ and θ define the orientation of the line, and ρ̇ and θ̇ are the derivatives of ρ and θ, estimated from the difference in ρ and θ between the current frame and the previous frame. The state transition matrix A is

    A = [ 1 0 1 0
          0 1 0 1
          0 0 1 0
          0 0 0 1 ]   (2)

        The matrix C in the measurement equation is the identity matrix. The noise in the state and measurement equations is assumed to be white, and each process is assumed to be uncorrelated with the other processes. Therefore, the covariance matrices of these vector stochastic processes are constant and diagonal. The variance of each noise process is estimated offline using the frames where accurate lane estimates are being produced. In the case where lane markings are not detected, the matrix C is set to zero, forcing the Kalman filter to rely purely on prediction. Finally, the estimated lines are mapped back to the camera's view to depict the lane detection results.
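Because C is the identity and the noise covariances are diagonal, each line parameter and its derivative can equivalently be tracked by an independent two-state constant-velocity filter. A sketch in Python (the class name and the noise variances q and r are placeholders; the paper estimates the variances offline from frames with accurate lane estimates):

```python
class Kalman1D:
    """Constant-velocity Kalman filter for one line parameter
    (rho or theta); state [value, rate], scalar measurement of the
    value.  q, r: assumed process/measurement noise variances."""

    def __init__(self, value, q=1e-3, r=1.0):
        self.x = [value, 0.0]
        self.P = [[1.0, 0.0], [0.0, 1.0]]
        self.q, self.r = q, r

    def predict(self):
        p, v = self.x
        self.x = [p + v, v]               # x = A x, A = [[1,1],[0,1]]
        P = self.P                        # P = A P A^T + Q
        self.P = [[P[0][0] + P[0][1] + P[1][0] + P[1][1] + self.q,
                   P[0][1] + P[1][1]],
                  [P[1][0] + P[1][1],
                   P[1][1] + self.q]]
        return self.x[0]

    def update(self, z):
        # z is None when no lane marking was detected: skip the
        # correction, mirroring the paper's trick of setting C = 0
        # so the filter relies purely on prediction.
        if z is None:
            return self.x[0]
        s = self.P[0][0] + self.r         # innovation variance
        k0 = self.P[0][0] / s             # Kalman gain
        k1 = self.P[1][0] / s
        innov = z - self.x[0]
        self.x[0] += k0 * innov
        self.x[1] += k1 * innov
        p00, p01 = self.P[0]              # P = (I - K H) P
        self.P[0] = [(1 - k0) * p00, (1 - k0) * p01]
        self.P[1] = [self.P[1][0] - k1 * p00, self.P[1][1] - k1 * p01]
        return self.x[0]
```

Running one filter for ρ and one for θ per lane marking reproduces the behavior described above: noisy per-frame measurements are smoothed, and short dropouts are bridged by prediction alone.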

4. Experimental analysis

4.1. Hardware

        The hardware used to test and evaluate the new lane detection system is built around an Intel-based computer. A forward-facing FireWire color camera is mounted below the rearview mirror so that it has a clear view of the road ahead. Video is captured at 30 frames per second at VGA resolution.

4.2. Results

        The lane detection algorithm was implemented in Matlab and takes approximately 0.8 seconds per frame. Tables 1 and 2 illustrate the performance of the current and previous lane detection systems when applied to over 10 hours of captured video. The results in Table 1 show improved accuracy over the system described in [1] when tested on a similar dataset. The unavailability of other lane detection algorithms and ready-to-use software systems makes direct comparison extremely difficult. Furthermore, establishing ground truth for the data is extremely tedious and is therefore generally avoided; detection is thus assessed qualitatively, by visual inspection by a single user. The following rules were used to classify the results: 1) a correct detection occurs when the estimated lane markings cover more than 50% of the lane markings visible in the scene, 2) an incorrect detection occurs when the estimated lane markings cover markings other than the lane, and 3) a missed detection occurs when no estimate is produced even though the relevant lane markings are visible. The detection rates for the left and right markings are averaged to obtain the numbers in the tables. Results are expressed as detection rates per minute; this metric normalizes the results when data is captured by cameras with different frame rates. Figure 4 shows some examples of correct lane detection.
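The per-minute normalization can be made concrete with a small helper (the function name and arguments are illustrative; the paper does not give an explicit formula):

```python
def detections_per_minute(correct_left, correct_right, total_frames, fps):
    """Average the left/right detection counts and normalize by the
    clip duration in minutes, so results from cameras with different
    frame rates remain comparable."""
    minutes = total_frames / (fps * 60.0)
    return ((correct_left + correct_right) / 2.0) / minutes
```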

Table 1 Accuracy of current lane detection systems

Table 2 Accuracy of previous lane detection system [1]

Figure 4: Example of accurate lane line detection

        The Kalman filter recursively estimates the dynamics of the state vector despite noisy measurements. Figure 5 shows a comparison of the observed and predicted values of ρ.

        Figure 5: Comparison between the observed and predicted values of ρ over multiple frames. The inset plot shows how the Kalman filter smooths the noisy measurements.

Figure 6: Example of incorrect lane line detection

        Figure 6 shows some examples of incorrect lane detection. Fortunately, in Figure 6a, the Kalman filter is able to recover within a few frames of passing over the artifact on the road. In Figure 6b, however, the absence of lane markings due to road aging and wear results in false features, such as cracks, being detected and tracked.

5. Conclusion

        The work presented in this paper is a significant improvement over the hierarchical lane detection system proposed in [1]. The addition of (1) inverse perspective mapping (IPM), (2) Random Sample Consensus (RANSAC), and (3) Kalman filtering adds novelty and scalability to the previous system. IPM helps simplify the search for candidate lane markings, while RANSAC helps exclude outliers from the estimation. Finally, the Kalman filter smooths out small perturbations and maintains a consistent track on the lane markings.

        The dataset used to test the accuracy of the proposed system was recorded on highways and streets in and around Atlanta, Georgia. Despite the variations in traffic conditions and road quality encountered, the proposed system performed well, as shown in Table 1.

6. Future work

        Lane departure warning (LDW) will be implemented in the future. It will leverage the lane detection system's ability to accurately determine the distance to the lane markings, as shown in Figure 4. In addition, the implemented algorithms will be ported to C# and C++ to facilitate a real-time system. Future datasets will also include ground truth information to allow precise error calculations, and visual inspections will be performed by additional users. Finally, the image enhancement stage will be made adaptive by computing N as a function of vehicle speed.


Origin blog.csdn.net/weixin_41691854/article/details/134761195