Curve Modeling: New Work on Lane Line Detection (CVPR 2022)

Author | StrongerTang

Source | https://zhuanlan.zhihu.com/p/496228909

Editor | Computer Vision Workshop

Sharing a new work on lane line detection that I read some time ago; it is among the recently announced CVPR 2022 acceptances. It was completed by Shanghai Jiao Tong University, East China Normal University, City University of Hong Kong, and SenseTime, and the code has been open-sourced.


Paper link: https://arxiv.org/abs/2203.02431

Code link: https://github.com/voldemortX/pytorch-auto-drive

Introduction


Lane detection strategies

As shown in the figure above, deep-learning-based lane line detection methods can be divided into three categories: segmentation-based schemes (green in the figure), point-detection-based schemes (blue in the figure), and polynomial-curve-based schemes (yellow in the figure).

Among them, the segmentation-based and point-detection-based schemes generally perform better, but their representations are local and indirect, while the abstract coefficients (a, b, c, d) of polynomial curves are difficult to optimize. To this end, the article proposes a scheme based on cubic Bézier curves, shown as the red curve and dashed box in the figure above, because Bézier curves are easy to compute, stable, and easy to transform. In addition, the authors design a feature flip fusion module based on deformable convolution to exploit the symmetry of lane lines.

The paper's final scheme achieves new state-of-the-art performance on the LLAMAS lane detection benchmark while maintaining high speed (>150 FPS) and a small model size (<10 M parameters), and remains competitive in accuracy on the TuSimple and CULane datasets.

Background on Bézier curves

A Bézier curve (taking the cubic case as an example) is a smooth curve determined by the coordinates of four points: a start point, an end point, and two intermediate control points. Graphics are created and edited by manipulating these four points. The key element is the control handle attached to the curve: it is virtual, does not lie on the curve itself, and governs the curve's shape through its endpoints. Dragging the ends of a control handle changes the curvature (degree of bending) of the curve; moving the handle as a whole shifts the curve evenly while the start and end points stay locked.
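As a concrete reference, a cubic Bézier curve can be evaluated in closed form from its four control points. The sketch below is plain NumPy, and the control-point values are made up for illustration:

```python
import numpy as np

def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameters t in [0, 1].

    p0 and p3 are the start/end points; p1 and p2 are the two
    intermediate control points."""
    t = np.asarray(t)[:, None]                # (T, 1) for broadcasting
    return ((1 - t) ** 3 * p0
            + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2
            + t ** 3 * p3)

# Four control points of one curve (x, y); the values are illustrative
p0, p1, p2, p3 = (np.array(p) for p in
                  ([0.0, 0.0], [0.2, 0.5], [0.6, 0.8], [1.0, 1.0]))
pts = cubic_bezier(p0, p1, p2, p3, np.linspace(0, 1, 50))  # (50, 2) samples
```

Note that the curve interpolates its endpoints: at t = 0 it equals p0 and at t = 1 it equals p3, while p1 and p2 only pull the curve toward them.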


A Bézier curve of any order can be expressed by the following formula:

B(t) = \sum_{i=0}^{n} \binom{n}{i} (1 - t)^{n-i} t^{i} P_i, \quad t \in [0, 1]

where P_0, …, P_n are the n + 1 control points of an order-n curve.

The article also runs comparative experiments between Bézier curves and polynomial curves, shown in the table below. The metrics are fitting errors on the TuSimple test set; lower is better.

[Table: fitting errors of Bézier vs. polynomial curves on the TuSimple test set]

Based on these experiments, the article chooses the classic cubic Bézier curve (n = 3): the experiments show that cubic curves are sufficient for lane line modeling and fit better than cubic polynomial curves, which are the basic representation in many previous schemes (as stated in the paper). In fact, from work I have participated in and exchanges with peers, most schemes currently in actual mass production also use cubic polynomial curves. The article further points out that higher-order curves do not bring corresponding performance gains and can instead cause instability due to their higher degrees of freedom.
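The fitting experiment above can be mimicked in miniature: points sampled from a ground-truth curve are refit by least squares on the Bernstein basis. The sketch below assumes a uniform parameterisation for simplicity; the paper's exact fitting procedure may differ:

```python
import numpy as np
from math import comb

def bernstein_matrix(t, n):
    """Design matrix of Bernstein basis values B_{i,n}(t), shape (len(t), n + 1)."""
    t = np.asarray(t)[:, None]
    i = np.arange(n + 1)
    binom = np.array([comb(n, k) for k in range(n + 1)])
    return binom * (1 - t) ** (n - i) * t ** i

def fit_bezier(points, n=3):
    """Least-squares fit of an order-n Bezier curve to sampled points,
    assuming the samples are uniformly spaced in the curve parameter."""
    t = np.linspace(0.0, 1.0, len(points))
    ctrl, *_ = np.linalg.lstsq(bernstein_matrix(t, n), points, rcond=None)
    return ctrl                                 # (n + 1, 2) control points

# Sample a noiseless toy "lane" from a known cubic and refit it
true_ctrl = np.array([[0.0, 0.0], [0.3, 0.6], [0.7, 0.9], [1.0, 1.0]])
samples = bernstein_matrix(np.linspace(0, 1, 100), 3) @ true_ctrl
fitted = fit_bezier(samples, n=3)               # recovers true_ctrl
```

Raising n adds columns to the design matrix, which is exactly where the extra degrees of freedom (and the instability the article mentions) come from.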

The Proposed Architecture

[Figure: overall architecture of the proposed model]

For an input RGB image, the feature map extracted by the backbone is fed into the feature flip fusion module to obtain a feature map of size C×H/16×W/16. Average pooling over the height dimension then yields a C×W/16 feature map, which finally passes through a classification branch and a regression branch to produce the Bézier curve predictions.
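The shape bookkeeping of this pipeline can be sketched as follows; the channel count, input resolution, and the linear stand-ins for the prediction heads are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes; the real channel count and heads follow the paper's code
C, H, W = 256, 320, 640
feat = rng.random((C, H // 16, W // 16))        # flip-fusion output: C x H/16 x W/16

pooled = feat.mean(axis=1)                       # average pool over height -> C x W/16
n_prop = pooled.shape[1]                         # one curve proposal per column

# Toy linear heads; random weights stand in for the learned prediction layers
w_cls = rng.random((1, C))
w_reg = rng.random((8, C))                       # 4 control points x (x, y) per proposal

logits = w_cls @ pooled                          # (1, W/16) existence scores
curves = (w_reg @ pooled).T.reshape(n_prop, 4, 2)  # per-proposal control points
```

Pooling away the height dimension is what turns the 2D feature map into one curve proposal per column, so each column only has to predict existence plus four control points.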

Feature Flip Fusion

[Figure: feature flip fusion module]

The feature flip fusion module is one of the main contributions of this paper.

By modeling lane lines as holistic curves, the paper focuses on the geometric characteristics of each lane line, such as being thin, long, and continuous. Considering the global structure of lane lines from the perspective of a forward-facing camera, the lane lines on a road are roughly symmetric about its center; for example, the presence of a lane line on the left often implies a corresponding lane line on the right. The article models this symmetry and designs the feature flip fusion module for this purpose.
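A minimal sketch of the flip-fusion idea is to fuse a feature map with its horizontal mirror. The paper additionally aligns the flipped map with a deformable convolution before fusing; plain addition is used here only to keep the sketch dependency-free:

```python
import numpy as np

def flip_fusion(feat):
    """Toy feature flip fusion: add the horizontally flipped feature
    map back onto the original. (The paper aligns the flipped map
    with a deformable convolution before fusing; this sketch skips
    that alignment step.)"""
    return feat + feat[..., ::-1]                # flip along the width axis

feat = np.arange(12, dtype=float).reshape(1, 3, 4)   # C x H x W toy map
fused = flip_fusion(feat)                        # left-right symmetric by construction
```

With plain addition the fused map is exactly left-right symmetric; the deformable alignment in the paper relaxes this so the symmetry prior helps without being strictly enforced.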

An auxiliary binary segmentation branch

The article also adds an auxiliary binary segmentation branch on the ResNet backbone to enhance the learning of spatial details. Experiments show that this extra branch helps only when used together with the feature flip fusion module: the localization ability of the segmentation task provides spatially more accurate feature maps, which in turn support more accurate fusion between the flipped feature maps.

This additional binary segmentation branch is only used during training and turned off during inference.

[Figure: Grad-CAM visualization]

The article illustrates the impact of this design through the Grad-CAM visualization shown above, and details can be found in the original text.

Overall Loss

[Equations: classification, regression, and segmentation loss terms, and the overall loss]

Because lane detection datasets do not suffer from a severe positive/negative sample imbalance, a simple weighted cross-entropy loss is used for both the classification and segmentation terms.
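A minimal NumPy version of a class-weighted cross-entropy; the class weights below are illustrative placeholders, not the paper's values:

```python
import numpy as np

def weighted_ce(logits, labels, weights):
    """Class-weighted cross-entropy over N samples.

    logits: (N, K) raw scores; labels: (N,) class ids;
    weights: (K,) per-class weights."""
    z = logits - logits.max(axis=1, keepdims=True)        # stabilised softmax
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    w = weights[labels]
    # Weighted negative log-likelihood, normalised by the total weight
    return -(w * log_probs[np.arange(len(labels)), labels]).sum() / w.sum()

logits = np.array([[2.0, 0.5], [0.1, 1.5], [1.2, 0.3]])
labels = np.array([0, 1, 0])
loss = weighted_ce(logits, labels, np.array([0.4, 1.0]))  # down-weight class 0
```

The per-class weight simply rescales each sample's negative log-likelihood, which is all "weighted cross-entropy" means here; no focal-style modulation is needed when the classes are roughly balanced.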

Experiments

[Table: results on the CULane and TuSimple test sets]

Results on the test sets of CULane and TuSimple. * Reproduced results in the authors' code framework, best of three random runs. ** Reported from reliable open-source code by the original authors.

[Additional result tables]

Ablation studies

[Ablation study tables]

Visual example:

[Qualitative result images from the paper]

The article shows good results, but problems remain in scenes such as ramps and sharp turns. Interested readers can run the code and see for themselves. (The following four pictures were produced by running the article's open-source code and weights.)

[Four images: predictions from the released code and weights]

Finally, since I have not read much lane line detection work and my writing ability is limited, I welcome criticism and corrections for any mistakes. Friends interested in lane line detection, autonomous driving, computer vision, and related topics are also welcome to join the autonomous driving exchange group to learn together!



Origin blog.csdn.net/qq_29462849/article/details/124441297