Lightweight Lane Line Detection

1. Prerequisite knowledge

Instance segmentation is usually implemented in one of two ways:

[Figure: top-down and bottom-up instance segmentation]

1. Top-down: first detect a bounding box for each object, then classify the pixels inside each box. The disadvantage is that some boxes do not fully cover their target, which hurts accuracy;

2. Bottom-up: first classify the pixels (semantic segmentation), then assign the pixels to individual instances by clustering;

Summary: Lane line detection mostly uses bottom-up methods, because lane scenes are complex and not well suited to box detection;

2. Clustering algorithm

After the pixels are classified in a bottom-up manner, clustering is required to assign each pixel to its target instance;

Metric learning is also called distance metric learning or similarity learning;

Note: This concept appears in segmentation, key point detection, and face recognition tasks;

In the lane line task, bottom-up clustering is applied to pixel embeddings that are trained with a metric-learning loss;

The loss function comes from this paper: https://arxiv.org/pdf/1708.02551.pdf

[Figure: clustering result obtained with this loss function]

The figure above shows the clustering effect of petals by this clustering method;
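As a rough illustration of what such a loss computes, the sketch below implements the three terms described in the paper with NumPy: a term that pulls each embedding toward its cluster mean, a term that pushes cluster means apart, and a small regularizer. The function name, the margin values `delta_v` and `delta_d`, the default weights, and the toy data are assumptions for illustration; real training would compute this on network outputs inside a deep learning framework.

```python
import numpy as np

def discriminative_loss(embeddings, instance_ids,
                        delta_v=0.5, delta_d=1.5,
                        alpha=1.0, beta=1.0, gamma=0.001):
    """Discriminative loss for (n, d) pixel embeddings and (n,) instance ids."""
    ids = np.unique(instance_ids)
    means = np.array([embeddings[instance_ids == c].mean(axis=0) for c in ids])

    # Variance term: pull each embedding to within delta_v of its cluster mean.
    l_var = 0.0
    for mean, c in zip(means, ids):
        dist = np.linalg.norm(embeddings[instance_ids == c] - mean, axis=1)
        l_var += np.mean(np.maximum(dist - delta_v, 0.0) ** 2)
    l_var /= len(ids)

    # Distance term: push different cluster means at least 2 * delta_d apart.
    l_dist = 0.0
    if len(ids) > 1:
        gaps = np.linalg.norm(means[:, None, :] - means[None, :, :], axis=2)
        hinge = np.maximum(2 * delta_d - gaps, 0.0) ** 2
        off_diagonal = ~np.eye(len(ids), dtype=bool)
        l_dist = hinge[off_diagonal].mean()

    # Regularization term: keep cluster means close to the origin.
    l_reg = np.mean(np.linalg.norm(means, axis=1))

    return alpha * l_var + beta * l_dist + gamma * l_reg

# Hypothetical toy embeddings: two tight groups that are far apart.
toy = np.array([[0.0, 0.0], [0.0, 0.1], [10.0, 0.0], [10.0, 0.1]])
loss_good = discriminative_loss(toy, np.array([0, 0, 1, 1]))  # ids match the groups
loss_bad = discriminative_loss(toy, np.array([0, 1, 0, 1]))   # ids mix the groups
```

With instance ids that match the two groups the loss is close to zero, while mixed ids give a much larger value, which is the behaviour that makes the embeddings easy to cluster afterwards.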

Several clustering methods are shown below:

[Figure: comparison of several clustering methods]

K-means

Implementation steps:

[Figure: K-means implementation steps]

The following figure shows how the K-means clusters change during the clustering process:

[Figure: K-means clustering process]
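The usual K-means steps can be sketched in a few lines of NumPy. This is a minimal illustration, not the scikit-learn implementation; the function name `kmeans`, the random initialization, and the toy data are assumptions.

```python
import numpy as np

def kmeans(points, k, n_iters=100, seed=0):
    """Plain K-means (Lloyd's algorithm) on an (n, d) array."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct samples as the initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 2: assign every point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 4: stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Two well-separated toy blobs (hypothetical data for illustration).
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])
labels, centroids = kmeans(data, k=2)
```

On data like this, every point of a blob ends up with the same label; on elongated clusters the nearest-centroid rule in step 2 is what causes the poor results.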

Advantages:

The principle is simple, it is easy to implement, and it clusters conventional data sets well;

Disadvantages:

The value of K is hard to choose, and the result is sensitive to the initial cluster centers;

The result is only a local optimum;

It separates blob-shaped clusters well, but is not effective on strip-shaped data;

K-means++ improvement

An improved strategy for initializing the centroids:

[Figure: K-means++ centroid initialization]
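A minimal sketch of the K-means++ seeding rule, assuming the standard formulation in which each new centroid is sampled with probability proportional to its squared distance from the nearest centroid already chosen. The function name and the toy data are hypothetical.

```python
import numpy as np

def kmeans_pp_init(points, k, seed=0):
    """Choose k initial centroids with the K-means++ seeding rule."""
    rng = np.random.default_rng(seed)
    # The first centroid is a uniformly random sample.
    centroids = [points[rng.integers(len(points))]]
    for _ in range(k - 1):
        # Squared distance from every point to its nearest chosen centroid.
        diffs = points[:, None, :] - np.array(centroids)[None, :, :]
        d2 = (diffs ** 2).sum(axis=2).min(axis=1)
        # Sample the next centroid with probability proportional to d2,
        # so far-away points are more likely to be picked.
        next_idx = rng.choice(len(points), p=d2 / d2.sum())
        centroids.append(points[next_idx])
    return np.array(centroids)

# Hypothetical toy data: 60 random 2D points.
data = np.random.default_rng(2).normal(0, 1, (60, 2))
init_centroids = kmeans_pp_init(data, k=3)
```

Points that are already chosen have a squared distance of zero, so they cannot be picked twice; the initial centroids therefore tend to be spread out before the regular K-means iterations start.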

DBSCAN

Description: a density-based clustering algorithm;

[Figure: DBSCAN core, border, and outlier points]

A: core point;

B, C: border points;

N: outlier (noise) point;

Process: Starting from a core point such as A, keep drawing circles of a fixed radius and spreading outwards, adding every point reached to the same cluster as the core point;
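The spreading process can be sketched as follows. This is a small illustrative version that uses a full pairwise distance matrix, so it only suits toy data; the function name and the sample points are assumptions, and a point counts as its own neighbor.

```python
import numpy as np

def dbscan(points, eps, min_samples):
    """Minimal DBSCAN: returns one label per point, -1 marks noise."""
    n = len(points)
    # Pairwise distances; fine for small toy data, O(n^2) memory.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    neighbors = [np.flatnonzero(dists[i] <= eps) for i in range(n)]
    is_core = np.array([len(nb) >= min_samples for nb in neighbors])
    labels = np.full(n, -1)
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue
        # Grow a new cluster outward from core point i.
        labels[i] = cluster
        frontier = [i]
        while frontier:
            p = frontier.pop()
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    # Only core points keep spreading; border points stop here.
                    if is_core[q]:
                        frontier.append(q)
        cluster += 1
    return labels

# Hypothetical toy data: two tight groups and one isolated point.
data = np.array([[0, 0], [0, 0.1], [0.1, 0],
                 [5, 5], [5, 5.1], [5.1, 5],
                 [10, 0]], dtype=float)
labels = dbscan(data, eps=0.5, min_samples=3)
```

Here the two groups become clusters 0 and 1 and the isolated point keeps the label -1, which is how DBSCAN reports outliers.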

Advantages:

1. No need to specify the number of clusters;

2. Clusters of any shape can be found;

3. Good at finding outliers;

4. The result does not depend on initial cluster centers, whereas in algorithms such as K-means the initial values strongly influence the result;

Disadvantages:

1. High-dimensional data is difficult to handle (dimensionality reduction can be applied first);

2. When the density is uneven and the distances between clusters differ greatly, the clustering quality is poor;

3. The parameters are hard to choose (different parameters change the result considerably);

4. It runs slowly;

Mean Shift

[Figure: Mean Shift iteration]

Explanation: As the figure shows, starting from an initial centroid, compute the mean of all points in the selected area to get the next centroid, and repeat this process until the centroid stops moving;

Note: Mean Shift also uses kernel functions, so some understanding of them is required;
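A minimal sketch of one Mean Shift trajectory, assuming the simplest possible kernel (a flat kernel, where every point inside the radius counts equally). The function name, the bandwidth, and the toy data are hypothetical; a full implementation would start from many seeds and merge the modes they converge to.

```python
import numpy as np

def mean_shift_mode(points, start, bandwidth, n_iters=100, tol=1e-6):
    """Shift one centroid to a density peak using a flat (uniform) kernel."""
    center = np.asarray(start, dtype=float)
    for _ in range(n_iters):
        # Points inside the circle of radius `bandwidth` around the center.
        in_window = np.linalg.norm(points - center, axis=1) <= bandwidth
        if not in_window.any():
            break
        # The mean of those points is the next center (the "mean shift" step).
        new_center = points[in_window].mean(axis=0)
        if np.linalg.norm(new_center - center) < tol:
            break
        center = new_center
    return center

# Hypothetical toy data: one blob centered at (3, 3).
data = np.random.default_rng(0).normal(3.0, 0.5, (200, 2))
mode = mean_shift_mode(data, start=[2.0, 2.0], bandwidth=1.5)
```

Starting from (2, 2), the center drifts toward the densest part of the blob and stops near (3, 3).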

3. Introduction to LaneNet model structure

Compared to FCN for lane line detection, LaneNet

First look at an overall implementation diagram:

[Figure: LaneNet overall architecture]

Process:

First, the encoder learns shared features, which are then passed to two branches;

The upper branch is the embedding branch, which produces a per-pixel embedding that is used for clustering;

The lower branch is the segmentation branch, which upsamples to produce the lane line segmentation;

Finally, the two branches are fused to obtain the lane line instance segmentation;
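A minimal sketch of the fusion step, assuming the segmentation branch yields a binary mask and the embedding branch yields one vector per pixel. DBSCAN from scikit-learn is used here only as a stand-in clustering step; the function name, the thresholds, and the toy arrays are hypothetical and do not come from the LaneNet code.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def fuse_branches(binary_mask, embeddings, eps=0.5, min_samples=5):
    """Combine a (H, W) lane mask with (H, W, D) embeddings into instance ids.

    Returns an (H, W) int array: 0 = background, 1..K = lane instances.
    """
    instance_map = np.zeros(binary_mask.shape, dtype=int)
    lane_pixels = binary_mask > 0
    if not lane_pixels.any():
        return instance_map
    # Cluster only the embeddings of pixels the segmentation branch kept.
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(embeddings[lane_pixels])
    # DBSCAN noise (-1) becomes 0, i.e. it is treated as background.
    instance_map[lane_pixels] = labels + 1
    return instance_map

# Hypothetical toy outputs: two vertical lanes in an 8 x 6 image.
H, W = 8, 6
mask = np.zeros((H, W), dtype=int)
mask[:, 1] = 1
mask[:, 4] = 1
emb = np.zeros((H, W, 2))
emb[:, 4] = [3.0, 3.0]  # the second lane has a clearly different embedding
instance_map = fuse_branches(mask, emb)
```

The mask decides which pixels are lane pixels at all, and the embeddings decide which lane each of those pixels belongs to.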

Multi-task learning

Using separate branches, each with its own loss function for backpropagation, is a multi-task learning strategy;

Here is a common example of multi-task learning:

[Figure: multi-task learning of face attributes]

Explanation: The figure above shows multi-task learning of face attributes. The weights of the shallow feature maps are shared, and the losses of the different tasks are combined as a weighted sum before backpropagation updates the parameters;

If you have worked with Faster R-CNN, you will know that it also ends in two branches, one for bounding box regression and one for classification;

Multi-task learning also has network variants that change how the convolutions are shared:

[Figure: variant of shared convolution]

The convolutions of the two branches can also be cross-shared:

[Figure: cross-shared convolution between branches]

This raises a question: how should the loss weights of the different branches be set?

One idea: output the loss of each branch separately and balance them through the weights (a normalization step). For example, if the loss of one branch is about 100 and that of the other is about 1, the weights should be in the ratio 1:100 so that both contribute equally; after that, adjust the ratio according to the importance of each task;
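The balancing idea can be written as a small helper. This is only a sketch of inverse-magnitude weighting; the function name `balanced_weights` and the `importance` argument are hypothetical, and the loss magnitudes would in practice be read from the training logs.

```python
def balanced_weights(loss_values, importance=None):
    """Weights that bring every branch loss to the same scale.

    loss_values: typical magnitude of each branch's loss.
    importance:  optional per-task multipliers applied after balancing.
    """
    if importance is None:
        importance = [1.0] * len(loss_values)
    # Inverse-magnitude weighting: a loss 100x larger gets a weight 100x smaller.
    raw = [imp / value for value, imp in zip(loss_values, importance)]
    # Normalize so the weights sum to 1.
    total = sum(raw)
    return [w / total for w in raw]

# Losses of 100 and 1 give weights in the ratio 1:100.
weights = balanced_weights([100.0, 1.0])
weighted = [w * l for w, l in zip(weights, [100.0, 1.0])]
```

After weighting, both branches contribute the same amount to the total loss; passing something like `importance=[2.0, 1.0]` then shifts the balance toward the task that matters more.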

Model Improvements

To make the pipeline end-to-end, an H-Net is added to regress the lane lines;

[Figure: H-Net]

The H matrix can be understood as a transformation matrix. In practice, adding the H matrix does not necessarily improve the result;
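To make "transformation matrix" concrete, the sketch below applies a 3x3 matrix to 2D points in homogeneous coordinates. The matrix used here is a hand-written example (scale plus translation), not an H-Net output, and the function name is an assumption.

```python
import numpy as np

def apply_homography(H, points):
    """Apply a 3x3 transformation matrix H to an (n, 2) array of (x, y) points."""
    ones = np.ones((len(points), 1))
    homogeneous = np.hstack([points, ones])  # (x, y) -> (x, y, 1)
    mapped = homogeneous @ H.T
    # Divide by the third coordinate to return to 2D.
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical matrix: scale by 2, then shift by (1, 3).
H = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 3.0],
              [0.0, 0.0, 1.0]])
lane_points = np.array([[0.0, 0.0], [1.0, 1.0]])
transformed = apply_homography(H, lane_points)
```

In a lane pipeline, the lane pixels would be mapped with such a matrix into a space where the lane is easier to fit with a curve, and the fitted points would then be mapped back with the inverse matrix.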

4. Clustering code implementation

1. Code implementation of the K-means algorithm

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Create the sample points
coordinate, type_index = make_blobs(
    # 1000 samples
    n_samples=1000,
    # 2 features per sample, representing x and y
    n_features=2,
    # 4 centers (the number of clusters)
    centers=4,
    # Random seed
    random_state=2
)

fig0, axi0 = plt.subplots(1)
# Pass in the x and y coordinates; marker='o' draws a circle and s=8 sets its size
axi0.scatter(coordinate[:, 0], coordinate[:, 1], marker='o', s=8)
# Show all the points
plt.show()

[Figure: scatter plot of all sample points]

color = np.array(['red', 'yellow', 'blue', 'black'])
fig1, axi1 = plt.subplots(1)
# Show the true class of each point
for i in range(4):
    axi1.scatter(
        coordinate[type_index == i, 0],
        coordinate[type_index == i, 1],
        marker='o',
        s=8,
        c=color[i]
    )
plt.show()

[Figure: sample points colored by true class]

# Cluster with K-means and plot the result
y_pred = KMeans(n_clusters=4, random_state=9).fit_predict(coordinate)
plt.scatter(coordinate[:, 0], coordinate[:, 1], c=color[y_pred], s=8)
plt.show()

[Figure: K-means clustering result]

Conclusion: K-means clusters this data well, apart from a few interfering points; the good result may simply be because the data is so simple;

2. K-means++

K-means++ is selected through the init parameter when calling the function;

# Cluster with K-means++ and plot the result
y_pred = KMeans(n_clusters=4, random_state=9, init='k-means++').fit_predict(coordinate)
plt.scatter(coordinate[:, 0], coordinate[:, 1], c=color[y_pred], s=8)
plt.show()

Conclusion: The result is no different from the K-means run above, because init='k-means++' is already the default in scikit-learn's KMeans; pass init='random' to compare against plain random initialization;

3. DBSCAN algorithm

from sklearn.cluster import DBSCAN

# eps: neighborhood radius; min_samples: number of nearby samples needed to form a core point
y_pred = DBSCAN(eps=1.8, min_samples=3).fit_predict(coordinate)
plt.scatter(coordinate[:, 0], coordinate[:, 1], c=y_pred, s=8)
plt.show()

For this algorithm, a different distribution is used to construct the sample points. The following figure shows the original sample points:

[Figure: original sample points]

The result after running DBSCAN:

[Figure: DBSCAN clustering result]

Conclusion: DBSCAN decides the category from the density of the surrounding points, so clusters that are close together are merged into one category; this can be addressed by tuning the parameters;

4. Mean Shift algorithm

from sklearn.cluster import MeanShift, estimate_bandwidth

# Estimate the bandwidth (radius)
bandwidth = estimate_bandwidth(coordinate, quantile=0.2, n_samples=500)
# bin_seeding: seed the initial positions from points binned onto a coarse grid instead of from every point
meanShift = MeanShift(bandwidth=bandwidth, bin_seeding=True)
meanShift.fit(coordinate)
labels = meanShift.labels_
plt.scatter(coordinate[:, 0], coordinate[:, 1], c=labels, s=8)
plt.show()

Clustering result:

[Figure: Mean Shift clustering result]

Conclusion: Because clustering is based on circle centers, clusters that are too close together are merged into the same category;

5. Use the detection network to identify lane lines

Lane line detection with a semantic segmentation network is still too slow to reach the level needed for deployment;

Recasting the semantic segmentation task as a detection task improves the speed of the model, and the accuracy remains good;

Paper: https://arxiv.org/pdf/2004.11757.pdf

The following figure shows the specific method of the detection task:

[Figure: grid-based lane detection]

Explanation: As the figure shows, the input image is divided into a grid; for each grid cell the model judges whether a lane line is present, which also gives the location of the lane line;
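The grid idea can be sketched as follows: for every row and every lane, the network scores each grid cell plus one extra "no lane" class, and the best-scoring cell gives the x position. The tensor layout, the function name, and the toy values are assumptions for illustration and are not taken from the paper's code.

```python
import numpy as np

def decode_grid_output(logits, image_width):
    """Turn row-wise grid logits into lane x-coordinates.

    logits: (num_cells + 1, num_rows, num_lanes); the extra last class
            means "no lane in this row".
    Returns (num_rows, num_lanes) x positions in pixels, NaN where absent.
    """
    num_cells = logits.shape[0] - 1
    best = logits.argmax(axis=0)  # chosen class per row and lane
    # Center of each grid cell along the image width.
    cell_width = image_width / num_cells
    x = (best + 0.5) * cell_width
    # Rows whose best class is the extra one contain no lane.
    return np.where(best == num_cells, np.nan, x)

# Hypothetical output: 4 grid cells, 2 rows, 1 lane.
logits = np.zeros((5, 2, 1))
logits[1, 0, 0] = 5.0  # row 0: lane in cell 1
logits[4, 1, 0] = 5.0  # row 1: "no lane" class wins
xs = decode_grid_output(logits, image_width=800)
```

Because each row needs only one classification over a handful of cells, far fewer outputs are computed than when every pixel is classified, which is where the speed gain over segmentation comes from.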

The data augmentation tricks are as follows:

1. Crop: the upper part of a lane scene contains no lane lines, so based on the annotations the upper part of the image can be cut off to reduce computation;

2. Rotation: a standard data augmentation;

3. Horizontal and vertical transformation: operations such as straightening the lane line;

4. Extend the lane line;

[Figure: data augmentation example]

Network structure diagram:

[Figure: network structure]

Note: The upper branch is an auxiliary segmentation branch that is used only in the training phase to help distinguish lane lines; inference only runs the lower branch. The network uses a ResNet backbone followed by fully connected layers that transform the feature dimensions and finally output the grid position of each lane line point;

Loss function:

$$L_{\text{total}} = L_{cls} + \alpha L_{str} + \beta L_{seg}$$

Description: classification loss + structural loss + segmentation loss, where α and β are weighting coefficients
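A small sketch of how the three terms combine. The paper calls L_str the structural loss, and one of its parts penalizes differences between the predictions of adjacent rows; only that similarity part is shown here. The function names, the array layout, and the example values of α and β are assumptions for illustration.

```python
import numpy as np

def similarity_loss(probs):
    """L1 distance between the predictions of adjacent rows.

    probs: (num_lanes, num_rows, num_cells + 1) per-row class distributions.
    """
    return np.abs(probs[:, :-1, :] - probs[:, 1:, :]).sum()

def total_loss(l_cls, l_str, l_seg, alpha=1.0, beta=1.0):
    """Weighted sum of the three terms; alpha and beta are placeholder values."""
    return l_cls + alpha * l_str + beta * l_seg

# Identical predictions in every row give zero similarity loss.
smooth = np.full((2, 3, 5), 0.2)
# Changing one row makes neighbouring rows disagree.
jagged = smooth.copy()
jagged[0, 1, :] = [1.0, 0.0, 0.0, 0.0, 0.0]

loss_smooth = similarity_loss(smooth)
loss_jagged = similarity_loss(jagged)
example_total = total_loss(1.0, 2.0, 3.0, alpha=0.5, beta=2.0)
```

The similarity term reflects the fact that lane lines are continuous, so neighbouring rows should predict nearby grid cells.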

Hands-on code

Source address: https://github.com/cfzd/Ultra-Fast-Lane-Detection

The pre-trained models can be downloaded from the project page:

[Figure: pre-trained model downloads]

According to the official results, 306 fps can be reached on the Tusimple data set, but my actual run on a 1080Ti reached only 250 fps;

1. Processing data

Function: The lane lines are regressed by area. Because the Tusimple data set contains at most four lane lines, it is divided into four areas, and this script labels the lane lines of the data set by area;

python scripts/convert_tusimple.py --root $TUSIMPLEROOT

2. Modify the configuration file

Set data_root in configs/tusimple.py to the path of the Tusimple data set;
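For example, the edited line might look like the following. The path is a placeholder for illustration, not a real location, and the rest of the config file is left unchanged.

```python
# configs/tusimple.py (excerpt)
# Hypothetical path: replace it with the folder that holds your Tusimple data.
data_root = '/path/to/tusimple'
```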

3. Conduct training

python train.py configs/path_to_your_config

4. Perform visual testing

python demo.py configs/tusimple.py --test_model 

A test.avi file is generated, which plays all the detected images as a video;

5. Test model inference speed

python speed.py

[Figure: speed test output]
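The general way such a frame rate is measured can be sketched as below. This is not the content of speed.py; the helper name `measure_fps` and the dummy workload are hypothetical stand-ins for a real model forward pass. When timing a model on a GPU, the device also has to be synchronized before each clock reading, otherwise the measured time is too short.

```python
import time

def measure_fps(infer, n_warmup=10, n_runs=100):
    """Average frames per second of calling `infer()` once per frame."""
    # Warm-up runs are excluded so one-time setup cost does not skew the result.
    for _ in range(n_warmup):
        infer()
    start = time.perf_counter()
    for _ in range(n_runs):
        infer()
    elapsed = time.perf_counter() - start
    return n_runs / elapsed

# Hypothetical stand-in for a model forward pass.
def dummy_infer():
    return sum(i * i for i in range(1000))

fps = measure_fps(dummy_infer)
```

Numbers measured this way depend heavily on the hardware, the input size, and the batch size, which is one reason a local run can differ from the officially reported figure.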

A sample final result:

[Figure: final detection result]

6. Summary

This lane line detection project covered a lot of knowledge points:

  • The basic architecture and design of the segmentation network;

  • Clustering algorithms;

  • How to implement multi-task learning;

  • Using detection instead of segmentation for lane line detection to improve performance;

These knowledge points can be explored further in other directions:

  • Application of clustering algorithms in metric learning (for example, by building an image retrieval project);
  • Multi-attribute classification with multi-task learning, controlling which network layers share weights (for example, multi-label classification of face attributes);

Origin blog.csdn.net/weixin_40620310/article/details/124249849