Classic Paper Reading: Online Map Vectorization for Autonomous Driving: A Rasterization Perspective (rasterization-based online map vectorization)

0. Introduction

High-definition (HD) maps are essential for vehicle localization in autonomous driving, but they typically take a long time to build. With the development of deep learning, using neural networks to vectorize maps online has become a practical alternative. Vectorized HD maps provide detailed and precise environmental information for downstream perception and planning. However, current map vectorization methods often suffer from deviations, and existing evaluation metrics are not sensitive enough to detect them.

To address these limitations, "Online Map Vectorization for Autonomous Driving: A Rasterization Perspective" proposes integrating the idea of rasterization into map vectorization. Specifically, the paper introduces a new rasterization-based evaluation metric that is more sensitive to deviations and better suited to real-world autonomous driving scenarios. In addition, it proposes MapVR (Map Vectorization via Rasterization), a novel framework that applies differentiable rasterization to the vectorized outputs and then performs precise, geometry-aware supervision on the rasterized HD maps.

The code has been open-sourced on GitHub. Open-source work of this kind is still rare, so if you are interested in this field, the paper is worth a close read.

Figure 1. (a) Map rasterization produces HD semantic maps via semantic segmentation in bird's-eye view (BEV). (b) Map vectorization directly predicts compact, instance-level vectorized map elements, which is better suited to practical autonomous driving systems. (c) MapVR uses differentiable rasterization to bridge the vectorized and rasterized HD map representations, yielding more precise vectorized HD maps for reliable autonomous driving.

1. Main contributions

VectorMapNet [30] and MapTR [22] both use a sparse point-set representation, in which each map element is parameterized as a fixed-length vector of equidistantly sampled points, and apply an L1 loss to supervise the regression. Although this approach is simple and intuitive, we empirically show that it is often suboptimal for several reasons. First, as shown in Figure 2, sparse point-set representations are often not accurate enough, especially for sharp turns or intricate map structures, resulting in significant parameterization errors. Second, using equidistant points as regression targets gives the model ambiguous supervision, since intermediate points often lack clear visual cues. Third, relying solely on the L1 loss for regression supervision causes the model to overlook fine-grained geometric variations, producing overly smooth predictions that are insensitive to local deviations. Likewise, current evaluation metrics rely on the Chamfer distance between point sets and often overlook subtle deviations and geometric details. For autonomous driving, where precision is a matter of life and death, existing map vectorization methods and evaluation metrics therefore remain insufficient.

Figure 2. Inaccurate map elements caused by the sparse equidistant point-set parameterization.
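To make the parameterization error in Figure 2 concrete, the sketch below resamples a polyline to a fixed number of equidistant points and measures how far an original vertex ends up from the resampled polyline. It is a minimal illustration under stated assumptions, not the VectorMapNet or MapTR implementation: the function names (`resample_equidistant`, `point_to_segment`, `max_vertex_error`) and the L-shaped curb are hypothetical.

```python
import math

def resample_equidistant(polyline, num_points):
    """Resample a polyline to `num_points` points equally spaced along its arc length."""
    # Cumulative arc length at each original vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    out, seg = [], 0
    for i in range(num_points):
        target = total * i / (num_points - 1)
        # Advance to the original segment that contains the target arc length.
        while seg < len(polyline) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0 else (target - cum[seg]) / seg_len
        (x0, y0), (x1, y1) = polyline[seg], polyline[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

def point_to_segment(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    denom = dx * dx + dy * dy
    t = 0.0 if denom == 0 else max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / denom))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def max_vertex_error(polyline, samples):
    """Largest distance from an original vertex to the resampled polyline."""
    return max(
        min(point_to_segment(v, a, b) for a, b in zip(samples, samples[1:]))
        for v in polyline
    )

# Hypothetical L-shaped curb (metres): 10 m straight, then a 90-degree turn for 7 m.
curb = [(0.0, 0.0), (10.0, 0.0), (10.0, 7.0)]
sparse = resample_equidistant(curb, 4)
print(sparse, max_vertex_error(curb, sparse))
```

With four points on this 17 m curb, none of the samples lands on the corner, so the resampled polyline cuts the turn by about 1.27 m. The error only vanishes when a sample happens to coincide with the corner, which equidistant sampling does not guarantee.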

To address these limitations, we reintroduce the idea of rasterization into map vectorization, recovering the strengths of HD map modeling while keeping the benefits of vectorized output. We believe that rasterization offers complementary benefits to map vectorization. The contributions of the paper are summarized as follows:

1) The paper proposes a new rasterization-based evaluation metric for map vectorization that is more sensitive to small deviations, allowing a more accurate and reasonable assessment of map vectorization performance in real-world driving scenarios;

2) The paper proposes MapVR (Map Vectorization via Rasterization), a new framework that seamlessly combines differentiable rasterization with existing map vectorization methods. MapVR significantly improves the precision of map vectorization, scales well to different map elements, and adds no computational overhead at inference time;

3) The proposed MapVR framework and evaluation metric pave the way for future research and improvement in autonomous driving applications, demonstrating the complementary strengths of rasterization and map vectorization.


2. Rasterization-based evaluation metric for map vectorization

2.1 Review of Chamfer distance-based evaluation metrics

Map vectorization requires instance-level evaluation, similar to object detection [3, 8, 25, 57–61, 65]. Current map vectorization methods [7, 18, 22, 30] therefore adopt average precision (AP) to evaluate map construction quality and use the Chamfer distance to determine whether a predicted map element matches a ground-truth one. Specifically, the Chamfer distance $D_{Chamfer}(\cdot, \cdot)$ is a dissimilarity measure between two unordered point sets that quantifies the average distance from each point in one set to its nearest point in the other set. It can be expressed as:

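$$D_{Chamfer}(P, Q) = \frac{1}{2|P|} \sum_{p \in P} \min_{q \in Q} \lVert p - q \rVert_2 + \frac{1}{2|Q|} \sum_{q \in Q} \min_{p \in P} \lVert q - p \rVert_2$$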

where $P$ and $Q$ denote the point sets of the predicted and ground-truth map elements, respectively, $|P|$ and $|Q|$ denote the cardinalities of $P$ and $Q$, and $\lVert p - q \rVert_2$ denotes the Euclidean distance between points $p$ and $q$.

Although this metric is simple and provides unbiased evaluation results, the following limitations make it insufficient for demanding scenarios such as autonomous driving: 1) It is not scale-invariant: for smaller map elements (such as stop lines), the Chamfer distance is always small, so it cannot provide a meaningful evaluation. 2) The Chamfer distance depends only on distances between unordered point sets and completely ignores the shape and geometric details of map elements, so it produces unreasonable results in many practical cases, as shown in Figure 4. These shortcomings call for more robust and accurate evaluation metrics that meet the stringent requirements of map vectorization for autonomous driving.
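The two limitations can be reproduced with a few lines of code. The sketch below assumes the common symmetric form of the Chamfer distance (mean nearest-neighbour distance in each direction, each weighted by 1/2); the helper names, the toy elements, and the 1.0 m matching threshold mentioned in the comments are hypothetical and chosen only for illustration.

```python
import math

def chamfer_distance(P, Q):
    """Symmetric Chamfer distance between two point sets.
    Assumption: each direction is a mean of nearest-neighbour distances,
    and the two directions are averaged with weight 1/2."""
    def directed(A, B):
        return sum(min(math.dist(a, b) for b in B) for a in A) / len(A)
    return 0.5 * (directed(P, Q) + directed(Q, P))

def polyline_length(points):
    """Length of the polyline obtained by connecting the points in order."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

# Limitation 1: no scale invariance.
# A 2 m stop line predicted with a 90-degree rotation error about its centre.
stop_gt = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
stop_pred = [(0.0, -1.0), (0.0, 0.0), (0.0, 1.0)]
# The distance stays below a hypothetical 1.0 m matching threshold,
# although the predicted orientation is completely wrong.
print(chamfer_distance(stop_gt, stop_pred))

# Limitation 2: point order (and hence shape) is ignored.
zigzag = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (3.0, 1.0)]
reordered = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0), (3.0, 1.0)]
# Same points, different connection order: the polylines differ in shape and
# length, yet the Chamfer distance between them is zero.
print(chamfer_distance(zigzag, reordered))
print(polyline_length(zigzag), polyline_length(reordered))
```

In the first case the distance is 2/3 m simply because the element is short; in the second it is exactly zero even though the two polylines trace different shapes.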

Figure 4. Comparison of evaluation quality between the Chamfer distance-based metric and our proposed rasterization-based metric in several practical cases. Our metric produces more reasonable evaluation results and is better suited to autonomous driving applications.

2.2 Proposed rasterization-based evaluation metric

To address the above limitations, we introduce a rasterization-based evaluation metric that is more sensitive to small deviations and better suited to real-world driving scenarios. While we still use AP as our measure, we employ rasterization to precisely determine the match between predicted and ground-truth map elements. As shown in Figure 3, we use linear map elements such as lanes and curbs to demonstrate the metric. First, both ground-truth and predicted elements are rasterized as polylines onto an HD map. In our setting, given a perception range of ±30 m along the y-axis and ±15 m along the x-axis, we set the spatial size of the HD map to 480×240 so that each pixel represents 0.125 m, meeting the high-precision requirements of autonomous driving. To better tolerate prediction inaccuracies for thin, elongated geometries, we dilate the rasterized polylines by 2 pixels on each side, which introduces an appropriate tolerance. Finally, to determine whether a ground-truth element and a predicted element match, we compute the Intersection over Union (IoU) of their rasterized HD representations. Similar to the MS-COCO metric [25], we calculate AP under multiple IoU thresholds. For linear elements, we set the thresholds to $\{0.25 : 0.50 : 0.05\}$, i.e., from 0.25 to 0.50 in steps of 0.05.
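The matching procedure for linear elements can be sketched as follows, using the grid described above (480×240 pixels, 0.125 m per pixel, 2-pixel dilation). This is a minimal pure-Python illustration, not the paper's evaluation code: representing masks as sets of pixels, sampling segments at half-pixel spacing, using a square dilation footprint, and all function names are assumptions made here.

```python
import math

H, W = 480, 240              # raster size: rows along y, columns along x
RES = 0.125                  # metres per pixel
X_MIN, Y_MIN = -15.0, -30.0  # lower bounds of the perception range (metres)
LINE_THRESHOLDS = [round(0.25 + 0.05 * k, 2) for k in range(6)]  # 0.25 ... 0.50

def to_pixel(x, y):
    """Map ego-frame coordinates in metres to (row, col) pixel indices."""
    return math.floor((y - Y_MIN) / RES), math.floor((x - X_MIN) / RES)

def rasterize_polyline(points, dilation=2):
    """Rasterize a polyline into a set of pixels, dilated on each side."""
    pixels = set()
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Half-pixel sampling: consecutive samples fall in the same or adjacent pixels.
        n = max(1, math.ceil(math.hypot(x1 - x0, y1 - y0) / (RES / 2)))
        for i in range(n + 1):
            t = i / n
            pixels.add(to_pixel(x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    # Square dilation footprint (an assumption), clipped to the raster bounds.
    dilated = set()
    for r, c in pixels:
        for dr in range(-dilation, dilation + 1):
            for dc in range(-dilation, dilation + 1):
                if 0 <= r + dr < H and 0 <= c + dc < W:
                    dilated.add((r + dr, c + dc))
    return dilated

def iou(a, b):
    """Intersection over Union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

# Hypothetical 20 m lane divider and two predictions with lateral offsets.
gt = rasterize_polyline([(0.0, -10.0), (0.0, 10.0)])
near = rasterize_polyline([(0.25, -10.0), (0.25, 10.0)])  # 0.25 m offset
far = rasterize_polyline([(1.0, -10.0), (1.0, 10.0)])     # 1.0 m offset

print(iou(gt, near))  # about 0.43
print(iou(gt, far))   # 0.0
print([thr for thr in LINE_THRESHOLDS if iou(gt, near) >= thr])
```

A 0.25 m offset already fails the two strictest thresholds, and a 1 m offset yields no overlap at all, which shows how the raster IoU reacts to deviations that a distance threshold on the order of a metre would tolerate.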

It is worth noting that HD maps often contain elements other than lines, such as crosswalks, intersections, and parking lots. These elements can be abstracted as polygons. To evaluate polygon-shaped map elements properly, we use a dedicated polygon rasterization instead of the line rasterization and calculate AP at IoU thresholds $\{0.50 : 0.75 : 0.05\}$.
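For polygon-shaped elements, the same IoU test applies once the interior is filled. The sketch below is one simple way to do this and is an assumption rather than the paper's rasterizer: a pixel is marked when its centre lies inside the polygon under the even-odd rule. The grid constants follow the setting described earlier; the function names and the crosswalk coordinates are hypothetical.

```python
import math

H, W = 480, 240              # raster size: rows along y, columns along x
RES = 0.125                  # metres per pixel
X_MIN, Y_MIN = -15.0, -30.0  # lower bounds of the perception range (metres)
POLYGON_THRESHOLDS = [round(0.50 + 0.05 * k, 2) for k in range(6)]  # 0.50 ... 0.75

def point_in_polygon(x, y, poly):
    """Even-odd rule: count edge crossings of a ray cast towards +x."""
    inside = False
    for i in range(len(poly)):
        (x0, y0), (x1, y1) = poly[i], poly[(i + 1) % len(poly)]
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside

def rasterize_polygon(poly):
    """Fill a polygon: keep pixels whose centre lies inside it."""
    xs, ys = [p[0] for p in poly], [p[1] for p in poly]
    # Only scan the polygon's bounding box, clipped to the raster.
    c_lo = max(0, math.floor((min(xs) - X_MIN) / RES))
    c_hi = min(W - 1, math.floor((max(xs) - X_MIN) / RES))
    r_lo = max(0, math.floor((min(ys) - Y_MIN) / RES))
    r_hi = min(H - 1, math.floor((max(ys) - Y_MIN) / RES))
    pixels = set()
    for r in range(r_lo, r_hi + 1):
        y = Y_MIN + (r + 0.5) * RES
        for c in range(c_lo, c_hi + 1):
            x = X_MIN + (c + 0.5) * RES
            if point_in_polygon(x, y, poly):
                pixels.add((r, c))
    return pixels

def iou(a, b):
    """Intersection over Union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

# Hypothetical 4 m x 3 m crosswalk and a prediction shifted by 0.5 m along x.
gt = rasterize_polygon([(-2.0, 5.0), (2.0, 5.0), (2.0, 8.0), (-2.0, 8.0)])
pred = rasterize_polygon([(-1.5, 5.0), (2.5, 5.0), (2.5, 8.0), (-1.5, 8.0)])
print(iou(gt, pred))  # about 0.78
print([thr for thr in POLYGON_THRESHOLDS if iou(gt, pred) >= thr])
```

Filled polygons overlap far more than thin lines for the same offset, which is consistent with the higher threshold range used for polygon elements.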

Figure 3. Illustration of the rasterization-based approach for determining matches between ground-truth and predicted vectorized map elements.

3. MapVR

…For details, please refer to Guyueju.

Origin blog.csdn.net/lovely_yoshino/article/details/131441712