Classic Literature Reading: Traversability Analysis for Autonomous Driving (LiDAR-based traversability analysis in complex environments)

0. Introduction

Navigating complex environments is one of the most important tasks in autonomous driving. The article "Traversability Analysis for Autonomous Driving in Complex Environment: A LiDAR-based Terrain Modeling Approach" proposes using LiDAR for terrain mapping, producing stable, complete, and accurate terrain models and traversability analysis results. Since terrain is an inherent attribute of the environment and does not change with viewing angle, the method uses a multi-frame information fusion strategy for terrain modeling. Specifically, the paper adopts a normal distribution transform (NDT) mapping method to accurately model terrain by fusing information from consecutive LiDAR frames. Spatial-temporal Bayesian generalized kernel (BGK) inference and bilateral filtering are then used to improve the stability and completeness of the results while preserving sharp terrain edges. Based on the terrain modeling results, the traversability of each area is obtained by performing geometric connectivity analysis between adjacent terrain areas.

1. Main contributions

The contributions of this article are summarized as follows:

1) This article makes full use of the information provided by consecutive LiDAR frames for traversability analysis, rather than treating it as a single-frame task. An NDT mapping method is used to model the terrain. In addition, the quantization error between the global grid map and the local grid map is also considered. By adopting this multi-frame fusion method, some estimation errors can be easily avoided, and the estimation results are more stable and complete;

2) This article proposes a spatial-temporal BGK height inference method. Compared with the original BGK method, two improvements are made. The first is introducing bilateral filtering into BGK height inference, thereby alleviating the edge blur problem. The second is introducing the height variance estimated by the NDT mapping method as a weight in BGK inference; with this weight, grid cells with larger variance contribute less to the height inference. With these two improvements, the estimated terrain model and traversability analysis results are more accurate;

3) By analyzing the geometric connectivity properties between adjacent terrain units, we can obtain a cost map. This cost map helps differentiate between different terrain types such as curbs, ditches, ramps, and road boundaries. Therefore, the method proposed in this article can help the UGV path planning module select reasonable and safe paths in complex environments.

2. Summary of methods

The framework of the proposed traversability analysis method is shown in Figure 2. The inputs are LiDAR point clouds and high-frequency 6-degrees-of-freedom (DoF) pose sequences from a pose estimation module [Xue et al., 2019], which fuses information from an inertial navigation system (INS), wheel encoders, and LiDAR odometry. The output of the method is a dense terrain model and a cost map of the local environment. The terrain model is represented by a normal vector map and a dense elevation map. The proposed method mainly includes four modules: a data preprocessing and coarse segmentation module, an elevation estimation module, an elevation prediction and refinement module, and a traversability analysis module. The details of these modules are described below.

Figure 2: Framework of the proposed traversability analysis method

2.1 Data preprocessing and rough segmentation

It is known that as the UGV moves during a LiDAR scan, the LiDAR point cloud becomes distorted. To correct the distorted point cloud, the poses generated by the high-frequency 6-DoF pose estimation module are used for intra-frame motion compensation. In addition, the point cloud is rotated to the vertical direction based on the azimuth, roll, and pitch angles obtained from the 6-DoF pose. After this rotation, terrain properties (such as normal vectors and slope) can be well estimated in axis-aligned coordinates. The rectified point cloud is then projected onto a 2D grid map, and the grid cells are roughly classified into terrain cells and non-terrain cells using the min-max height difference method [Thrun et al., 2006]. In this method, a height difference threshold $T_h$ needs to be set (the setting of this threshold is listed in Table 2). If the min-max height difference of a grid cell is greater than $T_h$, it is considered an obstacle cell. Additionally, the method introduced in [Jaspers et al., 2017] is used to remove overhanging structures from the point cloud.
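The min-max rough segmentation step can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the cell size and the threshold value below are placeholder assumptions (the paper's actual settings are in its Table 2):

```python
import numpy as np

# Placeholder parameters (assumptions; the paper's values are in its Table 2).
CELL_SIZE = 0.2  # grid resolution in meters
T_H = 0.15       # min-max height difference threshold T_h in meters

def rough_segmentation(points, cell_size=CELL_SIZE, t_h=T_H):
    """Project points onto a 2D grid and classify cells by min-max height difference.

    points: (N, 3) array of motion-compensated, gravity-aligned points.
    Returns a dict mapping (row, col) -> "obstacle" or "potential_terrain".
    """
    cells = {}
    for x, y, z in points:
        key = (int(np.floor(x / cell_size)), int(np.floor(y / cell_size)))
        zmin, zmax = cells.get(key, (z, z))
        cells[key] = (min(zmin, z), max(zmax, z))
    # Cells never hit by a point stay "unobserved" (absent from the dict).
    return {k: ("obstacle" if zmax - zmin > t_h else "potential_terrain")
            for k, (zmin, zmax) in cells.items()}
```

A cell containing both ground and wall returns exceeds the threshold and is labeled an obstacle, while a flat cell stays a potential terrain cell.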

After this rough classification, the 2D grid map is divided into three parts: obstacle grid cells, potential terrain grid cells, and unobserved grid cells. At this stage, it is worth mentioning that the potential terrain grid cells may contain false positives. For example, the roofs of nearby vehicles may be mistakenly judged as potential terrain cells. These errors are not easily corrected by processing a single LiDAR frame, but can be avoided by employing the multi-frame information fusion strategy described in the following subsections. For each potential terrain grid cell $x_i$, the observed height is modeled as a normal distribution $N(\mu_i, \Sigma_i)$ fitted from all $n_i$ point height observations $\{z_{i,j}\}_{j=1:n_i}$ falling into the cell:

$$\mu_i = \frac{1}{n_i}\sum_{j=1}^{n_i} z_{i,j}, \qquad \Sigma_i = \frac{1}{n_i}\sum_{j=1}^{n_i}\left(z_{i,j}-\mu_i\right)^2 \tag{1}$$

where $\mu_i$ and $\Sigma_i$ are the mean and variance of $N(\mu_i, \Sigma_i)$, respectively. Figure 3 shows an illustrative example of the elevation distribution generated from a single LiDAR frame.
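A minimal sketch of the per-cell elevation fit, assuming the population (divide-by-$n_i$) variance as a simple choice:

```python
import numpy as np

def cell_height_distribution(heights):
    """Fit the per-cell elevation Gaussian N(mu_i, Sigma_i) from the n_i
    point height observations z_{i,j} that fall into grid cell x_i."""
    z = np.asarray(heights, dtype=float)
    mu = z.mean()
    sigma = z.var()  # population variance (assumed convention)
    return mu, sigma
```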

Figure 3: Example illustration of elevation distribution generated from a single LiDAR frame. The elevation of each potential terrain grid cell is represented by a normal distribution (shown as an ellipsoid, with larger radii indicating higher variance)

2.2 Height estimation based on multi-frame information fusion

In order to fuse the information of consecutive LiDAR frames, the rolling grid technique [Behley and Stachniss, 2018] is used to construct the global grid map. As shown in Figure 4, the size of the global grid map is $W \times W$; each grid cell covers an area of $\omega \times \omega$ and stores the height distribution estimated at the previous time step $t-1$ (the settings of $W$ and $\omega$ are listed in Table 2). As the UGV moves, historical grid cells beyond the map boundaries are removed (gray cells in Figure 4), and the same number of new cells are generated in the direction of movement.

Figure 4: Update process of the global grid map. The size of the global grid map is $W \times W$, and the resolution of each grid cell is $\omega \times \omega$. As the UGV moves, historical (gray) grid cells beyond the map boundaries are removed. The shaded green area indicates the LiDAR's observation range. The orange circle indicates the LiDAR location, and the blue circle is the center of the grid map. $(r^t_x, r^t_y)$ is the residual computed at time step $t$.

The center of the global grid map is defined as the lower-left corner of the grid cell in which the LiDAR is located (blue circle in Figure 4). Let $(L^t_x, L^t_y)$ denote the position of the LiDAR in the global coordinate system at time step $t$ (orange circle in Figure 4); then the residual $(r^t_x, r^t_y)$ of the LiDAR relative to the lower-left corner of the grid cell in which it is located can be computed as:

$$r^t_x = L^t_x - \omega \cdot \mathrm{Floor}\!\left(\frac{L^t_x}{\omega}\right), \qquad r^t_y = L^t_y - \omega \cdot \mathrm{Floor}\!\left(\frac{L^t_y}{\omega}\right)$$
The function $\mathrm{Floor}(a)$ is a rounding operator that returns the largest integer not greater than $a$.
Since the center of the global map is always defined as the lower-left corner of the grid cell in which the LiDAR is located, the computed residual exactly represents the translation offset between the LiDAR coordinates and the global map coordinates. Let $(p^t_x, p^t_y, p^t_z)$ denote an observation point in the LiDAR coordinate system; its global grid coordinates $(r, c)$ can then be calculated.
*(Equation image: the residual-compensated point coordinates $(p^t_x + r^t_x,\; p^t_y + r^t_y)$ are discretized by $\omega$ to obtain the global grid indices $(r, c)$.)*
In previous research, point clouds are usually processed in the LiDAR's local coordinate system, where the center of the local map is aligned with the LiDAR origin. The output of this process is usually a local grid map, which is then fused into a global grid map. In this paper, the global grid map is discretized in global coordinates, so the local coordinates of the LiDAR points should first be compensated by the computed residuals. With this compensation, the local grid map of the LiDAR is aligned with the global grid map, which eliminates the quantization error caused by differing LiDAR poses.
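The residual computation and residual-compensated grid indexing can be sketched as follows. The centering convention (`w // 2` offset) is an assumption for illustration, not the paper's stated indexing:

```python
import numpy as np

OMEGA = 0.2  # grid cell size in meters (assumption; see the paper's Table 2)
W = 512      # cells per side of the global map (assumption)

def residual(Lx, Ly, omega=OMEGA):
    """Residual of the LiDAR position w.r.t. the lower-left corner of its cell:
    r_x = L_x - omega * Floor(L_x / omega), and likewise for y."""
    rx = Lx - omega * np.floor(Lx / omega)
    ry = Ly - omega * np.floor(Ly / omega)
    return rx, ry

def global_grid_index(px, py, rx, ry, omega=OMEGA, w=W):
    """Map a point (px, py) in the LiDAR frame to global grid indices (r, c).

    Adding the residual aligns the local discretization with the global one,
    so the same world point always lands in the same global cell regardless
    of the LiDAR pose. Centering on w // 2 is an assumed convention.
    """
    r = int(np.floor((px + rx) / omega)) + w // 2
    c = int(np.floor((py + ry) / omega)) + w // 2
    return r, c
```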

The information of consecutive frames is then fused in the global grid map. For each grid cell $x_i$ containing a projected point, the observed height distribution $N(\mu^t_i, \Sigma^t_i)$ at time step $t$ is first calculated by formula (1). This $N(\mu^t_i, \Sigma^t_i)$ is then fused with the previous joint height distribution $\hat{N}(\hat{\mu}^{t-1}_i, \hat{\Sigma}^{t-1}_i)$ to estimate the current joint height distribution $\hat{N}(\hat{\mu}^t_i, \hat{\Sigma}^t_i)$.
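The paper's exact fusion update is not reproduced in this excerpt; one standard way to realize such a recursive Gaussian fusion is the precision-weighted product of the two distributions, sketched below as an illustrative assumption:

```python
def fuse_gaussians(mu_obs, var_obs, mu_prev, var_prev):
    """Fuse the observed height distribution N(mu_obs, var_obs) at time t with
    the previous joint distribution N(mu_prev, var_prev).

    Precision-weighted product of two Gaussians (a standard Bayesian update;
    the paper's exact rule may differ). Cells observed repeatedly with
    consistent heights end up with a small fused variance.
    """
    var_fused = var_obs * var_prev / (var_obs + var_prev)
    mu_fused = (var_prev * mu_obs + var_obs * mu_prev) / (var_obs + var_prev)
    return mu_fused, var_fused
```

Note that the fused variance is always smaller than either input variance, which is what makes the multi-frame estimate progressively more stable.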


Figure 5: An illustrative example showing the estimated elevation distribution from the fusion of information from multiple frames. The elevation distribution generated by each lidar frame is fused in a global grid map

An illustrative example of the elevation distribution estimated by multi-frame information fusion is shown in Figure 5. The elevation distribution generated by each individual LiDAR frame is fused in the global grid map, resulting in a more stable and complete result. In addition, the stable variance information allows fine segmentation of potential terrain regions: if the estimated variance of a grid cell that has been observed multiple times is above the variance threshold $T_\Sigma$ (the setting of this threshold is listed in Table 2), the grid cell is also regarded as an obstacle cell.

2.3 Elevation prediction and refinement based on spatial-temporal BGK inference

This subsection describes an improved spatial-temporal BGK inference method to predict and refine the height distribution of grid cells. The input is a set of potential terrain cells $O^t$, in which $N^t_i$ represents the predicted height distribution of grid cell $x_i$ at time step $t$, and $N^t_o$ is the number of input samples in $O^t$. The task is to estimate the height distribution $N^t_*$ of a target grid cell $x_*$ from the input $O^t$, which is formulated as a regression problem.
*(Equation image: the regression formulation of the spatial-temporal BGK height inference.)*

The spatial-temporal BGK inference method applies Bayes' theorem, a conditional independence assumption, and a smooth extended likelihood model. The method also uses the prediction variance $\hat{\Sigma}^t_i$ associated with each grid cell $x_i$ as a weight: grid cells with larger variance contribute less to the height inference process.

The goal of the spatial-temporal BGK inference method is to estimate the height $h_*$ of the target grid cell $x_*$. In addition, an edge-preserving filtering technique, bilateral filtering, is used to address the edge blur that Gaussian filtering may cause. This is especially important in areas with sharp terrain changes.
*(Equation image: the bilateral-filter-weighted BGK height estimate.)*

In this inference process, the height of each potential terrain cell $x_i$ is first estimated by formula (9), and the difference $\delta^t_i$ between the estimated height and the observed height is then calculated. For grid cells with sharp terrain changes, $\delta^t_i$ is usually large. Therefore, $\delta^t_i$ is converted into a weight $w^t_i$:
*(Equation image: the bilateral weight $w^t_i$ computed from $\delta^t_i$.)*

Finally, this process is illustrated in Figure 7, where the green cells represent the observed potential terrain grid cells and the red cell represents the target grid cell to be estimated. Within the effective support range $l$ of the kernel (marked by the shaded gray circle), each green cell contributes to the estimate of the red cell, and the degree of contribution depends on three factors: the distribution variance $\Sigma^t_i$ estimated in Section 3.2, the distance $d_i$ from the red cell, and the estimated-observed difference $\delta^t_i$ used for bilateral filtering.

Figure 6: Normal vector map before bilateral filtering (a) and after bilateral filtering (b). The color encodes the angle between the estimated normal vector and the vertical axis; greener cells indicate flatter terrain.
Figure 7: Schematic diagram of the BGK inference process. Green cells represent observed potential terrain grid cells, and the red cell represents the target grid cell to be estimated. Within the kernel function's effective support range $l$ (marked by the shaded gray circle), each green grid cell contributes to the red cell's estimate; $d_i$ represents its distance from the red cell.
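The inference step described above can be sketched as follows. The kernel form (a sparse, compactly supported kernel often paired with BGK inference) and the way the three factors are combined into one weight are illustrative assumptions, not the paper's exact equations:

```python
import numpy as np

def sparse_kernel(d, l):
    """Compactly supported sparse kernel (assumed form): 1 at d = 0,
    smoothly decaying to 0 at d = l."""
    d = np.minimum(d / l, 1.0)
    return ((2.0 + np.cos(2 * np.pi * d)) / 3.0) * (1.0 - d) \
        + np.sin(2 * np.pi * d) / (2 * np.pi)

def bgk_height(target_xy, cells, l=1.0, sigma_delta=0.1):
    """Estimate the target cell's height from nearby observed terrain cells.

    cells: list of (x, y, mu, var, delta), where (mu, var) is a cell's fused
    height distribution and delta its estimated-observed height difference.
    A cell's contribution shrinks with its distance d_i (kernel), its
    variance var_i, and its delta_i (bilateral weight) -- the three factors
    named in Figure 7. The weight combination is an illustrative assumption.
    """
    num, den = 0.0, 0.0
    for x, y, mu, var, delta in cells:
        d = np.hypot(x - target_xy[0], y - target_xy[1])
        if d >= l:
            continue  # outside the kernel's effective support
        k = sparse_kernel(d, l)
        w_var = 1.0 / (1.0 + var)                          # larger variance -> less weight
        w_bil = np.exp(-0.5 * (delta / sigma_delta) ** 2)  # bilateral edge weight
        w = k * w_var * w_bil
        num += w * mu
        den += w
    return num / den if den > 0 else None  # None: no cell within support
```

Cells beyond the support radius $l$ contribute nothing, and cells sitting on a sharp terrain edge (large $\delta^t_i$) are strongly down-weighted, which is what preserves curbs and ditch boundaries in the refined elevation map.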

…For details, please refer to Guyueju.


Origin blog.csdn.net/lovely_yoshino/article/details/131785561