Classic paper reading: A Review of Motion Planning (trajectory planning survey)

0. Introduction

For autonomous driving and robotics, trajectory planning is another important component alongside SLAM. I recently came across several good survey papers: "A Review of Motion Planning Techniques for Automated Vehicles", "A review of motion planning algorithms for intelligent robots", and "A review of motion planning for highway autonomous driving". Combining the main points of each paper, this post gives an accessible overview of trajectory planning, so as to understand the gaps and challenges that need to be resolved in the next few years.

1. Traditional algorithms in the field of robotics

Robot planning algorithms can be divided into two categories according to their principles and when they were invented: traditional algorithms and ML-based algorithms. Traditional algorithms fall into four groups: graph search algorithms (such as A*), sampling-based algorithms (such as Rapidly-exploring Random Trees, RRT), interpolating curve algorithms (such as lines and circles), and reaction-based algorithms (such as DWA). ML-based planning algorithms include classic ML algorithms such as support vector machines (SVM), optimal-value RL such as deep Q-networks (DQN), and policy-gradient RL such as the actor-critic algorithm. The following figure summarizes the categories of planning algorithms.
[Figure: categories of planning algorithms]

2. Machine & Reinforcement Learning Algorithms in Robotics

The other part is the development of ML-based algorithms. Classic ML such as SVM was used for simple motion planning in the early stage, but it performs poorly because SVM's one-step prediction is short-sighted. It also requires carefully prepared vectors as input, which cannot fully represent the features of image-based datasets. After Convolutional Neural Networks (CNN) were invented, extracting high-level features from images improved greatly (Lecun et al., 1998). CNNs are widely used in many image-related tasks, including motion planning, but they cannot handle complex time-sequential motion planning problems. These are better suited to Markov chains (Chan et al., 2012) and long short-term memory (LSTM) (Inoue et al., 2019). Neural networks were then combined with LSTM or Markov chain-based algorithms (e.g., Q-learning (Smart & Kaelbling, 2002)) for time-sequential motion planning. However, their efficiency is limited (e.g., poor network convergence). A breakthrough came when Google DeepMind introduced nature DQN (Mnih et al., 2013, 2015), in which a replay buffer reuses old data to improve efficiency. However, its robustness is limited because noise affects the estimation of the state-action value (Q-value). Double DQN (Hasselt et al., 2016; Sui et al., 2018) and dueling DQN (Wang et al., 2015) were therefore invented to deal with the problems caused by noise. Double DQN uses another network to evaluate the Q-value estimate in DQN to reduce noise, while dueling DQN uses the advantage value (A-value) to obtain a better Q-value, which removes most of the noise. Q-learning, DQN, double DQN, and dueling DQN all select the best time-sequential action based on optimal values (Q-value and A-value).
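To make the difference between these value-based variants concrete, here is a minimal sketch in plain Python. The function names and the list-based Q-values are assumptions made for illustration; real implementations operate on network outputs in a deep learning framework.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Standard DQN target: the target network both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online network selects the action,
    a second (target) network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a of A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + adv - mean_adv for adv in advantages]
```

Separating action selection from action evaluation is what keeps a single noisy, overestimated Q-value from being both picked and trusted in the same step.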

Optimal-value algorithms were later superseded by the policy gradient method (Sutton et al., 1999), in which the gradient method (Zhang, 2019) is used directly to update the policy that generates optimal actions. The policy gradient method is relatively stable in terms of network convergence, but slow to converge. The actor-critic algorithm (Cormen et al., 2009; Konda & Tsitsiklis, 2001) improves convergence speed through its actor-critic architecture. However, the higher convergence speed comes at the expense of convergence stability, so the network of the actor-critic algorithm has difficulty converging in the early stages of training. Asynchronous Advantage Actor-Critic (A3C) (Gilhyun, 2018; Mnih et al., 2016), Advantage Actor-Critic (A2C) (Babaeizadeh et al., 2016), Trust Region Policy Optimization (TRPO) (Schulman et al., 2017a) and Proximal Policy Optimization (PPO) (Schulman et al., 2017b) were invented to address this shortcoming. A3C and A2C use multi-threading (Mnih et al., 2016) to speed up convergence, while TRPO and PPO improve the actor-critic algorithm by introducing a trust region constraint in TRPO, and a "surrogate" objective with an adaptive penalty in PPO, to improve convergence speed and stability. However, data is discarded after training, so new data must be collected to train the network until it converges.
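As an illustration of a "surrogate" objective, the sketch below computes PPO's clipped surrogate. Note that this is the clipped variant, not the adaptive-penalty variant mentioned above; it is shown because it is compact. The function name, the list inputs, and the default `clip_eps=0.2` are assumptions for this example.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch of samples.

    ratio = pi_new(a|s) / pi_old(a|s), computed from log-probabilities.
    The objective takes the more pessimistic of the unclipped and clipped terms,
    which limits how far one update can move the policy.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

In training, this quantity would be maximized with respect to the new policy's parameters; here it is only evaluated for given numbers.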

Off-policy gradient algorithms, including deterministic policy gradient (DPG) (Silver et al., 2014) and deep DPG (DDPG) (Lillicrap et al., 2019; Munos et al., 2016), were invented to reuse the data. DDPG combines the actor-critic architecture with a deterministic policy to improve convergence speed. In summary, classical ML, optimal-value RL, and policy-gradient RL are typical ML algorithms in robot motion planning, and the development of these ML-based motion planning algorithms is shown in Figure 5.
[Figure 5: development of ML-based motion planning algorithms]

3. Traditional algorithms in the field of autonomous driving

The application of intelligent transportation systems has significantly relieved drivers of some of the tedious tasks associated with driving. In particular, highway driving has become much safer thanks to the development of cruise control (CC), adaptive cruise control (ACC) and, more recently, cooperative ACC (CACC), in which the longitudinal actuators (accelerator and brake pedals) are controlled to keep a predefined spacing. These systems aim to improve the overall safety, comfort, travel time and energy consumption of the vehicle, and are called advanced driver assistance systems (ADAS). The following figure shows the general framework of an autonomous vehicle. For autonomous driving, perception, decision-making, and control are the most important parts. The planning discussed here belongs to the decision-making level.

[Figure: general framework of an autonomous vehicle]
As in robotics, motion planning here provides global and local trajectory planning to describe behavior, and this part of the work applies equally to key aspects of robot navigation. It considers the dynamic and kinematic models of the robot from the starting position to the final position. The main difference between motion planning for a vehicle and for a robot is that the former deals with a road network where traffic rules must be obeyed, while the latter deals with an open environment with few rules to follow, where it only needs to reach the final destination.

Autonomous driving needs the following functions. The first two do not belong to motion planning, so they are beyond the scope of this article; the discussion focuses on the last three.

  • Route planning: Long-distance planning from origin to destination.
  • Prediction: Predict the movement of surrounding objects through stored current and historical dynamic information. For example: road information, changes in lane lines, road traffic rules, and the behavior of surrounding vehicles.
  • Decision making:
  • Generation:
  • Deformation:

Path planning in mobile robots has become a research topic over the past few decades. Most authors divide the problem into global planning and local planning.
Much of this navigation technology comes from mobile robotics; autonomous driving adapts it to the rules of the road. These planning techniques are classified into four groups according to their implementation in autonomous driving: graph search, sampling, interpolation, and numerical optimization (see Table I). The most relevant path planning algorithms implemented in motion planning for autonomous driving are briefly described below.

3.1 Planner based on graph search

Dijkstra's Algorithm : It is a graph search algorithm that finds the single-source shortest path in a graph. The configuration space is approximated as a discrete grid cell space, lattice, etc.
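A minimal sketch of Dijkstra's algorithm on a discretized grid is shown here. The grid encoding (0 for free, 1 for occupied), the 4-connected neighborhood, and the unit step cost are assumptions chosen for illustration.

```python
import heapq


def dijkstra_grid(grid, start, goal):
    """Return the cheapest 4-connected path cost from start to goal, or None.

    grid: list of rows, 0 = free cell, 1 = occupied cell.
    start, goal: (row, col) tuples.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    frontier = [(0, start)]
    while frontier:
        d, (r, c) = heapq.heappop(frontier)
        if (r, c) == goal:
            return d
        if d > dist[(r, c)]:
            continue  # stale queue entry, a cheaper route was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nd = d + 1  # unit cost per move
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(frontier, (nd, (nr, nc)))
    return None
```

For example, `dijkstra_grid([[0, 0, 0], [1, 1, 0], [0, 0, 0]], (0, 0), (2, 0))` has to route around the wall in the middle row.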

A-star Algorithm (A*): It is a graph search algorithm that enables fast node search thanks to a heuristic function (it is an extension of Dijkstra's graph search algorithm). Its most important design aspect is the choice of the cost function. Several variants applied in mobile robotics build on it, such as Dynamic A* (D*), Field D*, Theta*, Anytime Repairing A* (ARA*) and Anytime D* (AD*).
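The role of the heuristic can be seen in a small grid-based sketch of A*. The Manhattan-distance heuristic, the 0/1 grid encoding, and the unit step cost are assumptions for this example; with unit costs and 4-connected moves, that heuristic never overestimates the remaining cost.

```python
import heapq


def astar_grid(grid, start, goal):
    """Return a shortest 4-connected path as a list of (row, col), or None."""

    def h(cell):
        # Manhattan distance to the goal (admissible for unit-cost 4-connected moves).
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    g = {start: 0}          # cost from start to each discovered cell
    parent = {start: None}  # back-pointers for path reconstruction
    open_heap = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if cost > g[cell]:
            continue  # stale queue entry
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and grid[nxt[0]][nxt[1]] == 0:
                new_cost = cost + 1
                if new_cost < g.get(nxt, float("inf")):
                    g[nxt] = new_cost
                    parent[nxt] = cell
                    heapq.heappush(open_heap, (new_cost + h(nxt), new_cost, nxt))
    return None
```

Setting `h` to return 0 everywhere turns this into Dijkstra's algorithm, which is one way to see why A* is described as its extension.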

State Lattice Algorithm: This algorithm uses a discrete representation of the planning area as a grid of states (usually a high-dimensional one). This grid is called the state lattice, and the motion planning search is applied over it. Path search in this algorithm is based on local queries from a set of lattices or primitives containing all feasible features, allowing the vehicle to travel from an initial state to some other state. A cost function decides the best path between the precomputed lattices. A node search algorithm is then applied, with different implementations (e.g., A* or D*).

3.2 Sampling-based planner

Rapidly-exploring Random Trees (RRT): It is a sampling-based algorithm suited to online path planning. It allows fast planning in semi-structured spaces by performing a random search through the navigation area, and can also account for nonholonomic constraints such as the vehicle's maximum turning radius and momentum. However, the generated path is not optimal, has abrupt changes, and is not curvature continuous. A newer variant, RRT*, converges to an optimal solution.
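The sample, find-nearest, steer loop of a basic RRT can be sketched as follows. This is a simplified 2D point-robot version under stated assumptions: circular obstacles given as `(cx, cy, radius)`, a 10% goal bias, collision checks only at new nodes (not along edges), and no nonholonomic constraints. All names and default values are hypothetical.

```python
import math
import random


def rrt(start, goal, obstacles, bounds, step=0.5, goal_tol=0.5, max_iter=5000, seed=0):
    """Basic 2D RRT. bounds = (xmin, xmax, ymin, ymax). Returns a path or None."""
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}

    def collision_free(p):
        return all(math.hypot(p[0] - cx, p[1] - cy) > r for cx, cy, r in obstacles)

    for _ in range(max_iter):
        # Sample a random point, occasionally the goal itself to bias growth.
        if rng.random() < 0.1:
            sample = goal
        else:
            sample = (rng.uniform(bounds[0], bounds[1]), rng.uniform(bounds[2], bounds[3]))
        # Find the tree node nearest to the sample.
        i_near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))
        near = nodes[i_near]
        d = math.dist(near, sample)
        if d == 0.0:
            continue
        # Steer from the nearest node towards the sample by at most one step.
        scale = min(step, d) / d
        new = (near[0] + (sample[0] - near[0]) * scale,
               near[1] + (sample[1] - near[1]) * scale)
        if not collision_free(new):
            continue
        nodes.append(new)
        parent[len(nodes) - 1] = i_near
        if math.dist(new, goal) <= goal_tol:
            # Walk the parent pointers back to the start.
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None
```

The returned path is a chain of short straight segments with sharp direction changes, which illustrates why RRT output is usually smoothed afterwards.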

3.3 Interpolation curve planner

Techniques such as computer-aided geometric design (CAGD) are often used as path smoothing solutions for a given set of waypoints. These allow a motion planner to fit a given road description by taking into account feasibility, comfort, vehicle dynamics, and other parameters in order to generate trajectories.

Lines and Circles : Different segments of a road network can be represented by interpolating known waypoints using lines and circles.

Clothoid (spiral) curves: This type of curve is defined in terms of Fresnel integrals. Clothoids make it possible to define trajectories with continuously varying curvature, since their curvature changes linearly with arc length, which allows smooth transitions between straight and curved segments and vice versa. Clothoids have been used in the design of roads and railways, as well as for car-like robots.
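The linear relation between curvature and arc length can be checked numerically. The sketch below integrates the heading along the curve with a midpoint rule instead of calling a Fresnel integral routine; the function name, the `sharpness` parameter (rate of change of curvature), and the step count are assumptions for illustration.

```python
import math


def clothoid_points(length, kappa0, sharpness, n=1000):
    """Sample a clothoid starting at the origin with heading 0.

    Curvature:  kappa(s) = kappa0 + sharpness * s
    Heading:    theta(s) = kappa0 * s + 0.5 * sharpness * s**2
    Position is obtained by integrating (cos(theta), sin(theta)) with the midpoint rule.
    Returns a list of (x, y, theta, kappa) tuples.
    """
    ds = length / n
    x = y = 0.0
    points = [(0.0, 0.0, 0.0, kappa0)]
    for i in range(n):
        s_mid = (i + 0.5) * ds
        theta_mid = kappa0 * s_mid + 0.5 * sharpness * s_mid ** 2
        x += math.cos(theta_mid) * ds
        y += math.sin(theta_mid) * ds
        s = (i + 1) * ds
        theta = kappa0 * s + 0.5 * sharpness * s ** 2
        points.append((x, y, theta, kappa0 + sharpness * s))
    return points
```

With `sharpness=0` the curve degenerates to a straight line (`kappa0=0`) or a circular arc, so lines and circles are special cases of the same formula.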

Polynomial curves : These curves are usually implemented to satisfy the constraints needed at the interpolated points, i.e., they are useful for fitting position, angle and curvature constraints, among others.
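One common instance of constraint fitting is a quintic polynomial whose value, first derivative, and second derivative are fixed at both ends. The sketch below uses the closed-form coefficients for a single coordinate as a function of time; choosing a quintic, and the function and parameter names, are assumptions made for this example.

```python
def quintic_coefficients(x0, v0, a0, x1, v1, a1, T):
    """Coefficients c0..c5 of x(t) = sum(c_i * t**i) on [0, T].

    Boundary conditions: x(0)=x0, x'(0)=v0, x''(0)=a0, x(T)=x1, x'(T)=v1, x''(T)=a1.
    """
    c0, c1, c2 = x0, v0, a0 / 2.0
    dx = x1 - x0
    c3 = (20 * dx - (8 * v1 + 12 * v0) * T - (3 * a0 - a1) * T ** 2) / (2 * T ** 3)
    c4 = (-30 * dx + (14 * v1 + 16 * v0) * T + (3 * a0 - 2 * a1) * T ** 2) / (2 * T ** 4)
    c5 = (12 * dx - 6 * (v1 + v0) * T - (a0 - a1) * T ** 2) / (2 * T ** 5)
    return [c0, c1, c2, c3, c4, c5]


def evaluate(coeffs, t, derivative=0):
    """Evaluate the polynomial, or its given derivative, at time t."""
    total = 0.0
    for i, c in enumerate(coeffs):
        if i < derivative:
            continue  # this term vanishes after differentiation
        factor = 1
        for k in range(derivative):
            factor *= i - k
        total += c * factor * t ** (i - derivative)
    return total
```

For a rest-to-rest move from 0 to 1 in unit time, the coefficients reduce to 10, -15, and 6 on the cubic, quartic, and quintic terms.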

Bezier curves : These are parametric curves that rely on control points to define their shape. At the heart of Bezier curves are Bernstein polynomials. These curves have been widely used in CAGD applications, technical drawing, aerospace and automotive design.
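The link between control points and Bernstein polynomials can be written out directly. This is a minimal sketch for 2D control points; the function name and the tuple representation are assumptions for illustration.

```python
from math import comb


def bezier_point(control_points, t):
    """Evaluate a Bezier curve of degree n = len(control_points) - 1 at t in [0, 1].

    Each control point P_i is weighted by the Bernstein polynomial
    B_{i,n}(t) = C(n, i) * t**i * (1 - t)**(n - i).
    """
    n = len(control_points) - 1
    x = y = 0.0
    for i, (px, py) in enumerate(control_points):
        weight = comb(n, i) * (t ** i) * ((1 - t) ** (n - i))
        x += weight * px
        y += weight * py
    return (x, y)
```

The curve passes through the first and last control points and is only pulled towards the intermediate ones, which is what makes the control points convenient shape handles.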

Spline curves: A spline is a piecewise polynomial parametric curve defined over subintervals; each piece can be a polynomial curve, a B-spline (which can also be represented by Bezier curves), or a clothoid. The junctions between sub-segments are called knots, and they typically impose a high degree of smoothness where the pieces join.

3.4 Numerical optimization

Function Optimization : This technique finds the real roots of a function (minimizing an output variable). It has been applied to improve the Potential Field Method (PFM) for obstacles and narrow passages in mobile robots.

4. Intelligent algorithms in the field of autonomous driving

Motion planning is usually divided into high-level planning and low-level planning:

  • High-level prediction: Makes decisions and generates a series of candidate behaviors by analyzing the environment and assessing motion risks. Like the human brain, it issues instructions for behavior.
  • Low-level reaction: Deforms the motion generated by high-level planning. Like the cerebellum, it generates movement with almost no deliberation and handles emergency responses, which is why the actual trajectory can differ from the planned path.

In this section, we will carry out a more detailed division of the content of the previous section.

4.1 Spatial configuration analysis (that is, how do we represent the map)

Spatial configuration analysis is the choice of how to decompose the evolution space. It is a set of algorithms mainly used for motion generation or deformation. These methods are based on geometry; they range from coarser decompositions, used in predictive methods to limit computation time, to finer decompositions for more accurate responses. The main difficulty is finding the right spatial configuration parameters to represent the motion and environment well [41]. If the discretization is too coarse, the collision risk will be poorly accounted for and it is impossible to respect the kinematic constraints between two consecutive decompositions; however, if the discretization is too fine, the real-time performance of the algorithm will be poor. We divide the spatial decomposition into three main sub-families, as shown in the figure below: sampling points, connected cells, and lattice.

[Figure: spatial decomposition sub-families: sampling points, connected cells, and lattice]
Sampling-Based Decomposition : The most popular stochastic approach is the Probabilistic Road Map (PRM) [41]. It draws random samples from the evolution space during the construction phase. These sampled points are connected with their neighbors to form an obstacle-free roadmap, which is then searched in a second, query phase by a pathfinding algorithm such as Dijkstra (see III-B2) [42]. In [33], the authors first sample the configuration space according to a reference path, such as the centerline of a small lane, then select the best set of sampling points according to an objective function, and finally assign a velocity profile to the path to respect safety and comfort criteria.

Connected-cell decomposition: These methods first use geometry to decompose the space into cells, and then construct an occupancy grid and/or a cell connectivity graph, as shown in the figure below for an application example. In the occupancy grid approach, a grid is generated around the vehicle and obstacle detection information is overlaid on it. In the connectivity graph approach, nodes represent cells and edges represent the adjacency between cells. The graph can be interpreted as paths along the edges of cells or as paths sought inside connected cells. The main methods are visibility decomposition, Voronoi decomposition, driving corridors, Vector Field Histograms (VFH), exact decomposition, and the Dynamic Window (DW).
[Figure: application example of connected-cell decomposition]

Lattice representation: In motion planning, a lattice is a regular spatial structure that generalizes a grid [22]. Motion primitives can be defined to connect one state of the lattice exactly to another. All feasible state evolutions on the lattice are represented as a reachability graph of maneuvers. Lattice representations encode both road boundaries and kinematic constraints and allow fast replanning, which is useful for highway planning.

4.2 Pathfinding algorithm

The pathfinding algorithm family is a branch of graph theory in operations research, used to solve combinatorial problems under a graph representation. The graph can be weighted or directed, with sampling points, cells, or maneuvers as nodes. The basic principle is to find a path in the graph that optimizes a cost function. Examples include Dijkstra, A*, Anytime Weighted A* (AWA*), hybrid-state A*, D*, RRT, and RRT*. The details have already been covered above.

Similar to sampling-based decomposition, probabilistic graph search is not well suited to the structured environment of highways. Also, the highway is usually a known environment. In this sense, deterministic pathfinding is favored in highway motion planning for autonomous vehicles.

4.3 Attractive and repulsive forces

The attractive and repulsive forces approach is a bio-inspired method. The evolution space is represented by attractive forces toward the desired motion (e.g., the legal speed) and repulsive forces away from obstacles (e.g., road boundaries, lane markings, and other obstacles). Its main advantage is therefore that it reacts to the dynamic evolution of the scene representation. The motion of the ego vehicle is then guided by the resulting force vector, so no explicit spatial decomposition is required. Common methods include the Artificial Potential Field (APF), the Velocity Vector Field (VVF), and the elastic band algorithm.
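A minimal sketch of how an artificial potential field combines the two kinds of forces is shown below. It assumes a quadratic attractive potential towards a goal point and the classic inverse-distance repulsive potential with a limited influence radius around point obstacles; the gains `k_att` and `k_rep`, the `influence` radius, and the function name are hypothetical.

```python
import math


def apf_force(position, goal, obstacles, k_att=1.0, k_rep=1.0, influence=2.0):
    """Resulting 2D force vector at `position`.

    Attractive force: k_att * (goal - position).
    Repulsive force per obstacle within the influence radius d0:
        k_rep * (1/d - 1/d0) / d**2, directed away from the obstacle.
    """
    fx = k_att * (goal[0] - position[0])
    fy = k_att * (goal[1] - position[1])
    for ox, oy in obstacles:
        dx, dy = position[0] - ox, position[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < influence:
            magnitude = k_rep * (1.0 / d - 1.0 / influence) / d ** 2
            fx += magnitude * dx / d
            fy += magnitude * dy / d
    return (fx, fy)
```

The force can be evaluated anywhere without building a grid or graph first, which is the sense in which no explicit spatial decomposition is needed.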

4.4 Parametric and semiparametric curves

Parametric and semi-parametric curves are the main geometric methods in path planning algorithms for highways, for at least two reasons:

(1) Highway roads are constructed as a succession of simple, predefined curves (lines, circles, and clothoids [105]).
(2) A predefined set of curves is easy to implement and to test as a set of candidate solutions.

…For details, please refer to Gu Yueju

Origin blog.csdn.net/lovely_yoshino/article/details/128687937