Detailed explanation of A* algorithm (a great one)

Overview

The A* algorithm seems easy to people who have mastered it, but it can be very complicated for beginners.

The Search Area

Suppose someone wants to move from point A to point B, but the two points are separated by a wall. As shown in Figure 1, the green square is A, the red square is B, and the blue squares in the middle are the wall.
Figure 1

You may have noticed that we divided the search area into a square grid. Simplifying the search area like this is the first step in pathfinding. This particular method reduces the search area to a 2-dimensional array. Each item in the array represents one square of the grid, and its state is either walkable or unwalkable. The path is found by working out which squares must be crossed to get from A to B. Once the path is found, the character moves from the center of one square to the center of the next until it reaches the destination.
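As a rough sketch, the search area can be stored as a list of rows. The 7 x 5 layout, the 0/1 encoding, the (column, row) coordinates, and the names GRID, START, GOAL and is_walkable are all assumptions made for illustration; the layout is reconstructed from the description of the figures, not taken from them.

```python
# Hypothetical layout: 7 columns x 5 rows.
# 0 = walkable, 1 = unwalkable (wall).
GRID = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
START = (1, 2)  # point A as (column, row), an assumed position
GOAL = (5, 2)   # point B as (column, row), an assumed position


def is_walkable(grid, col, row):
    """Return True if (col, row) is inside the grid and not a wall."""
    in_bounds = 0 <= row < len(grid) and 0 <= col < len(grid[0])
    return in_bounds and grid[row][col] == 0
```

Treating squares outside the array as unwalkable keeps the later neighbor checks simple, because the edge of the map then behaves like a wall.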

The center points of the squares are called "nodes". If you read other articles about A* pathfinding, you will find that people often talk about nodes. Why not just call them squares? Because the search area can be divided into shapes other than squares: rectangles, hexagons, or indeed any polygon. The nodes can be placed anywhere within those shapes, in the center or along the edges. We use this system because it is the simplest.

Once we have reduced the search area to a manageable number of nodes, as we did above, the next step is to find the shortest path. In A*, we begin at the starting point, check the adjacent squares, and keep expanding outward until we find the target.

We begin our pathfinding journey like this:

  1. Begin at the starting point A and add it to an "open list" of squares to be considered. The open list is a bit like a shopping list. Right now it holds only one item, the starting point A, but more will be added later. The squares on the open list may or may not end up on the final path. Basically, the open list is a list of squares that still need to be checked.

  2. Check the squares adjacent to the starting point A, ignoring squares occupied by walls, rivers, or other illegal terrain, and add the walkable (reachable) ones to the open list as well. Set the starting point A as the parent (parent node or parent square) of each of these squares. The parent matters when we trace the path back later; this is explained further on.

  3. Remove A from the open list and add it to the "closed list". Squares on the closed list do not need to be checked again.
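The three steps above might look roughly like this in code. The grid layout, the (column, row) coordinates, and the use of a set for each list and a dict for the parent pointers are assumptions chosen for brevity.

```python
# Hypothetical grid: 0 = walkable, 1 = wall; coordinates are (column, row).
GRID = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
START = (1, 2)

# Step 1: the open list starts with only the starting point A.
open_list = {START}
closed_list = set()
parent = {}

# Step 2: add every walkable neighbor of A to the open list, with A as parent.
for dc in (-1, 0, 1):
    for dr in (-1, 0, 1):
        if (dc, dr) == (0, 0):
            continue  # skip A itself
        col, row = START[0] + dc, START[1] + dr
        in_bounds = 0 <= row < len(GRID) and 0 <= col < len(GRID[0])
        if in_bounds and GRID[row][col] == 0:
            open_list.add((col, row))
            parent[(col, row)] = START

# Step 3: A has been examined, so it moves to the closed list.
open_list.remove(START)
closed_list.add(START)

print(len(open_list))  # prints 8
```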

As shown in the figure below, the dark green square is the starting point. Its frame is bright blue, indicating that it has been added to the closed list. The black squares adjacent to it are now on the open list and need to be checked; their frames are bright green. Each black square has a gray pointer pointing to its parent, which here is the starting point A.
Figure 2

Next, we choose one of the squares on the open list, all of which are adjacent to the starting point A at this stage, and more or less repeat the previous steps, as described below. But which square do we choose? The one with the smallest F value.

Path Scoring

The key to determining which squares make up the path is the following equation:

F = G + H

Here,

G = The cost of moving from the starting point A to the specified square, along the path generated to reach the square.

H = The estimated cost of moving from the specified square to the end point B. This is often called the heuristic, which can be a bit confusing. It is called that because it is a guess: we do not know the true distance until we find the path, because all sorts of things (walls, water, etc.) can be in the way. This tutorial teaches one method of calculating H; you can find others on the Internet.

Our path is generated by repeatedly traversing the open list and selecting the square with the smallest F value. This process is described in detail later. First, let us look at how to calculate the terms of the equation.

As mentioned above, G is the cost of moving from the starting point A to the specified square. In this example, a horizontal or vertical move costs 10 and a diagonal move costs 14. These numbers are used because the actual distance of a diagonal move is the square root of 2, or roughly 1.414, times the cost of a horizontal or vertical move. We use 10 and 14 for simplicity: the ratio is about right, and we avoid square roots and decimals. It's not that we can't do the math or don't like it; whole numbers like these are also much faster for the computer. You will find out later that pathfinding can be very slow without shortcuts like these.

Since we calculate G along the specific path to a given square, the way to work out a square's G value is to take the G value of its parent and add 10 or 14, depending on whether the square lies in a straight (horizontal or vertical) or diagonal direction from that parent. This will become clearer as we move away from the starting point and cover more squares.
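The G rule can be sketched in a few lines. The function names step_cost and g_via and the (column, row) coordinates are illustrative assumptions; only the costs 10 and 14 come from the text.

```python
STRAIGHT_COST = 10  # horizontal or vertical step
DIAGONAL_COST = 14  # diagonal step, roughly 10 * sqrt(2)


def step_cost(from_sq, to_sq):
    """Cost of moving between two adjacent squares given as (column, row)."""
    dc = abs(to_sq[0] - from_sq[0])
    dr = abs(to_sq[1] - from_sq[1])
    return DIAGONAL_COST if dc == 1 and dr == 1 else STRAIGHT_COST


def g_via(parent_g, parent_sq, square):
    """G of `square` when it is reached through `parent_sq`."""
    return parent_g + step_cost(parent_sq, square)


# The starting point has G = 0, so its neighbors get 10 or 14.
print(g_via(0, (1, 2), (2, 2)))  # prints 10
print(g_via(0, (1, 2), (2, 1)))  # prints 14
```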

There are many ways to estimate H. Here we use the Manhattan method: count the total number of squares moved horizontally and vertically to reach the target from the current square, ignoring diagonal movement, and multiply the total by 10. It is called the Manhattan method because it resembles counting the number of city blocks from one place to another when you cannot cut diagonally across a block. Importantly, H is calculated ignoring any obstacles in the way. It is an estimate of the remaining distance, not the actual value, which is why it is called a heuristic.
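A minimal sketch of the Manhattan method, assuming squares are (column, row) pairs; the function name manhattan_h and the example positions are made up for illustration.

```python
def manhattan_h(square, goal, straight_cost=10):
    """Estimate the remaining cost: horizontal plus vertical squares, times 10.

    Walls and other obstacles are deliberately ignored.
    """
    dc = abs(goal[0] - square[0])
    dr = abs(goal[1] - square[1])
    return straight_cost * (dc + dr)


# A square 3 columns left of the goal in the same row:
print(manhattan_h((2, 2), (5, 2)))  # prints 30
# One row up from there, so 3 + 1 = 4 squares away:
print(manhattan_h((2, 1), (5, 2)))  # prints 40
```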

Add G and H to get F. The result of our first step is shown in the figure below. Each square is marked with its F, G, and H values. As shown in the square to the right of the starting point, F is in the upper left corner, G in the lower left corner, and H in the lower right corner.
Figure 3

Now let's look at some of the squares. In the square with the letters in it, G = 10, because it is just one square from the starting point in the horizontal direction. The squares immediately above, below, and to the left of the starting point also have G values of 10, and the diagonal squares have G values of 14.

The H value is obtained by estimating the Manhattan distance from each square to the end point (the red square), moving only horizontally and vertically and ignoring the wall in the way. The square to the right of the starting point is 3 squares from the end point, so H = 30. The square above it is 4 squares away (remember, only horizontal and vertical moves are counted), so H = 40. You can work out the H values of the other squares in the same way.

The F value of each square is, again, simply the sum of its G and H values.
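To check the numbers, here is a small sketch that scores the 8 squares around the starting point. The positions of A and B are assumptions, chosen so that B lies 4 columns to the right of A in the same row, which matches the H values quoted above.

```python
START, GOAL = (1, 2), (5, 2)  # hypothetical (column, row) positions of A and B

scores = {}
for dc in (-1, 0, 1):
    for dr in (-1, 0, 1):
        if (dc, dr) == (0, 0):
            continue  # skip the starting point itself
        square = (START[0] + dc, START[1] + dr)
        g = 14 if dc != 0 and dr != 0 else 10          # diagonal or straight
        h = 10 * (abs(GOAL[0] - square[0]) + abs(GOAL[1] - square[1]))
        scores[square] = (g + h, g, h)                 # (F, G, H)

best = min(scores, key=lambda sq: scores[sq][0])
print(best, scores[best])  # prints (2, 2) (40, 10, 30)
```

The square to the right of the starting point comes out with F = 40, the two diagonal squares on the right with F = 54, and the remaining squares with F = 60 or 74.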

To continue the search, we select the square (node) with the smallest F value on the open list, and then do the following with the selected square:

  1. Take it off the open list and put it on the closed list.

  2. Check all the squares adjacent to it. Ignore those that are on the closed list or unwalkable (walls, water, or other illegal terrain). If a square is not yet on the open list, add it, and set the selected square as its parent.

  3. If an adjacent square is already on the open list, check whether this path to it is better, that is, whether reaching it through the current square (the one we selected) gives a smaller G value. If not, do nothing. If the G value is smaller, set the parent of that square to the current square, and then recalculate its F and G values. If this is still confusing, refer to the picture below.
Figure 4

Let's see how this works. Of our first 9 squares, 8 are left on the open list after the starting point has been moved to the closed list. Of these, the square to the right of the starting point has the smallest F value, 40, so we choose it as the next square to process. Its frame is highlighted in blue.

First, we move it from the open list to the closed list (this is why it is highlighted in blue). Then we check the squares adjacent to it. The squares immediately to its right are wall squares, so we ignore them. The square to its left is the starting point, which is on the closed list, so we ignore that too. The other 4 adjacent squares are already on the open list, so we need to check whether the path to each of them through this square is better, using the G value as the measure. Take the square just above the current one. Its G value is currently 14. If we went there via the current square instead, its G value would be 20 (10 to reach the current square, plus 10 to move from the current square up to that square). Since 20 is greater than 14, this is not a better path. The picture makes this clear: it is more direct to move diagonally from the starting point to that square than to move one square horizontally and then one vertically.
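The comparison just described could be written as a small helper. The name try_better_path, the dict-based bookkeeping, and the (column, row) positions are assumptions for illustration.

```python
def try_better_path(square, current, g, parent, step):
    """Re-parent `square` to `current` if that lowers its G value.

    Returns True if the square was updated. H does not change, so F = G + H
    only needs to be recomputed from the new G.
    """
    candidate_g = g[current] + step
    if candidate_g < g[square]:
        g[square] = candidate_g
        parent[square] = current
        return True
    return False


# Hypothetical positions: start, the square to its right, and the square above that.
start, current, above = (1, 2), (2, 2), (2, 1)
g = {current: 10, above: 14}
parent = {current: start, above: start}

changed = try_better_path(above, current, g, parent, step=10)
print(changed, g[above])  # prints False 14
```

Going through the current square would give 10 + 10 = 20, which is not smaller than 14, so the square keeps the starting point as its parent.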

After checking all four adjacent squares that are on the open list, we find that none of the paths is improved by going through the current square, so we change nothing. Now that we have looked at all of the adjacent squares, we are done with this square and ready to select the next one.

So we go through the open list again, which now has only 7 squares, and pick the one with the smallest F value. Interestingly, this time two squares have an F value of 54. Which one do we choose? It doesn't really matter. For speed, it can be faster to choose the one added to the open list most recently. This biases the search toward newly found squares, which helps once you get close to the target. But it is not important. (Different treatment of ties is why two versions of A* may find different paths of equal length.)
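One simple way to get the "most recently added wins" behavior is to keep the open list in insertion order and let later entries replace earlier ones on a tie. This is only a sketch; pick_lowest_f and the sample data are hypothetical.

```python
def pick_lowest_f(open_list, f):
    """Return the square with the smallest F; on a tie, the latest addition wins.

    `open_list` is a list in insertion order, `f` maps each square to its F value.
    """
    best = None
    for square in open_list:
        # "<=" rather than "<" lets a later square replace an earlier one on ties.
        if best is None or f[square] <= f[best]:
            best = square
    return best


open_list = [(2, 1), (1, 1), (2, 3)]           # (2, 3) was added last
f = {(2, 1): 54, (1, 1): 60, (2, 3): 54}
print(pick_lowest_f(open_list, f))  # prints (2, 3)
```

Changing "<=" to "<" would make the earliest addition win instead, which is the kind of difference that leads two implementations to different paths of equal length.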

We select the square just below and to the right of the starting point, as shown in the figure below.
Figure 5

This time, when we check the adjacent squares, we find that the one to the immediate right is a wall square, so we ignore it. The same goes for the one just above that.

We also ignore the square just below the wall. Why? Because you cannot move directly from the current square to that square without cutting across the corner of the wall. You need to go down first and then move over to that square, going around the corner. (Note: this rule about cutting corners is optional; whether you need it depends on how your nodes are placed.)
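The corner rule can be expressed as an extra check on diagonal moves: both squares that share an edge with the move must be free. The sketch below assumes a hypothetical grid layout with (column, row) coordinates; can_step is an illustrative name.

```python
# Hypothetical grid: 0 = walkable, 1 = wall; coordinates are (column, row).
GRID = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]


def can_step(grid, from_sq, to_sq):
    """True if moving to the adjacent square is allowed without cutting a corner."""
    (fc, fr), (tc, tr) = from_sq, to_sq

    def walkable(col, row):
        return 0 <= row < len(grid) and 0 <= col < len(grid[0]) and grid[row][col] == 0

    if not walkable(tc, tr):
        return False
    if fc != tc and fr != tr:
        # Diagonal move: the two squares beside the move must both be free.
        return walkable(tc, fr) and walkable(fc, tr)
    return True


current = (2, 3)  # the square below and to the right of the start
print(can_step(GRID, current, (3, 4)))  # prints False: cuts the wall's corner
print(can_step(GRID, current, (2, 4)))  # prints True: straight down
```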

That leaves 5 adjacent squares. The two squares below the current square are not yet on the open list, so we add them and set the current square as their parent. Of the other 3 squares, 2 are already on the closed list (the starting point and the square just above the current square, both with highlighted frames), so we ignore them. For the last square, immediately to the left of the current square, we check whether reaching it through the current square gives a smaller G value. It does not. So we are ready to select the next square from the open list.

We repeat this process until the end point is added to the open list, as shown in the figure below.
Figure 6

Note that the parent of the square two squares below the starting point has changed from the previous figure. Previously, its G value was 28 and it pointed to the square above it and to the right. Now its G value is 20 and it points to the square directly above it. This happened somewhere along the way in the search: its G value was checked and found to be lower via a new path, so its parent was switched and its G and F values were recalculated. The change does not matter much in this example, but in many cases this constant checking makes a big difference to the final path.

So how do we determine the actual path? It's simple. Start at the end point and work backward, moving from each square to its parent by following the arrows. This eventually takes you back to the starting point, and that is your path, as shown below. Moving from the starting point A to the end point B is then simply a matter of moving from the center of each square on the path to the center of the next, until you reach the goal. It's that simple!
Figure 7
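Following the parent pointers back takes only a few lines. In this sketch the parent dict is filled in by hand with hypothetical values; in a real search it would be built up while squares are added to the open list.

```python
def reconstruct_path(parent, start, goal):
    """Walk from the goal back to the start via parent pointers, then reverse."""
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path


# Hypothetical parent pointers for a route that passes under a wall.
parent = {
    (5, 2): (5, 3),
    (5, 3): (4, 4),
    (4, 4): (3, 4),
    (3, 4): (2, 4),
    (2, 4): (2, 3),
    (2, 3): (1, 2),
}
print(reconstruct_path(parent, start=(1, 2), goal=(5, 2)))
# prints [(1, 2), (2, 3), (2, 4), (3, 4), (4, 4), (5, 3), (5, 2)]
```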

Summary of the A* Method

Now that you have read through the whole explanation, let's put all the steps together:

  1. Add the starting point to the open list.

  2. Repeat the following:

    a. Traverse the open list, find the square with the smallest F value, and make it the current square.

    b. Move the current square to the closed list.

    c. For each of the 8 squares adjacent to the current square:

      ◆ If it is not walkable or it is on the closed list, ignore it. Otherwise, do the following.

      ◆ If it is not on the open list, add it to the open list, set the current square as its parent, and record the F, G, and H values of the square.

      ◆ If it is already on the open list, check whether this path to it (that is, reaching it through the current square) is better, using the G value as the measure. A smaller G value means a better path. If so, set its parent to the current square and recalculate its G and F values. If you keep your open list sorted by F value, you may need to re-sort it after the change.

    d. Stop when:

      ◆ the end point is added to the open list, in which case the path has been found, or

      ◆ the open list is empty and the end point has not been found, in which case there is no path.

  3. Save the path. Starting from the end point, move from each square to its parent until you reach the starting point. That is your path.
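Putting the summary into code gives something like the sketch below. It is one possible reading of the steps, not a reference implementation: the grid encoding, the (column, row) coordinates, the name a_star, and the use of a plain list scanned with min() are all assumptions. It also applies the optional no-corner-cutting rule from the walkthrough.

```python
def a_star(grid, start, goal):
    """Return the squares from start to goal as (column, row) pairs, or None."""
    def walkable(col, row):
        return 0 <= row < len(grid) and 0 <= col < len(grid[0]) and grid[row][col] == 0

    def h(square):  # Manhattan estimate, 10 per square
        return 10 * (abs(goal[0] - square[0]) + abs(goal[1] - square[1]))

    if start == goal:
        return [start]

    g = {start: 0}        # G value of every square seen so far
    parent = {}
    open_list = [start]   # step 1
    closed_list = set()

    while open_list:      # step 2
        current = min(open_list, key=lambda sq: g[sq] + h(sq))  # 2a: smallest F
        open_list.remove(current)                               # 2b
        closed_list.add(current)
        for dc in (-1, 0, 1):                                   # 2c
            for dr in (-1, 0, 1):
                square = (current[0] + dc, current[1] + dr)
                if square == current or square in closed_list or not walkable(*square):
                    continue
                diagonal = dc != 0 and dr != 0
                # Optional rule: no cutting across wall corners.
                if diagonal and not (walkable(square[0], current[1])
                                     and walkable(current[0], square[1])):
                    continue
                new_g = g[current] + (14 if diagonal else 10)
                if square not in g:
                    open_list.append(square)   # newly discovered square
                elif new_g >= g[square]:
                    continue                   # already open, path is no better
                g[square] = new_g
                parent[square] = current
                if square == goal:             # 2d: end point added to open list
                    path = [goal]              # step 3: follow parents back
                    while path[-1] != start:
                        path.append(parent[path[-1]])
                    return path[::-1]
    return None           # open list is empty: no path


# Hypothetical layout: 0 = walkable, 1 = wall.
GRID = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
print(a_star(GRID, (1, 2), (5, 2)))
```

Scanning the whole open list with min() mirrors step 2a literally and is fine for small maps. Note that the Manhattan estimate can overestimate when diagonal moves are allowed (two squares count as 20, but one diagonal step costs 14), so this sketch returns a valid path without guaranteeing the shortest one in every case.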

Small Rant

Forgive me for the digression. When you read discussions of A* on the Internet or in forums, you will occasionally come across code that is called A* but isn't. To be A*, a method must include all the elements discussed above, in particular the open list, the closed list, and the path scores F, G, and H. There are many other pathfinding algorithms, but they are not A*, which is generally considered the best of them. Bryan Stout discusses several of them, including their advantages and disadvantages, in one of the articles cited at the end of this article. Sometimes one of the alternatives is the better choice, but you should understand what you are doing.

Origin blog.csdn.net/qq_41371349/article/details/107501179