Computer Game-Playing Algorithms (Adversarial Search)

I. Introduction

Human-computer game playing is an important branch of artificial intelligence. Research in this field has produced many results, of which the minimax algorithm is the most basic. It was formally proposed by Shannon in 1950. Alpha-beta pruning is essentially an improvement built on the minimax algorithm. In 1975, Knuth et al. optimized the algorithm and proposed the concept of negamax (negative maximum). Its principle is essentially the same as that of minimax, but it does not require the program to distinguish between maximizing and minimizing nodes, which makes the algorithm more uniform. Knuth et al. also studied the search efficiency of alpha-beta pruning in depth, and in 1982 Pearl proved the optimality of alpha-beta pruning.

II. Minimax Search

1. Minimax algorithm

In a human-computer game, the two sides move in turn. Before choosing among its feasible moves, each side considers the replies the opponent could make to each of them, and then picks the move that is most favorable to itself. This process forms a game tree. Both sides search the game tree and move to the child node that benefits them most. During the search, the side that takes the maximum value is called max and the side that takes the minimum value is called min. max always moves to the child node with the greatest value, while min does the opposite. This is the core idea of the minimax algorithm. The value of a node is computed recursively:

  1. If the node is a terminal node: apply the evaluation function to obtain its value;

  2. If the node is a max node: compute the value of each child node, and take the largest child value as the value of the node;

  3. If the node is a min node: compute the value of each child node, and take the smallest child value as the value of the node.
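The three rules above can be sketched in a few lines of Python. This is a minimal illustration, not a full game engine: it assumes the game tree is given as nested lists whose integer leaves are already evaluated, and the names `Tree` and `minimax` and the sample tree are made up for the example.

```python
from typing import List, Union

# Assumed representation: an int is a terminal node whose value is
# already its evaluation; a list holds the child nodes.
Tree = Union[int, List["Tree"]]


def minimax(node: Tree, is_max: bool) -> int:
    """Return the minimax value of `node`."""
    # Rule 1: a terminal node is worth its evaluation.
    if isinstance(node, int):
        return node
    # The two sides alternate, so every child belongs to the other side.
    child_values = [minimax(child, not is_max) for child in node]
    # Rule 2 (max node) takes the largest child value,
    # rule 3 (min node) takes the smallest.
    return max(child_values) if is_max else min(child_values)


# Hypothetical two-level tree: a max root above three min nodes.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, is_max=True))  # 3
```

The three min nodes evaluate to 3, 2 and 2, so the max root takes the value 3.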

2. Evaluation function

The evaluation function assigns a score to each position and is used to judge the positions that appear in the game tree. In traditional board-game programs, the evaluation function is generally designed by hand, and it plays a decisive role in the playing strength of the program.

The form of the evaluation function is not fixed. Its input is generally the information describing a position, and its output is a number indicating how good or bad that position is. To illustrate the minimax algorithm, the evaluation function for tic-tac-toe can be defined as: the number of rows, columns, and diagonals that player X could still complete minus the number of rows, columns, and diagonals that player O could still complete.
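The tic-tac-toe evaluation described above might look like the following sketch. It assumes that a line counts as "possible" for a player when it contains none of the opponent's marks, and that the board is a 9-character string read row by row with "." for an empty cell. The names `LINES`, `open_lines` and `evaluate` are illustrative only.

```python
# The 8 winning lines of a 3x3 board, cells indexed 0..8 row by row.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def open_lines(board: str, player: str) -> int:
    """Count the lines with no opponent mark (assumed meaning of 'possible')."""
    opponent = "O" if player == "X" else "X"
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))


def evaluate(board: str) -> int:
    """Lines still open to X minus lines still open to O."""
    return open_lines(board, "X") - open_lines(board, "O")


# X holds the center, O holds the top-left corner.
board = "O...X...."
print(evaluate(board))  # 1
```

Here O's corner blocks three of X's eight lines and X's center blocks four of O's, so the score is 5 - 4 = 1 in favor of X.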

3. α-β pruning search

3.1. Alpha-beta pruning principle

The biggest drawback of the minimax algorithm is redundant computation: it evaluates nodes that cannot affect the result. This redundancy takes two forms: ① maximum-value redundancy; ② minimum-value redundancy. Alpha pruning removes the former and beta pruning removes the latter; together they make up the complete alpha-beta pruning algorithm. The two kinds of redundancy and the corresponding pruning steps are described below.

  • Alpha pruning: maximum-value redundancy. As shown in the figure, this is part of a game tree, and the number under each node is that node's value. Node B has value 20 and node D has value 15. C is a min node, so the value of C will be less than or equal to 15. A is a max node, so A takes the value of B; the value of A can be obtained without searching C's other child nodes E and F. Cutting off the sibling nodes that follow D in this way is called alpha pruning.

  • Beta pruning: minimum-value redundancy. As shown in the figure, this is also part of a game tree. Node B has value 10 and node D has value 19. C is a max node, so the value of C will be greater than or equal to 19. A is a min node, so A takes the value of B, which is 10; the value of A can be obtained without evaluating C's other child nodes E and F. Cutting off the sibling nodes that follow D in this way is called beta pruning.
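The two arguments above can be written as inequalities. The notation v(N) for the value of node N is introduced here for illustration, and the derivation assumes that B and C are the only children of A in the fragments shown.

```latex
\begin{align*}
\text{Alpha pruning:}\quad
  & v(C) = \min\bigl(v(D), v(E), v(F)\bigr) \le v(D) = 15 < 20 = v(B) \\
  & \Rightarrow\; v(A) = \max\bigl(v(B), v(C)\bigr) = v(B) = 20 \\[4pt]
\text{Beta pruning:}\quad
  & v(C) = \max\bigl(v(D), v(E), v(F)\bigr) \ge v(D) = 19 > 10 = v(B) \\
  & \Rightarrow\; v(A) = \min\bigl(v(B), v(C)\bigr) = v(B) = 10
\end{align*}
```

In both cases the bound on v(C) already settles v(A), whatever the values of E and F are.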

3.2. α-β pruning implementation

  • alpha: updated at MAX nodes; it records the maximum value among the child nodes of the current node searched so far. If some child nodes are pruned, it is the maximum over the children that remain after the pruned part is discarded.

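  • beta: updated at MIN nodes; it records the minimum value among the child nodes of the current node searched so far. If some child nodes are pruned, it is the minimum over the children that remain after the pruned part is discarded.

  • Pruning condition: α >= β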

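  • Initialization: the search is a recursive call. At the root, alpha starts at negative infinity, because alpha records a maximum, and beta starts at positive infinity, because beta records a minimum. Each child node starts from the alpha and beta values passed down by its parent.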
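A possible implementation of these rules is sketched below. It is one common way to write alpha-beta, not necessarily the one the original author had in mind. The sketch assumes a nested-list game tree with integer leaves; the root call starts with alpha at negative infinity and beta at positive infinity, and each recursive call passes the current pair down to the child. The `visited` list is a hypothetical helper added only to show which leaves were actually evaluated.

```python
import math
from typing import List, Optional, Union

# Assumed representation: int = evaluated leaf, list = child nodes.
Tree = Union[int, List["Tree"]]


def alphabeta(node: Tree, is_max: bool,
              alpha: float = -math.inf, beta: float = math.inf,
              visited: Optional[List[int]] = None) -> float:
    """Minimax value of `node`, skipping branches that cannot matter."""
    if isinstance(node, int):
        if visited is not None:
            visited.append(node)  # record every leaf that gets evaluated
        return node
    if is_max:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)  # alpha is updated at MAX nodes
            if alpha >= beta:          # pruning condition
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)        # beta is updated at MIN nodes
        if alpha >= beta:              # pruning condition
            break
    return value


# Alpha-pruning fragment from section 3.1: A (max) has children B = 20
# and C (min) = [D = 15, E, F]. The values 30 and 40 for E and F are
# made up; they are never evaluated.
seen: List[int] = []
print(alphabeta([20, [15, 30, 40]], is_max=True, visited=seen))  # 20
print(seen)  # [20, 15]
```

Running the beta-pruning fragment the same way, `alphabeta([10, [19, 1, 2]], is_max=False)`, returns 10 after evaluating only the leaves 10 and 19.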

"The clearest and easiest-to-understand explanation of the MinMax algorithm and Alpha-Beta pruning"

3.3. α-β search

  • The α-value of a MAX-node is set to the current largest final backed-up value of its successors. That is, you cannot back up a node until you have finished looking at its children.

  • The β-value of a MIN-node is set to the current smallest final backed-up value of its successors.

  • α cut-off – search is discontinued below a MIN-node whose β value is less than or equal to the α value of any of its MAX-node ancestors.

  • β cut-off – search is discontinued below a MAX-node whose α value is greater than or equal to the β value of any of its MIN-node ancestors.
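The introduction notes that negamax removes the need to distinguish max nodes from min nodes. As a rough sketch of that idea, the two cut-off rules above collapse into a single test when every node is scored from the viewpoint of the side to move. The code assumes a nested-list tree whose integer leaves are scored from MAX's viewpoint; the `color` parameter (+1 when MAX is to move, -1 when MIN is to move) and the sample tree are illustrative choices, not part of the original text.

```python
import math
from typing import List, Union

# Assumed representation: int = leaf scored from MAX's viewpoint,
# list = child nodes.
Tree = Union[int, List["Tree"]]


def negamax(node: Tree, color: int,
            alpha: float = -math.inf, beta: float = math.inf) -> float:
    """Value of `node` from the viewpoint of the side to move."""
    if isinstance(node, int):
        return color * node
    value = -math.inf
    for child in node:
        # The child's value is seen from the opponent's side, so it is
        # negated, and the search window is negated and swapped.
        value = max(value, -negamax(child, -color, -beta, -alpha))
        alpha = max(alpha, value)
        if alpha >= beta:  # one test covers both α and β cut-offs
            break
    return value


# Hypothetical tree: MAX to move at the root.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(negamax(tree, color=1))  # 3
```

With MIN to move at the root (`color=-1`) the same call returns -6, which is the minimax value 6 seen from MIN's side.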



Origin blog.csdn.net/m0_64768308/article/details/129472644