【Gomoku in Practice】Chapter 2: Game Tree Negamax Alpha-Beta Pruning Algorithm

  The most commonly used Gomoku algorithm is game tree minimax search with alpha-beta pruning. It can be explained in four interlocking parts: game tree, minimax search, negamax, and alpha-beta pruning.


Game Tree

  A game tree is a concept from game theory that represents the possible moves in a game and their outcomes. It is a tree structure: each node represents a game state, and the children of a node represent the states reachable by one move from that state.

Because it is a tree structure, the implementation uses recursion; more specifically, it is a backtracking method.

  In the game tree, the root node represents the initial state of the game, and the leaf nodes represent end states. By traversing the game tree, we can find the best move or evaluate the outcome of the game.

  In a two-player game, the levels of the game tree alternate between the two players. In Gomoku, when it is one side's turn to move, a layer of child nodes is generated, each representing one possible position for the next stone. For each of those positions the next layer of child nodes is generated, and so on, until the game ends.

In practice it is usually impossible to expand the tree all the way to the end of the game; even looking 3 moves ahead takes a very long time.

  For example, suppose there is a 2x2 Gomoku board, black moves first, and the black stone is placed in the upper left corner, as shown in the figure below:

[Figure: game tree for the 2x2 board example]

  The game tree is constructed as shown in the figure above. The root node is the current position. On the first move it is white's turn, with 3 empty cells to choose from; on the second move it is black's turn, with 2 empty cells left; and so on.

  For the 2x2 board, the game tree has 3x2x1 = 6 paths in total. For a standard 15x15 board, once the first stone has been placed, the game tree has 224 levels and 224! paths. So how do we decide which path is best? This is where the minimax idea comes in.
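  The path count for the 2x2 example can be checked with a small recursive enumeration. This is only an illustrative sketch: the function name count_paths and the (row, column) cell encoding are assumptions made here, and win detection is ignored because five in a row is impossible on a 2x2 board.

```python
def count_paths(empty_cells):
    """Count every complete move sequence that fills the given empty cells."""
    if not empty_cells:
        return 1  # the board is full: this is one finished path
    total = 0
    for cell in empty_cells:
        # Play on `cell`, then count the paths through the remaining cells
        remaining = [c for c in empty_cells if c != cell]
        total += count_paths(remaining)
    return total


# 2x2 board where black already occupies the upper left corner (0, 0)
empty = [(0, 1), (1, 0), (1, 1)]
print(count_paths(empty))  # 6
```

  Each recursive call corresponds to one node of the game tree, which is why the cost grows so quickly with board size.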


Minimax Search

  Minimax is a widely used method in game playing and artificial intelligence. In a game, a minimax search algorithm can be used to determine the best next move. It works by recursively simulating the possible game states and the opponent's responses, and then choosing the move that is most favorable to itself.

  Before running the minimax search, we need a way to score each board state; the search then compares these scores to decide which state is best.

  Scoring the Board - Scoring Criteria

  First we need scoring criteria. For Gomoku, this means deciding how many points a pattern such as "empty, white, empty" is worth, and how many points a pattern such as "white, black, black, black, empty" is worth. The criteria used in this project are shown below. They do not distinguish between black and white stones: 1 means a stone is present and 0 means the cell is empty:

# Evaluation scores for stone patterns
shape_score = [
    (50, (0, 1, 1, 0, 0)),
    (50, (0, 0, 1, 1, 0)),
    (200, (1, 1, 0, 1, 0)),
    (500, (0, 0, 1, 1, 1)),
    (500, (1, 1, 1, 0, 0)),
    (5000, (0, 1, 1, 1, 0)),
    (5000, (0, 1, 0, 1, 1, 0)),
    (5000, (0, 1, 1, 0, 1, 0)),
    (5000, (1, 1, 1, 0, 1)),
    (5000, (1, 1, 0, 1, 1)),
    (5000, (1, 0, 1, 1, 1)),
    (5000, (1, 1, 1, 1, 0)),
    (5000, (0, 1, 1, 1, 1)),
    (500000, (0, 1, 1, 1, 1, 0)),
    (99999999, (1, 1, 1, 1, 1))
]

The design of these scoring criteria directly affects how well the algorithm plays, so it matters a lot. The criteria here are not optimal, but they work reasonably well. Readers are encouraged to extend and modify them.
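  As a rough illustration of how such a table can be consulted, the sketch below looks up the best score for a window of six cells by comparing its first five cells and all six cells against each shape. The names shape_score_excerpt and best_shape_score are hypothetical, and the table is only an excerpt of the one above.

```python
# Excerpt of the scoring table above (not the full table)
shape_score_excerpt = [
    (50, (0, 1, 1, 0, 0)),
    (5000, (0, 1, 1, 1, 0)),
    (500000, (0, 1, 1, 1, 1, 0)),
    (99999999, (1, 1, 1, 1, 1)),
]


def best_shape_score(window, table):
    """Return the highest score whose shape equals the first 5 or all 6 cells of `window`."""
    five, six = tuple(window[:5]), tuple(window[:6])
    best = 0
    for score, shape in table:
        if shape == five or shape == six:
            best = max(best, score)
    return best


print(best_shape_score((0, 1, 1, 1, 0, 0), shape_score_excerpt))  # 5000
print(best_shape_score((0, 1, 1, 1, 1, 0), shape_score_excerpt))  # 500000
print(best_shape_score((0, 1, 0, 0, 0, 0), shape_score_excerpt))  # 0
```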

  Scoring the Board - How to Accumulate Scores

  How to accumulate the scores is also an important question.

  In this project, for each point A occupied by a white stone, we check the four directions through A (horizontal, vertical, and the two diagonals) for white patterns that match the scoring criteria. If any match, we record the highest score and the corresponding shape. If point A has already been recorded as part of a shape in a given direction, that direction is not counted again for A. Finally, we look through the recorded shapes for ones that intersect; if there are any, the score is doubled.

  Black stones are scored in the same way.

  Final board score = total score of our stones - total score of the opponent's stones * weight.

So both white and black stones take part in the calculation. The weight reflects how aggressive the computer is, and it is configurable.

  The Minimax Idea

  The purpose of minimax is to find the best path through the game tree.

[Figure: example game tree for minimax]

  Minimax considers the minimum gain (or maximum loss) that we can guarantee when the opponent also plays optimally.

  In the figure above, suppose the scores on the last layer are 1, 2, 3, 4, 5, 6. The layer above it is our side, which takes the value most favorable to us (max); since each of those nodes has a single child, the third layer is still 1, 2, 3, 4, 5, 6. The second layer is the opponent, who takes the score least favorable to us (min), giving 1, 3, 5. The first layer is our side again, so we take max = 5.

  In summary: black has just moved, and white's search selects the coordinates of the second-layer node whose value is 5.

Because this is backtracking, the tree must be read from the bottom up.
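  The worked example above can be reproduced with a few lines of code. This is a minimal sketch, assuming the tree is encoded as nested lists with integer leaves; the encoding and the name minimax are choices made here for illustration, not part of the project code.

```python
def minimax(node, maximizing):
    """Return the minimax value of `node`; leaves are plain integers."""
    if isinstance(node, int):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Layer 1 (max) -> layer 2 (min) -> layer 3 (max) -> leaves 1..6
tree = [[[1], [2]], [[3], [4]], [[5], [6]]]
print(minimax(tree, True))  # 5
```

  The second layer evaluates to 1, 3, 5 and the root picks the maximum, matching the hand calculation.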


Negamax

  The opponent takes the minimum. But what if we let the opponent take the maximum as well and put a minus sign in front of the result? That is equivalent to taking the minimum. It also simplifies the code, making it shorter and more elegant. This is how negamax came about.

In practice, it also makes the code abstract and about as hard to read as an undeciphered script...

  Negamax is a variant of minimax search. It rests on a basic observation: in a zero-sum game, a position's value to one player is the negation of its value to the opponent, so min(a, b) = -max(-a, -b). By negating the opponent's payoff, every layer can be treated as a maximization.

  The two algorithms are equivalent in essence and differ only in how they are expressed. Negamax negates the opponent's payoff so that every layer maximizes, while minimax alternates between maximizing and minimizing layers as it searches the game tree.

[Figure: negamax search tree]
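  To make the equivalence concrete, here is a minimal negamax sketch run on the six-leaf example from the minimax section (leaves 1 to 6). The nested-list encoding and the color parameter are assumptions made for this illustration: color is +1 when it is our turn at a node and -1 when it is the opponent's, and leaf scores are stored from our point of view.

```python
def negamax(node, color):
    """Return the value of `node` from the point of view of the side to move."""
    if isinstance(node, int):
        return color * node  # convert "our" score to the mover's point of view
    # Every layer maximizes; the child's value is negated on the way up
    return max(-negamax(child, -color) for child in node)


tree = [[[1], [2]], [[3], [4]], [[5], [6]]]
print(negamax(tree, 1))  # 5
```

  There is only one branch of logic (max), yet the result is the same as alternating max and min.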

  How strong the AI is depends on how many moves deep it recurses. A 15x15 board allows up to 224 levels of recursion with 224! paths, which means backtracking from layer 224 all the way up to layer 1. Probably only a supercomputer could even attempt that.

  Even with only 3 levels of recursion there are 224x223x222 = 11,089,344 leaf nodes. So clever people found an optimization: alpha-beta pruning search.
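  The leaf count for a fixed depth is easy to verify. A small sketch (the helper name leaf_count is hypothetical) that assumes every empty cell is a legal move and no game ends early:

```python
def leaf_count(empty_cells, depth):
    """Number of leaf nodes when searching `depth` moves with `empty_cells` free cells."""
    count = 1
    for i in range(depth):
        count *= empty_cells - i  # one fewer free cell after each move
    return count


print(leaf_count(224, 3))  # 11089344
print(leaf_count(3, 3))    # 6, the 2x2 example
```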


Alpha-Beta Pruning

  Alpha-beta pruning is a technique for optimizing minimax search. It reduces the number of nodes that have to be searched and so improves search efficiency. On top of minimax search it introduces two parameters: alpha and beta.

  During minimax search, each player tries to maximize their own payoff or minimize the opponent's. When the value of a node falls outside the range from alpha to beta, the node cannot affect the final decision, so the search of that node and its remaining children can be cut off.

  Specifically, when a node is searched, its child nodes are searched first, recursively. After the recursion returns, alpha or beta is updated depending on whether the current player is the maximizing or the minimizing player. If alpha becomes greater than or equal to beta, the search of the current node can stop early and the remaining children are skipped, because they cannot affect the current player's best choice.

  By continually updating alpha and beta, the number of nodes to search shrinks dynamically during the search, which greatly improves efficiency. The key to alpha-beta pruning is to maintain alpha and beta correctly and to visit child nodes in a good order, since pruning is most effective when strong moves are searched first.

[Figure: alpha-beta pruning example tree]

  Especially important - the pruning process

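  In the figure above, α is the largest value found so far (a lower bound) and β is the smallest value found so far (an upper bound). Before anything has been searched their values are unknown, so to be safe they are initialized to -∞ and +∞.

  1 => Depth-first traversal first descends to the fourth layer. The first node there has value 1, and its parent belongs to a Max layer, so α is set to 1, meaning this parent only accepts values >= 1.

  2 => The search then backtracks to the first node of the second layer. Because α and β swap roles at each layer on the way back, this node now only accepts values <= 1.

  3 => Next, the search reaches the second node of the fourth layer, whose value is 2. Its parent belongs to a Max layer, so α is set to 2, meaning this parent only accepts values >= 2. Now there is a conflict: the second node of the third layer is >= 2, but the first node of the second layer is <= 1. So the subtree rooted at the second node of the third layer is pruned: that subtree is guaranteed to be >= 2, but its parent only wants values <= 1, so there is no point in searching it further.

  4 => The search backtracks to the root, which now becomes >= 1.

  5 => Next, the search reaches the third node of the fourth layer, whose value is 3. Its parent belongs to a Max layer, so α is set to 3, meaning this parent only accepts values >= 3.

  6 => The search now backtracks to the second node of the second layer. Because α and β swap roles at each layer on the way back, this node now only accepts values <= 3.

  7 => Next, the search reaches the fourth node of the fourth layer, whose value is 2. Its parent belongs to a Max layer, so α is set to 2, meaning this parent only accepts values >= 2. Its own parent, the second node of the second layer, accepts values <= 3, so there is no conflict and nothing is pruned. On backtracking, the parent's range is overwritten and becomes <= 2.

  8 => The search backtracks to the root again. The root was >= 1 before; it is now overwritten and becomes >= 2.

  9 => ... and so on for the rest ...

Because my figure is so simple, the benefit of pruning is not obvious. But imagine the pruned parts of the figure above: if those nodes had many children, then apart from the left subtree, none of their right subtrees would need to be traversed.

  Note that alpha-beta pruning applies to zero-sum games, where one player's gain equals the other player's loss. It greatly reduces the number of nodes searched while still guaranteeing the same result as a full minimax search. However, if the branching factor is high or the search depth is large, the search can still require considerable computing resources. In practice it therefore still needs to be combined with other optimizations, such as heuristic evaluation functions, to improve efficiency further.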
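  The saving can be observed on a toy tree. The sketch below is a negamax search with alpha-beta pruning that also counts visited nodes. The tree is a small two-layer example made up for this illustration (it is not the tree from the figure), and the names negamax_ab and counter are hypothetical.

```python
INF = float("inf")


def negamax_ab(node, color, alpha, beta, counter):
    """Negamax with alpha-beta pruning; counter[0] counts the visited nodes."""
    counter[0] += 1
    if isinstance(node, int):
        return color * node  # leaf scores are stored from the root player's view
    for child in node:
        # Negate and swap the window for the opponent, then negate the result
        value = -negamax_ab(child, -color, -beta, -alpha, counter)
        if value >= beta:
            return beta  # cutoff: the remaining children are never visited
        if value > alpha:
            alpha = value
    return alpha


# Root (max) with three min nodes; the full tree has 1 + 3 + 6 = 10 nodes
tree = [[3, 5], [2, 9], [1, 7]]
counter = [0]
print(negamax_ab(tree, 1, -INF, INF, counter))  # 3
print(counter[0])  # 8: the leaves 9 and 7 were pruned
```

  Once the first min node has established that the root can get at least 3, the other two min nodes are abandoned as soon as a value of 3 or less shows up in them.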


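Code Implementation of the Game Tree Negamax Alpha-Beta Pruning Algorithm

## Initialize the Inputs and Determine the Outputs

  First, pin down the algorithm's inputs and outputs. This is really a process of trial and error, and they may not be settled until the whole project is nearly finished. I have already finished mine, so I am simply posting the result.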

def ai(player, ratio, length, board, depth):
    """
    AI entry point.
    :param player: the current player; used to initialize the stone lists of the AI and the human.
    :param ratio: attack coefficient.
    :param length: side length of the board.
    :param board: board values. 1 is a white stone, -1 is a black stone, 0 is empty.
    :param depth: search depth.
    :return: move x, move y, number of searches, whether the AI has won.
    """
    # Declare the global variables
    global list_ai, list_human, list_ai_add_human, list_all, x, y, search_count, DEP, LEN, RAT
    # Initialize
    DEP, LEN, RAT = depth, length, ratio
    list_ai, list_human, list_ai_add_human, list_all = init_list(player, board)
    # Initial values of alpha and beta
    alpha = -99999999
    beta = 99999999
    # Backtracking search
    backtrack(True, depth, alpha, beta)
    list_ai.append((x, y))
    return x, y, search_count, is_win(list_ai)

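  The function is named ai, meaning the computer player, and takes the following parameters:

  player: the current player, used to initialize the stone lists of the AI and the human.

Only the board values are passed in, and they contain both 1 and -1, so which one should the computer treat as its own stones? That is what the player parameter is for: if player == 'black', the computer plays black, so it treats -1 as its own stones and 1 as the opponent's.

  ratio: the attack coefficient, used to adjust how the AI balances attack and defense.

The attack coefficient is the weight mentioned above.

  length: the side length of the board.
  board: a two-dimensional array holding the board state, where 1 is a white stone, -1 is a black stone, and 0 is empty.
  depth: the search depth, which determines how many layers the AI backtracks through when making a decision.

The search depth is how many moves ahead the AI can "see". On my computer, looking 3 moves ahead already takes around 40 seconds, which is really not good enough.

  The function returns the move position (x, y), the number of searches, and whether the AI has won.

  Inside the function, the global keyword declares several global variables: list_ai, list_human, list_ai_add_human, list_all, x, y, search_count, DEP, LEN, and RAT. They hold, respectively, the AI's stones, the opponent's stones, the AI's stones plus the opponent's stones, the coordinates of every position on the board, the computed x coordinate, the computed y coordinate, the number of searches (mainly for debugging), the search depth, the board side length, and the attack coefficient.

These are global variables so that they do not have to be passed around; otherwise every function would need a long parameter list, which is a hassle.

  Next, the arguments are stored in the corresponding globals: depth is assigned to DEP, length to LEN, and ratio to RAT. Then init_list is called with player and board to initialize the lists list_ai, list_human, list_ai_add_human, and list_all.

  The initial alpha and beta values are set for alpha-beta pruning.

  backtrack is then called to run the backtracking search. The argument True means it is currently the AI's turn, the search depth is depth, and alpha and beta drive the alpha-beta pruning. When the search finishes, the resulting move (x, y) is appended to list_ai, and the function returns x, y, search_count (the number of searches), and the result of calling is_win to check whether list_ai contains a winning five in a row.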
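  The body of init_list is not shown in this chapter. The following is only a guess at what such a helper might look like, based on the description above; the name init_list_sketch and the choice of (row index, column index) as the coordinate order are assumptions.

```python
def init_list_sketch(player, board):
    """Hypothetical stand-in for init_list: split the board into coordinate lists."""
    # If the computer plays black, its own stones are the -1 cells; otherwise the 1 cells
    ai_value = -1 if player == 'black' else 1
    list_ai, list_human, list_all = [], [], []
    for i, row in enumerate(board):
        for j, cell in enumerate(row):
            list_all.append((i, j))
            if cell == ai_value:
                list_ai.append((i, j))
            elif cell == -ai_value:
                list_human.append((i, j))
    return list_ai, list_human, list_ai + list_human, list_all


board = [
    [-1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
ai_stones, human_stones, both, everything = init_list_sketch('black', board)
print(ai_stones, human_stones, len(everything))  # [(0, 0)] [(1, 1)] 9
```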

## Start Backtracking

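def backtrack(is_me, depth, alpha, beta):
    """
    Game tree search in negamax form with alpha-beta pruning.
    :param is_me: whether it is the AI's turn to move.
    :param depth: remaining search depth.
    :param alpha: lower bound of the search window for the side to move.
    :param beta: upper bound of the search window for the side to move.
    :return: score of the position from the point of view of the side to move.
    """
    global search_count, x, y

    # Termination: one side has already won, or the search depth is 0
    if is_win(list_ai) or is_win(list_human) or depth == 0:
        return evaluation(is_me)

    # Generate every candidate move in the current position
    blank_list = list(set(list_all).difference(set(list_ai_add_human)))
    # Heuristic search: order the candidates so that pruning happens earlier
    order(blank_list)

    # Try each candidate move
    for next_step in blank_list:

        search_count += 1

        # Skip positions with no neighboring stones to reduce computation
        if not has_neightnor(next_step):
            continue

        # Make the move
        if is_me:
            list_ai.append(next_step)
        else:
            list_human.append(next_step)

        list_ai_add_human.append(next_step)

        # Negamax: search the opponent's reply with the negated, swapped window
        value = -backtrack(not is_me, depth - 1, -beta, -alpha)

        # Undo the move
        if is_me:
            list_ai.remove(next_step)
        else:
            list_human.remove(next_step)

        list_ai_add_human.remove(next_step)

        if value > alpha:
            # At the root, remember the best move found so far
            if depth == DEP:
                x = next_step[0]
                y = next_step[1]
            # Alpha-beta cutoff
            if value >= beta:
                return beta
            alpha = value
    return alpha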

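  The code above is the function that searches for the best Gomoku move, using game tree minimax search (in negamax form) with alpha-beta pruning.

  The function is named backtrack and takes four parameters:

  is_me: a boolean indicating whether it is our turn or the opponent's.
  depth: the backtracking depth, that is, the number of layers of the search tree still to explore.
  alpha: the lower bound of the search window, the best value found so far for the side to move.
  beta: the upper bound of the search window, beyond which the opponent would avoid this branch.

  The function searches the game tree and returns an evaluation value.

  The function works as follows:

  1. Termination check: if one side has already won or the search depth is 0, return the evaluation of the current position, which indicates how good or bad it is.

  2. Generate all candidate moves in the current position: take the set difference between all board positions and the positions already occupied to get every position where a stone can be placed.

  3. Heuristic search: sort the candidate moves to make pruning more effective. Positions with no stones nearby are also skipped and never searched.

  4. Loop over each candidate move:

    Add the candidate to the stone list of the side to move.

    Call backtrack recursively to search one layer deeper, and negate the value it returns.

    Remove the candidate from the lists again to undo the move.

    Apply alpha-beta pruning. Because this is the negamax form, the code does not branch on is_me here; instead the recursive call receives the negated and swapped window (-beta, -alpha):

      If value is greater than alpha and depth equals the maximum search depth (DEP), record the current position in x and y.
      If value is greater than or equal to beta, prune and return beta immediately.
      Otherwise, if value is greater than alpha, update alpha to value.

  5. Return alpha as the evaluation of the current position.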
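  The helpers order and has_neightnor are called above but their bodies are not shown in this chapter. As a rough idea of what such heuristics can look like, here is a sketch with hypothetical names (has_neighbor_sketch, order_sketch) that take their inputs as parameters instead of reading globals; the radius of 1 and the "closest to the last move first" ordering are assumptions, not the project's actual rules.

```python
def has_neighbor_sketch(point, occupied, radius=1):
    """Return True if any occupied cell lies within `radius` steps of `point`."""
    px, py = point
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            if (dx, dy) != (0, 0) and (px + dx, py + dy) in occupied:
                return True
    return False


def order_sketch(blank_list, last_move):
    """Sort candidates in place so that cells nearest to `last_move` come first."""
    blank_list.sort(
        key=lambda p: max(abs(p[0] - last_move[0]), abs(p[1] - last_move[1]))
    )


occupied = {(7, 7)}
print(has_neighbor_sketch((7, 8), occupied))  # True
print(has_neighbor_sketch((0, 0), occupied))  # False

candidates = [(0, 0), (7, 8), (5, 5)]
order_sketch(candidates, (7, 7))
print(candidates)  # [(7, 8), (5, 5), (0, 0)]
```

  Searching promising moves first raises alpha sooner, so more of the later branches hit the cutoff.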

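## Win Detection

  The win detection function is used in two places: as a termination condition inside the backtracking, and after a move has been computed, to check whether that move wins the game.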

def is_win(L):
    """
    Check whether a side has won.
    :param L: the list of stones placed by the computer or by the player.
    :return: True or False.
    """
    def check_five_in_a_row(start_row, start_col, row_delta, col_delta):
        for i in range(5):
            if (start_row + i * row_delta, start_col + i * col_delta) not in L:
                return False
        return True

    for m in range(LEN):
        for n in range(LEN):
            # Check for five in a row horizontally
            if n < LEN - 4 and check_five_in_a_row(m, n, 0, 1):
                return True
            # Check for five in a row vertically
            if m < LEN - 4 and check_five_in_a_row(m, n, 1, 0):
                return True
            # Check the diagonal where row and column both increase
            if m < LEN - 4 and n < LEN - 4 and check_five_in_a_row(m, n, 1, 1):
                return True
            # Check the diagonal where the row increases and the column decreases
            if m < LEN - 4 and n > 3 and check_five_in_a_row(m, n, 1, -1):
                return True
    return False

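  The is_win function takes a list of placed stones L as its parameter and returns a boolean indicating whether that side has won.

  The function works as follows:

  1. It defines an inner function check_five_in_a_row, which checks whether there are five stones in a row starting from a given position in a given direction.

  · The function takes four parameters: the starting row start_row, the starting column start_col, the row increment row_delta, and the column increment col_delta.

  · It loops over five positions and checks whether each one is in the stone list L; if any position is missing, it returns False.

  · If all five positions are in the list, it returns True, meaning there are five in a row.

  2. Two nested loops visit every position on the board, and for each position the following checks are made:

   - Horizontal: if the current column is less than LEN-4 (a board boundary check) and check_five_in_a_row returns True, there is a horizontal five in a row, so return True.

   - Vertical: if the current row is less than LEN-4 and check_five_in_a_row returns True, there is a vertical five in a row, so return True.

   - Diagonal with row and column both increasing: if the current row is less than LEN-4, the column is less than LEN-4, and check_five_in_a_row returns True, there is a five in a row on that diagonal, so return True.

   - Diagonal with the row increasing and the column decreasing: if the current row is less than LEN-4, the column is greater than 3, and check_five_in_a_row returns True, there is a five in a row on that diagonal, so return True.

  3. If no check has returned True after every position has been visited, there is no five in a row, so return False.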
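  Since is_win runs at every node of the search, its cost matters. One possible variation, shown here only as a sketch, converts the stones to a set for fast membership tests and starts only from placed stones instead of scanning the whole board. The name is_win_sketch and the explicit length parameter (in place of the global LEN) are assumptions made for this example.

```python
def is_win_sketch(stones, length):
    """Return True if `stones` contains five in a row on a length x length board."""
    occupied = set(stones)  # set membership is much faster than list membership
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    for row, col in occupied:
        for d_row, d_col in directions:
            end_row, end_col = row + 4 * d_row, col + 4 * d_col
            # The fifth cell must still be on the board
            if not (0 <= end_row < length and 0 <= end_col < length):
                continue
            if all((row + i * d_row, col + i * d_col) in occupied for i in range(5)):
                return True
    return False


horizontal = [(7, 3), (7, 4), (7, 5), (7, 6), (7, 7)]
print(is_win_sketch(horizontal, 15))      # True
print(is_win_sketch(horizontal[:4], 15))  # False
```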

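## Evaluation: Calculating the Score

  Score calculation is split into two functions: evaluation computes the overall score, and cal_score computes the score of a single point in a single direction.

# Evaluation function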
def evaluation(is_me):
    if is_me:
        my_list = list_ai
        enemy_list = list_human
    else:
        my_list = list_human
        enemy_list = list_ai

    # Compute our own score
    score_all_arr = []  # positions of scoring shapes; intersecting shapes score double
    my_score = 0
    for pt in my_list:
        m = pt[0]
        n = pt[1]
        my_score += cal_score(m, n, 0, 1, enemy_list, my_list, score_all_arr)
        my_score += cal_score(m, n, 1, 0, enemy_list, my_list, score_all_arr)
        my_score += cal_score(m, n, 1, 1, enemy_list, my_list, score_all_arr)
        my_score += cal_score(m, n, -1, 1, enemy_list, my_list, score_all_arr)

    # Compute the opponent's score, to be subtracted
    score_all_arr_enemy = []
    enemy_score = 0
    for pt in enemy_list:
        m = pt[0]
        n = pt[1]
        enemy_score += cal_score(m, n, 0, 1, my_list, enemy_list, score_all_arr_enemy)
        enemy_score += cal_score(m, n, 1, 0, my_list, enemy_list, score_all_arr_enemy)
        enemy_score += cal_score(m, n, 1, 1, my_list, enemy_list, score_all_arr_enemy)
        enemy_score += cal_score(m, n, -1, 1, my_list, enemy_list, score_all_arr_enemy)

    total_score = my_score - enemy_score * RAT * 0.1

    return total_score

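  The code above is the evaluation function evaluation, which scores the current position. Depending on whether it is our turn or the opponent's, it computes the score from the corresponding point of view.

  The function works as follows:

  1. Based on the is_me parameter, decide whose turn it is and assign the corresponding stone lists to my_list and enemy_list.

  2. Initialize variables:

  score_all_arr: stores the positions of scoring shapes, so that intersecting shapes can score double.
  my_score: our own score, initially 0.

  3. Loop over our own stone list my_list, and for each position (m, n):

  Call cal_score for the horizontal direction, the vertical direction, and the two diagonals, and add the results to my_score.
  During the calculation, enemy_list and my_list are passed in so that the opponent's stones can be detected, and the positions of scoring shapes are stored in score_all_arr.

  4. Initialize variables:

  score_all_arr_enemy: stores the positions of the opponent's scoring shapes.
  enemy_score: the opponent's score, initially 0.

  5. Loop over the opponent's stone list enemy_list, and for each position (m, n):

  Call cal_score for the opponent in the horizontal direction, the vertical direction, and the two diagonals, and add the results to enemy_score.
  During the calculation, my_list and enemy_list are passed in so that our own stones can be detected, and the positions of scoring shapes are stored in score_all_arr_enemy.

  6. Compute the total score:

  Subtract from our score my_score the opponent's score enemy_score multiplied by the weight factor RAT and by 0.1, giving total_score.

  7. Return total_score as the evaluation of the current position.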
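  The effect of the weight in step 6 can be seen in isolation. The sketch below copies only the final formula into a hypothetical helper, weighted_total; the sample scores are made up for illustration.

```python
def weighted_total(my_score, enemy_score, rat):
    """Same formula as the last step of evaluation: my_score - enemy_score * RAT * 0.1."""
    return my_score - enemy_score * rat * 0.1


# With rat = 10 the opponent's score counts about as much as our own;
# a smaller rat discounts the opponent's threats, a larger rat emphasizes them.
for rat in (5, 10, 20):
    print(rat, weighted_total(5000, 500, rat))
```

  Since 0.1 is not exactly representable in floating point, results may differ from the exact decimal value by a tiny rounding error.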

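# Score calculation in a single direction
def cal_score(m, n, x_decrict, y_derice, enemy_list, my_list, score_all_arr):
    add_score = 0  # bonus score
    # In one direction, keep only the highest-scoring shape
    max_score_shape = (0, None)

    # If this point already belongs to a scoring shape in this direction, do not count it again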
    for item in score_all_arr:
        for pt in item[1]:
            if m == pt[0] and n == pt[1] and x_decrict == item[2][0] and y_derice == item[2][1]:
                return 0

    # Slide a window along this direction through the point, looking for scoring shapes
    for offset in range(-5, 1):
        pos = []
        for i in range(0, 6):
            if (m + (i + offset) * x_decrict, n + (i + offset) * y_derice) in enemy_list:
                pos.append(2)
            elif (m + (i + offset) * x_decrict, n + (i + offset) * y_derice) in my_list:
                pos.append(1)
            else:
                pos.append(0)
        tmp_shap5 = (pos[0], pos[1], pos[2], pos[3], pos[4])
        tmp_shap6 = (pos[0], pos[1], pos[2], pos[3], pos[4], pos[5])

        for (score, shape) in shape_score:
            # A scoring shape is matched
            if tmp_shap5 == shape or tmp_shap6 == shape:
                if score > max_score_shape[0]:
                    max_score_shape = (score, ((m + (0+offset) * x_decrict, n + (0+offset) * y_derice),
                                               (m + (1+offset) * x_decrict, n + (1+offset) * y_derice),
                                               (m + (2+offset) * x_decrict, n + (2+offset) * y_derice),
                                               (m + (3+offset) * x_decrict, n + (3+offset) * y_derice),
                                               (m + (4+offset) * x_decrict, n + (4+offset) * y_derice)), (x_decrict, y_derice))

    # Intersecting shapes (e.g. two open threes that cross) earn a bonus; single-stone shapes are excluded
    if max_score_shape[1] is not None:
        for item in score_all_arr:
            for pt1 in item[1]:
                for pt2 in max_score_shape[1]:
                    if pt1 == pt2 and max_score_shape[0] > 10 and item[0] > 10:
                        add_score += item[0] + max_score_shape[0]

        score_all_arr.append(max_score_shape)

    return add_score + max_score_shape[0]

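  The code above is the function cal_score, which computes the score of a given position in a given direction.

  The function works as follows:

  1. Initialize variables:
   - add_score: the bonus score, recording any extra points earned in the current direction.
   - max_score_shape: the highest-scoring shape, initially (0, None), recording the best shape found in the current direction and its score.

  2. Check whether this point already belongs to a scoring shape in the current direction; if so, do not count it again and return 0 immediately.

  3. Slide a window along the direction through the point, looking for possible scoring shapes. The window offset runs from -5 to 0, giving 6 windows.

  4. For each cell in a window, determine its state from the stone lists:
   - If it is in the opponent's stone list, the state is 2.
   - If it is in our own stone list, the state is 1.
   - If it is empty, the state is 0.

  5. From the states of the cells in the window, form a shape of length 5 and a shape of length 6.

  6. Loop over the predefined scoring shapes and their scores:
   - If the current shape equals a predefined shape, a scoring shape is matched.
   - If its score is higher than the one recorded in max_score_shape, update max_score_shape to the current shape.

  7. Check whether two shapes intersect:
   - If max_score_shape[1] is not None, a highest-scoring shape exists.
   - Loop over the shapes already recorded and check whether any of them shares a point with the highest-scoring shape:
   - If they intersect and both scores are greater than 10, add the sum of the two shapes' scores to add_score.

  8. Record the highest-scoring shape in score_all_arr.

  9. Return the sum of the bonus add_score and the score of the highest-scoring shape as the total score in the current direction.

  In short, the function slides a window through the point in a specific direction, checks whether the cells form one of the predefined scoring shapes, and computes the score for that direction. If scoring shapes intersect, a bonus is added. Finally, it returns the total score for the current direction.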
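  The window construction in steps 3 to 5 is the least obvious part, so here it is extracted into a standalone sketch. The generator name windows and the sample stones are hypothetical; the offsets and the 0/1/2 encoding follow cal_score.

```python
def windows(m, n, dx, dy, my_list, enemy_list):
    """Yield (offset, cells) for the six 6-cell windows along (dx, dy) through (m, n)."""
    for offset in range(-5, 1):
        cells = []
        for i in range(6):
            point = (m + (i + offset) * dx, n + (i + offset) * dy)
            if point in enemy_list:
                cells.append(2)   # opponent's stone
            elif point in my_list:
                cells.append(1)   # our own stone
            else:
                cells.append(0)   # empty (off-board cells also end up here)
        yield offset, tuple(cells)


my_stones = [(7, 5), (7, 6), (7, 7)]
enemy_stones = [(7, 8)]
# Horizontal direction (0, 1) through the middle stone (7, 6)
for offset, cells in windows(7, 6, 0, 1, my_stones, enemy_stones):
    print(offset, cells)
```

  In cal_score, the first five cells of each window are compared with the 5-cell shapes and all six cells with the 6-cell shapes.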


Summary

  ai calls backtrack.

  backtrack has termination conditions: a win detected by is_win, or a depth of 0.

  If a termination condition is met, evaluation is called to compute the total score of the position, and evaluation in turn calls cal_score to compute the score of a point in a given direction.

  If no termination condition is met, some heuristics are applied (move ordering and pruning), and then the function recurses.

  Finally, four values are returned: the x and y coordinates, the number of searches, and whether the AI has won.


Continue with the next chapter of this hands-on series!

  【Gomoku in Practice】Chapter 3: Packaging the Algorithm as a Third-Party Interface


Origin blog.csdn.net/qq_43592352/article/details/131340201