Artificial Intelligence Experiment - Adversarial Search Game (Pac-Man)

The experiment consists of three parts: improving the Reflex agent, designing the Minimax agent, and designing the Alpha-Beta agent. The following are my experimental steps and process. (The code modified or added in the experiment is located in multiAgents.py in the archive.) This post mainly covers the theory and does not include the implementation code; if you need the code, you can download it from my personal homepage or send me a private message.

Cataloguing the program interfaces

Since this experiment modifies and extends an existing program, it is important to catalogue the necessary interfaces first. Several important ones are listed below, and their detailed descriptions are shown in the figure. Here gameState denotes an instance of the GameState class in the program.

  1. gameState.getLegalActions(index)
  2. gameState.getPacmanPosition()
  3. gameState.getFood()
  4. gameState.getGhostPositions()
  5. ghostState.scaredTimer
  6. gameState.getCapsules()
  7. gameState.getWalls()
  8. gameState.generateSuccessor(agentIndex, action)

  9. gameState.getNumAgents()

The detailed descriptions of these nine interfaces are shown in the figure below.

(figure: detailed descriptions of the nine interfaces)
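
As a quick illustration, here is a minimal sketch (my own, not from the original program) of what these interfaces return, assuming the standard Berkeley Pac-Man framework; inspectState is a hypothetical helper name.

    def inspectState(gameState):
        pos      = gameState.getPacmanPosition()  # Pac-Man's (x, y) position
        food     = gameState.getFood()            # boolean Grid; food[x][y] is True if food is there
        ghosts   = gameState.getGhostPositions()  # list of ghost (x, y) positions
        capsules = gameState.getCapsules()        # list of capsule (x, y) positions
        walls    = gameState.getWalls()           # boolean Grid marking walls
        n        = gameState.getNumAgents()       # Pac-Man plus the number of ghosts
        # ghostState.scaredTimer lives on each ghost's AgentState:
        scared   = [g.scaredTimer for g in gameState.getGhostStates()]
        for action in gameState.getLegalActions(0):             # index 0 is Pac-Man
            successor = gameState.generateSuccessor(0, action)  # state after taking `action`
        return pos, food, ghosts, capsules, walls, n, scared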

ReflexAgent modification design

I modified ReflexAgent using a combination of BFS and a greedy algorithm. The specific ideas are as follows.

  • First, create a BFS search function that takes Pac-Man's position as its parameter. From that position, obtain the coordinates reachable by legal moves and enqueue them in a queue. Then loop, exiting when the coordinate at the front of the queue contains food; if the dequeued position has no food, obtain the coordinates of the legal moves from that position, enqueue them, and continue with the next iteration. The flow chart is shown in the figure.

(figure: BFS search flow chart)

  • A greedy rule guides the direction of travel: using the food coordinate found by the BFS above, compute the distance from the position reached by each legal move to that coordinate, and use that distance as the score (the shorter, the better); see the sketch below.
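
    Below is a minimal sketch of this BFS-plus-greedy idea. It assumes the standard Berkeley Pac-Man Grid objects; bfsNearestFood and foodScore are my own illustrative names, not the original code.

    from collections import deque

    def bfsNearestFood(start, food, walls):
        # Breadth-first search from `start`; returns the step count to the
        # nearest cell containing food, or None if no food is reachable.
        queue = deque([(start, 0)])
        visited = {start}
        while queue:
            (x, y), dist = queue.popleft()
            if food[x][y]:
                return dist
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nxt = (x + dx, y + dy)
                if not walls[nxt[0]][nxt[1]] and nxt not in visited:
                    visited.add(nxt)
                    queue.append((nxt, dist + 1))
        return None

    def foodScore(newPos, gameState):
        # Greedy guidance: the shorter the BFS distance from the position
        # reached by a candidate action, the higher the score.
        d = bfsNearestFood(newPos, gameState.getFood(), gameState.getWalls())
        return -d if d is not None else 0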

  • Besides this guidance, the distance to the ghosts is also considered. Here the Manhattan distance between Pac-Man and each ghost is used for avoidance: when the distance to a ghost is less than or equal to 1, a score of negative infinity is returned so that evasive behavior is triggered in time.

    # ghostP: the list of ghost positions; newPos: Pac-Man's position after the action
    for i in ghostP:
        if abs(newPos[0] - i[0]) + abs(newPos[1] - i[1]) <= 1:
            return -999999999  # stands in for negative infinity
    
  • While testing ReflexAgent, it was found that in some complex layouts the Stop action can lead to a stalemate with a ghost, so the score of the Stop action is set to be only just above the score for meeting a ghost, as sketched below.
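
    One possible encoding (my own sketch; the constant is an assumption chosen to sit just above the ghost penalty of -999999999):

    if action == "Stop":
        return -99999999  # bad, but still better than running into a ghost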

  • Besides the regular guidance, ReflexAgent is also designed so that when Pac-Man is near a capsule (Manhattan distance of 3 or less) he eats the capsule first to get a higher score.

    # capsules: the list of capsule positions
    for i in capsules:
        if abs(newPos[0] - i[0]) + abs(newPos[1] - i[1]) <= 3:
            if action == "Stop":
                return -5000
            # the closer to the capsule, the higher the score (up to 5)
            return 5 - (abs(newPos[0] - i[0]) + abs(newPos[1] - i[1]))
    
  • ReflexAgent test results are shown in the figure:

    Using python pacman.py -p ReflexAgent -l openClassic -n 10 -q for 10 consecutive runs, it won all 10 games with an average score of 1305.4.

    (figure: ReflexAgent test output)

Minimax Agent Design

There were many pitfalls in designing the Minimax agent. At first I followed the courseware exactly and designed a node class and an operation class; once everything was built, Minimax ran successfully with a win rate of 50-60%. However, while designing the Alpha-Beta algorithm I found that my Minimax code was actually searching as Pac-Man at every layer, and the code was too complex to modify easily. So in the end I redesigned it with reference to other people's code and the pseudocode in the PPT.
The referenced articles are:

[1]https://blog.csdn.net/Pericles_HAT/article/details/116901139
[2]https://blog.csdn.net/m0_48134027/article/details/120340931

  • Max_value function design
    First, design a max_value() function to evaluate the max layers. It recurses mutually with the min_value function, and the recursion terminates when the game ends or the given depth is reached. Before the cut-off is reached, the process is a top-down search; the backtracking afterwards is a bottom-up scoring. The search is somewhat like DFS, continuing until the limit is reached. When scoring, the evaluation function is applied first, and on the way back up values are compared according to whether the layer is min or max. The figure below is the flow chart of the max_value function.

    (figure: max_value flow chart)

  • Min_value function design
    The min_value function is very similar to max_value; the main difference is the initialization of v, which is positive infinity in min. Also, the min function does not need to remove the Stop action. The most notable difference is that the min function must check whether the last ghost has been reached, in order to decide whether to call min_value again (for the next ghost) or max_value (for Pac-Man); and its comparison takes the minimum. The figure below is the flow chart of the min function, followed by a combined sketch of both functions.

    (figure: min_value flow chart)
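
    The following is a minimal sketch of the mutual recursion, assuming the Berkeley framework's MultiAgentSearchAgent (which provides self.depth and self.evaluationFunction); the method bodies are my own reconstruction from the description above, not the original code.

    def max_value(self, state, depth):
        # Pac-Man's layer (agent index 0): terminal/depth test, then maximize.
        if state.isWin() or state.isLose() or depth == 0:
            return self.evaluationFunction(state)
        v = float("-inf")
        for action in state.getLegalActions(0):  # the post also drops Stop here
            v = max(v, self.min_value(state.generateSuccessor(0, action), depth, 1))
        return v

    def min_value(self, state, depth, ghost):
        # Ghost layer: minimize; the last ghost hands control back to Pac-Man.
        if state.isWin() or state.isLose():
            return self.evaluationFunction(state)
        v = float("inf")
        last = state.getNumAgents() - 1
        for action in state.getLegalActions(ghost):
            succ = state.generateSuccessor(ghost, action)
            if ghost == last:
                v = min(v, self.max_value(succ, depth - 1))
            else:
                v = min(v, self.min_value(succ, depth, ghost + 1))
        return v

    def getAction(self, gameState):
        # Enter at the max layer; the two functions then recurse into each other.
        return max(gameState.getLegalActions(0),
                   key=lambda a: self.min_value(gameState.generateSuccessor(0, a), self.depth, 1))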

  • Finally, with getAction entering at the max layer first (as in the sketch above), the two functions recurse into each other to produce the result. The following is the test result:
    python pacman.py -p MinimaxAgent -l minimaxClassic -a depth=4 -q -n 100
    With a depth of 4, the winning rate over 100 runs is 62%.

    (figure: MinimaxAgent test output)

Alpha-Beta agent design

The Alpha-Beta agent I designed adds pruning on top of Minimax, and its code is very close to the minimax code except that two values, alpha and beta, are carried along for pruning. In the max function, each action considered checks whether v is greater than beta; if so, v is returned immediately and the remaining actions are skipped. Besides this cut-off, each iteration also checks whether alpha needs updating, taking the larger of v and the current alpha as the new alpha. The min function is symmetric: swap the roles of alpha and beta, and flip greater-than and less-than. The two figures below are the alpha-beta pruning flow charts for max and min respectively.

(figures: alpha-beta pruning flow charts for max and min)
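
Below is a minimal sketch of how the pruning changes the two functions, under the same assumptions and naming as the Minimax sketch above (my own reconstruction, not the original code):

    def max_value(self, state, depth, alpha, beta):
        if state.isWin() or state.isLose() or depth == 0:
            return self.evaluationFunction(state)
        v = float("-inf")
        for action in state.getLegalActions(0):
            v = max(v, self.min_value(state.generateSuccessor(0, action), depth, 1, alpha, beta))
            if v > beta:           # beta cut-off: the min layer above will never pick this branch
                return v
            alpha = max(alpha, v)  # tighten the lower bound
        return v

    def min_value(self, state, depth, ghost, alpha, beta):
        if state.isWin() or state.isLose():
            return self.evaluationFunction(state)
        v = float("inf")
        last = state.getNumAgents() - 1
        for action in state.getLegalActions(ghost):
            succ = state.generateSuccessor(ghost, action)
            if ghost == last:
                v = min(v, self.max_value(succ, depth - 1, alpha, beta))
            else:
                v = min(v, self.min_value(succ, depth, ghost + 1, alpha, beta))
            if v < alpha:          # alpha cut-off, mirror of the max case
                return v
            beta = min(beta, v)    # tighten the upper bound
        return v

    # The root call would start with alpha = float("-inf") and beta = float("inf").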

After running python pacman.py -p AlphaBetaAgent -a depth=3 -l smallClassic, the speed improvement is clearly noticeable, although it is still difficult to win.


Reprinted from: blog.csdn.net/weixin_51735061/article/details/125743795