Generative Agents: Interactive Simulacra of Human Behavior

Introduction

The paper was published jointly by researchers from Stanford University and Google (including DeepMind). It introduces a method that uses a large language model (LLM) to drive the behavior of agents, giving them the ability to remember, reflect, and plan.
Built around the LLM, the architecture stores an agent's past experience in natural language, synthesizes those memories into higher-order reflections, and uses both reflections and memories to plan actions.
For example, in the experiment the user specified only that one agent wanted to throw a Valentine's Day party. The agents then spread the word on their own, made new acquaintances, invited one another to the party, and showed up at the right time.
The paper introduces a new architecture and interaction model that enables believable simulation of human behavior.
Innovation: The agent's historical behavior is recorded as plain text and combined with the prompting ability of the large model, so that agents can reflect and plan.

Intelligent agent interaction architecture

  1. A generative agent (GA) takes its current environment and past experience as input and produces behavior as output.

Memory and retrieval module

Challenge: A generative agent accumulates far more historical behavior than fits into an LLM prompt, so passing all of it to the model for reasoning is impractical.

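Method: The architecture splits this work between a memory stream and a memory retrieval module. The memory stream records the agent's experience as readable text, where each entry holds a natural-language description and a timestamp. The retrieval module then selects only a subset of these records to pass to the LLM.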
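
The retrieval step can be sketched as follows. The paper ranks memories by recency, importance, and relevance. The sketch follows that idea, but the field names, the decay rate, the equal weighting, and the use of the creation time for recency are simplifying assumptions, and the embeddings are assumed to come from some external embedding model.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Memory:
    text: str                # natural-language description of the event
    created_hour: float      # timestamp, in simulated hours (assumed unit)
    importance: float        # 1 (mundane) to 10 (significant); assumed given
    embedding: List[float]   # assumed to come from an embedding model


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def min_max(values):
    """Rescale values to [0, 1]; all zeros if every value is the same."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def retrieve(memories, query_embedding, now_hour, k=2, decay=0.995):
    """Return the k memories with the highest combined score."""
    if not memories:
        return []
    recency = min_max([decay ** (now_hour - m.created_hour) for m in memories])
    importance = min_max([m.importance for m in memories])
    relevance = min_max([cosine(m.embedding, query_embedding) for m in memories])
    # Equal weights for the three components (an assumption of this sketch).
    scored = sorted(
        zip(memories, recency, importance, relevance),
        key=lambda item: item[1] + item[2] + item[3],
        reverse=True,
    )
    return [memory for memory, *_ in scored[:k]]
```

Only the `k` selected memories would then be placed in the prompt, which keeps the prompt size bounded no matter how long the memory stream grows.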

Reflection module

Challenge: With only raw observations in memory, an agent struggles to reason and generalize. For example, suppose Klaus is asked whom he would choose to spend an hour with. Using raw memory alone, he picks Wolfgang, with whom he has had the most contact, rather than Maria, who shares his interests and research direction.
Method: Reflection gives the agent higher-order memories that generalize from its observations, which lets the agent choose Maria, whose interests are closer to its own. Reflections are generated periodically and automatically by the agent with the help of the LLM.
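
A minimal sketch of periodic reflection, under stated assumptions: the trigger (summed importance of recent memories crossing a threshold), the threshold value, the prompt wording, and the `llm` callable are all illustrative stand-ins, and the process is collapsed into a single prompt for brevity.

```python
from typing import Callable, List, Tuple


def maybe_reflect(
    recent: List[Tuple[str, float]],   # (description, importance) pairs
    llm: Callable[[str], str],         # hypothetical text-in, text-out model call
    threshold: float = 150.0,          # illustrative value, not a fixed constant
) -> List[str]:
    """Return new reflections, or [] if too little has happened to reflect on."""
    if sum(score for _, score in recent) < threshold:
        return []
    statements = "\n".join(
        f"{i + 1}. {text}" for i, (text, _) in enumerate(recent)
    )
    prompt = (
        "Statements:\n"
        + statements
        + "\nWhat high-level insights can you infer from the statements above?"
    )
    # One insight per non-empty line of the model's reply (assumed format).
    return [line.strip() for line in llm(prompt).splitlines() if line.strip()]
```

The returned reflections would be written back into the memory stream like any other memory, so later retrieval can surface "Klaus and Maria share research interests" alongside raw observations.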

Planning and action

Challenge: Asking the LLM for each action on its own produces behavior that looks reasonable in the moment but is incoherent over time, for example eating lunch several times in a row.
Method: Use the language model to generate plans that are stored in memory, working top-down and recursively to add detail.
First, the agent's summary description and its behavior over the past few days are used to generate an initial plan that sketches the day in broad strokes. The plan is saved to memory and then refined recursively: first into hour-level steps, then into minute-level steps.
To act and update its plan, the agent takes an action, observes its surroundings, and records what it observes in the memory stream.
If the action is an interaction between agents, a dialogue is generated.
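
The top-down refinement described above can be sketched as a recursive decomposition. This is only an illustration: `PlanStep`, `refine`, and `even_split` are hypothetical names, and `even_split` is a stand-in for the LLM call that would actually write the finer-grained steps.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PlanStep:
    description: str
    start_min: int        # minutes since midnight
    duration_min: int
    substeps: List["PlanStep"] = field(default_factory=list)


def refine(
    step: PlanStep,
    chunk_sizes: List[int],
    decompose: Callable[[PlanStep, int], List[PlanStep]],
) -> PlanStep:
    """Split a step into coarse chunks, then split each chunk more finely."""
    # Skip chunk sizes that would not actually subdivide this step.
    sizes = [c for c in chunk_sizes if c < step.duration_min]
    if not sizes:
        return step
    step.substeps = [
        refine(sub, sizes[1:], decompose) for sub in decompose(step, sizes[0])
    ]
    return step


def even_split(step: PlanStep, chunk_min: int) -> List[PlanStep]:
    """Stand-in for the LLM: cut a step into equal chunks with placeholder text."""
    chunks = []
    start = step.start_min
    end = step.start_min + step.duration_min
    part = 1
    while start < end:
        length = min(chunk_min, end - start)
        chunks.append(PlanStep(f"{step.description} (part {part})", start, length))
        start += length
        part += 1
    return chunks
```

For instance, `refine(PlanStep("work on research paper", 540, 180), [60, 15], even_split)` splits a three-hour block starting at 9:00 into hour-level steps, each of which holds 15-minute steps.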

Sandbox environment settings

The environment is represented as text: a tree structure captures the containment relationships between locations and objects, and the tree is converted into natural language so that the LLM can reason about the scene.
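
As a rough sketch of this idea, the environment tree can be held as nested dictionaries and flattened into sentences for the prompt. The place and object names and the sentence template below are illustrative assumptions.

```python
# Each key is a location or object; its value holds whatever it contains.
world = {
    "Smallville": {
        "Lin family's house": {
            "kitchen": {"stove": {}, "refrigerator": {}},
            "bedroom": {"bed": {}},
        },
        "Hobbs Cafe": {"counter": {}},
    }
}


def describe(tree):
    """Turn a containment tree into sentences, parents before children."""
    sentences = []
    for name, children in tree.items():
        if children:
            sentences.append(f"{name} contains: {', '.join(children)}.")
            sentences.extend(describe(children))
    return sentences


if __name__ == "__main__":
    # Print the text that would be handed to the LLM.
    print("\n".join(describe(world)))
```

Because the tree is plain data, each agent could also keep its own partial copy that covers only the places it has seen.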

Emergent communication ability

After one agent was told about a Valentine's Day party, it passed the information on to the agents it met, and those agents spread it further. In the end, some of the agents who heard about the party attended on time, which shows that the agents have some capacity for understanding information, remembering it, planning, and acting.

Summary

The paper introduces a method for building generative agents that can interact and communicate with their environment and with each other, and that show emergent social behavior.
The innovation is to record both the agent's environment and its behavior history entirely as text and structured text, and to combine these records with a large language model so that the agent can remember, reflect, and plan its actions.

Origin blog.csdn.net/WitsMakeMen/article/details/132986527