AutoGPT is unreliable? Microsoft launches an upgrade: an editable autonomous planning workflow

An original article from Xi Xiaoyao Technology Says | Author: iven

AutoGPT [1], which has taken the Internet by storm, has more than 100,000 stars on GitHub. This self-planning, self-executing agent was the first to focus on self-adjustment and optimization within the AI model itself.

However, many users have found AutoGPT's performance to be unstable, with infinite loops being the most common problem. Its execution speed is also very slow: in one user's test, a task that took New Bing 8 seconds took AutoGPT a full 8 minutes!

The way AutoGPT works requires it to call the API many times for a single task. By one estimate, a single task can cost more than 100 yuan, which is clearly too expensive for personal use.

A recent work from Microsoft Research proposes Low-code LLM, which lets users collaborate with the agent through simple visual operations such as clicking and dragging.


In this mode, GPT first generates a flowchart for the task, much like AutoGPT's self-planning and self-execution logic. The difference is that users can intuitively understand and modify the entire execution process, giving them effective control over how the AI operates.

It is called "low-code" because it adopts the idea of visual programming: users adjust the process simply by clicking and dragging. For complex tasks, this lets users steer the agent with their own ideas and preferences.

Low-code LLM generates the flowchart in a single conversation, so the API cost is essentially negligible, and producing the flowchart in one pass also avoids AutoGPT's infinite-loop problem, making the service more stable!

This work lives in the repository of Microsoft's TaskMatrix.AI [2], which has surpassed 30k stars; Visual ChatGPT [3] comes from the same team. TaskMatrix.AI shows how to connect foundation models with a large number of domain APIs to realize task automation (Visual ChatGPT being a classic example in the visual domain). The newly released Low-code LLM handles interaction with users, helping the AI better understand what users actually want to do.

Paper address:
https://arxiv.org/abs/2304.08103

Paper title:
"Low-code LLM: Visual Programming over LLMs."

Open source code:
https://github.com/microsoft/TaskMatrix/tree/main/LowCodeLLM

Demo:

Workflow

  1. A Planning LLM generates a structured flowchart for the complex task, somewhat like AutoGPT planning for itself from a user-given goal;

  2. Users modify the flowchart through predefined low-code visual operations (clicking, dragging, and text editing) to convey their preferences and opinions to the LLM;

  3. An Executing LLM follows the user-modified workflow and generates answers;

  4. Users can keep revising the flowchart in light of the current answer until they get a satisfactory result (a minimal code sketch of this loop follows below).
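
To make these four steps concrete, here is a minimal Python sketch of the loop. The function names and prompts below are illustrative assumptions, not the actual code in the TaskMatrix repository; the point is the structure: a single planning conversation, followed by a user-driven edit-and-execute cycle.

```python
# A minimal sketch of the Planning-LLM / Executing-LLM loop described above.
# `call_llm`, `plan`, `execute`, and `low_code_session` are illustrative names,
# and the prompts are assumptions -- this is not the code in the TaskMatrix repo.

def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around whatever chat-completion client you use."""
    raise NotImplementedError("plug in your own LLM client here")


def plan(task: str) -> str:
    """Planning LLM: turn a complex task into a structured, numbered flowchart."""
    return call_llm(
        "Break the following task into a numbered flowchart of steps, "
        "with optional jump conditions between steps.\n"
        f"Task: {task}"
    )


def execute(task: str, workflow: str) -> str:
    """Executing LLM: follow the (possibly user-edited) workflow and answer."""
    return call_llm(
        f"Complete the task below by strictly following this workflow:\n"
        f"{workflow}\nTask: {task}"
    )


def low_code_session(task: str, edit_workflow) -> str:
    """One planning conversation, then a user-driven edit/execute loop.

    `edit_workflow` stands in for the drag-and-drop UI: it receives the current
    flowchart and the latest answer, and returns an edited flowchart, or None
    once the user is satisfied with the answer.
    """
    workflow = plan(task)              # generated in a single conversation, no self-looping
    answer = None
    while True:
        edited = edit_workflow(workflow, answer)   # steps 2 and 4: user edits or accepts
        if edited is None:                         # user accepts the current answer
            return answer
        workflow = edited
        answer = execute(task, workflow)           # step 3: run the edited workflow
```

In the real system the `edit_workflow` callback would be the visual interface where the user clicks and drags; in this sketch it can be any function that returns the edited flowchart text.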


The six predefined types of low-code operations

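To give a feel for what such an editable flowchart could look like in code, here is a hypothetical data structure with a few edit operations of the kind the UI exposes (adding or removing a step, editing its text, reordering). The class and method names are assumptions for illustration and do not attempt to reproduce the exact six operations defined in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One node of the flowchart: a name, a description, and optional jump logic."""
    name: str
    description: str
    jumps: list[str] = field(default_factory=list)   # e.g. "if the draft is too long, go back to 'Outline'"


@dataclass
class Flowchart:
    steps: list[Step] = field(default_factory=list)

    # Edit operations of the kind the low-code UI exposes (illustrative subset):
    def add_step(self, index: int, step: Step) -> None:
        self.steps.insert(index, step)

    def remove_step(self, index: int) -> None:
        del self.steps[index]

    def edit_step(self, index: int, name: str | None = None,
                  description: str | None = None) -> None:
        if name is not None:
            self.steps[index].name = name
        if description is not None:
            self.steps[index].description = description

    def move_step(self, src: int, dst: int) -> None:
        self.steps.insert(dst, self.steps.pop(src))

    def to_prompt(self) -> str:
        """Serialize the flowchart as text for the Executing LLM to follow."""
        lines = []
        for i, s in enumerate(self.steps, 1):
            lines.append(f"Step {i}: {s.name} -- {s.description}")
            lines.extend(f"  Jump: {j}" for j in s.jumps)
        return "\n".join(lines)
```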

The advantages of this mode are as follows:

  1. More controllable results: users can directly understand and control the AI's execution logic, making the results easier to predict, more controllable, and better aligned with user needs;

  2. User-friendly interface: the execution process is visible at a glance, and click-and-drag interaction makes operation more convenient and efficient;

  3. Wide range of application scenarios: the method can be applied in many fields, especially those where the user's ideas and preferences are crucial; the paper presents four typical cases.

In addition, Low-code LLM can be extended with external APIs to enrich its application scenarios further, for example to communicate users' ideas and preferences more efficiently and to help users automate tasks; when connected with other tools, capabilities such as vision and speech can be integrated.
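
As a rough illustration of what hooking in an external API could look like (an assumption, not the repository's actual extension mechanism), a flowchart step could be routed to a speech or vision tool instead of the Executing LLM; `call_llm` refers to the earlier sketch, and the tool functions are hypothetical stubs.

```python
# Hypothetical tool registry: route certain flowchart steps to external APIs
# (e.g. speech or vision services) instead of the Executing LLM.
# Names here are illustrative assumptions, not the repository's actual API;
# call_llm is the stub from the earlier sketch.

def transcribe_audio(path: str) -> str:
    raise NotImplementedError("plug in your speech-to-text client here")


def describe_image(path: str) -> str:
    raise NotImplementedError("plug in your vision client here")


TOOLS = {
    "transcribe audio": transcribe_audio,
    "describe image": describe_image,
}


def run_step(step_name: str, arg: str) -> str:
    """Use a registered external tool for the step if one exists,
    otherwise fall back to the Executing LLM."""
    tool = TOOLS.get(step_name)
    if tool is not None:
        return tool(arg)
    return call_llm(f"Perform this step: {step_name}\nInput: {arg}")
```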

Both AutoGPT and Low-code LLM aim to get better results out of large AI models: the former focuses on self-optimization and learning within the model, while the latter focuses on collaboration and interaction between users and the model. The two approaches can complement each other across different scenarios and tasks.

The paper's acknowledgments also mention that part of the paper itself was written in collaboration with this mode. Close co-creation between people and large models no longer seems like a distant dream.


References

[1] AutoGPT: https://github.com/Significant-Gravitas/Auto-GPT
[2] TaskMatrix.ai: https://arxiv.org/abs/2303.16434
[3] Visual ChatGPT: https://arxiv.org/abs/2303.04671

Source: blog.csdn.net/xixiaoyaoww/article/details/130418013