Xi Xiaoyao's Science and Technology (original) | Author: iven
AutoGPT [1], which has gone viral across the Internet, has more than 100,000 stars on GitHub. This self-planning, self-executing agent was among the first to focus on self-adjustment and optimization inside an AI model.
However, many users have found AutoGPT's performance unstable, with endless loops being the most common failure. Execution is also very slow: in one user's test, a task that New Bing finished in 8 seconds took AutoGPT a full 8 minutes!
Because of the way AutoGPT works, a single task triggers many API calls. By one estimate, a single task can cost more than 100 yuan, which is clearly too expensive for personal use.
A recent work from Microsoft Research proposes Low-code LLM, which lets users collaborate with an agent through simple visual operations such as clicking and dragging.
In this mode, GPT first generates a task flowchart, which resembles AutoGPT's self-planning and self-execution logic. The difference is that users can intuitively and easily understand and modify the entire execution process, thereby effectively controlling how the AI operates.
It is called "low-code" because it adopts the idea of visual programming: users can adjust the process simply by clicking and dragging. For complex tasks, this lets users steer the agent with their own ideas and preferences.
The flowchart generated by Low-code LLM is completed in a single conversation, so the API cost is essentially negligible. Generating the flowchart in one pass also avoids AutoGPT's endless-loop problem, making the service more stable!
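The single-pass planning idea can be illustrated with a small sketch. This is not the paper's implementation: `call_llm` is a hypothetical stand-in for any chat-completion API, and the returned plan text is made up for illustration.

```python
# Hypothetical sketch: generate the whole task flowchart in ONE LLM call,
# then execute it step by step without any further planning calls.

def call_llm(prompt: str) -> str:
    """Placeholder for a single chat-completion API request."""
    # A real system would hit an LLM endpoint here; this canned reply
    # stands in for the model's numbered flowchart.
    return "1. Research topic\n2. Draft outline\n3. Write sections\n4. Review"

def plan_once(task: str) -> list[str]:
    """One planning call returns the entire flowchart as an ordered step list."""
    raw = call_llm(f"Break this task into a numbered step-by-step flowchart:\n{task}")
    return [line.split(". ", 1)[1] for line in raw.splitlines() if ". " in line]

# Because the plan is fixed up front, execution is a bounded loop:
# there is no re-planning call per step, hence no endless-loop risk.
for step in plan_once("Write a blog post about Low-code LLM"):
    print("executing:", step)
```

The contrast with AutoGPT is that the loop bound is known before execution starts, so cost and termination are predictable.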
The author found that this work lives in the repo of Microsoft's TaskMatrix.ai [2], which has passed 30k stars; Visual ChatGPT [3] is from the same team. TaskMatrix.AI shows how to connect foundation models with a large number of APIs across domains to realize task automation (Visual ChatGPT is a classic example in the vision domain). The newly launched Low-code LLM plays the role of interacting with users, helping the AI better understand what users actually want to do.
Paper address:
https://arxiv.org/abs/2304.08103
Paper title:
"Low-code LLM: Visual Programming over LLMs."
Open source code:
https://github.com/microsoft/TaskMatrix/tree/main/LowCodeLLM
Demo:
How it works
1. The Planning LLM generates a structured flowchart for the complex task, somewhat like AutoGPT planning from a user-given goal;
2. Users modify the flowchart through predefined low-code visual operations (clicking, dragging, and text editing), conveying their preferences and opinions to the LLM;
3. The Executing LLM follows the user-modified workflow and generates an answer;
4. Users can keep modifying the flowchart in light of the current answer until a satisfactory result is obtained.
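The steps above can be sketched as a simple pipeline. All function names here are hypothetical stand-ins for the paper's components, not its actual API, and the edit performed is an invented example.

```python
# Hypothetical sketch of the Low-code LLM interaction pipeline:
# Planning LLM -> user's visual edit -> Executing LLM.

def planning_llm(task: str) -> list[str]:
    """Stand-in for the Planning LLM: returns a flowchart as an ordered step list."""
    return ["Collect requirements", "Draft solution", "Summarize result"]

def user_edit(flowchart: list[str]) -> list[str]:
    """Stand-in for the visual editor: here the user drags in one extra step."""
    edited = list(flowchart)
    edited.insert(1, "Check constraints")  # e.g. a drag-and-drop insertion
    return edited

def executing_llm(flowchart: list[str]) -> str:
    """Stand-in for the Executing LLM: follows the user-approved flowchart."""
    return " -> ".join(flowchart)

def low_code_session(task: str) -> str:
    flowchart = planning_llm(task)    # step 1: generate the flowchart
    flowchart = user_edit(flowchart)  # step 2: user edits via low-code operations
    return executing_llm(flowchart)   # step 3: execute the edited flowchart
    # step 4 (not shown): the user inspects the answer and may edit again
```

The key design point is that the user's edits happen on the plan, before execution, rather than on the output after the fact.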
Six predefined types of low-code operations
The advantages of this mode are as follows:
More controllable results: users can directly understand and control the AI's execution logic, making results easier to predict and control, and better aligned with user needs;
User-friendly interaction: users can see the execution process intuitively, and clicking and dragging make operation more convenient, improving efficiency;
Wide range of application scenarios: the approach applies in many fields, especially scenarios where users' ideas and preferences are crucial; the paper presents four typical cases.
In addition, Low-code LLM can be extended with external APIs to further enrich its applications: it can efficiently communicate users' ideas and preferences and help automate tasks, and when connected to other tools it can integrate capabilities such as vision and speech.
Both AutoGPT and Low-code LLM aim to improve the performance of AI models: the former focuses on self-optimization and learning within the model, while the latter focuses on collaboration and interaction between users and the model. The two approaches can complement each other, achieving better results across different scenarios and tasks.
The acknowledgments of the paper mention that part of it was written collaboratively in this very mode. It seems that a future in which people and large models cooperate closely to create together is no longer a dream.
References
[1] AutoGPT: https://github.com/Significant-Gravitas/Auto-GPT
[2] TaskMatrix.ai: https://arxiv.org/abs/2303.16434
[3] Visual ChatGPT: https://arxiv.org/abs/2303.04671