OpenAI Assistants-API Concise Tutorial

  At OpenAI's developer conference on November 6, OpenAI announced not only new models such as GPT-4V and GPT-4 Turbo but also the Assistants-API, which developers can use to build their own AI assistants. The Assistants-API currently offers three types of tools. The first is Code Interpreter, which became popular when it was released to ChatGPT Plus subscribers. The second is Retrieval, which lets you plug your own knowledge base into an assistant. The third is function calling, which I won't go into here. The Assistants-API is still in beta, but judging from OpenAI's plans, it should support DALL·E 3, GPT-4V, and even plugins in the future, which is something to look forward to.

  What is the difference between the Assistants-API and the chat API? First, the chat API only exposes the model's chat capability, and if you have used it you know that you have to maintain the conversation history yourself, which is inconvenient. Besides chat, the Assistants-API can also call Code Interpreter and external functions (function calling), and it can plug in your own knowledge base (Retrieval). Most importantly, you no longer need to maintain the conversation history and can focus on the conversation itself. If the Assistants-API later supports plugins, DALL·E 3, and GPT-4V, you can think of it as an API version of ChatGPT Plus whose capabilities are fully customizable. I believe that after reading this you will be eager to build your own AI assistant.

  Before we officially start development, let’s first understand some core objects of Assistants-API.

Object | Purpose
Assistant | An entity that answers using a specified model and a configured set of tools. If we compare it to a person, it is a specific individual with certain abilities.
Thread | The conversation context shared with the assistant. When you talk to a product's customer service, for example, the whole conversation can be regarded as one Thread.
Run | One invocation of the assistant on a Thread. The whole response process, including its status changes, counts as one Run. A Run can contain not only model responses but also function calls, Code Interpreter calls, file retrieval, and so on.
Run Step | The details of each step within a Run. Run Steps show how the assistant arrived at its result, which mainly helps with troubleshooting and tuning the assistant.

  With these concepts in hand, we can start implementing our own Assistant. To make the development process easier to follow, we will work through a concrete example: a financial assistant for a flower shop. Its main job is to tally cost and income from the flowers sold each day and then save the resulting totals to a database.

  I have prepared an Excel spreadsheet (flower_prices.xlsx) in advance that records the cost and selling price of every flower (the data is fictitious and does not reflect real prices).
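
  To make the task concrete, here is a minimal sketch of the arithmetic the assistant is expected to perform. The prices below are hypothetical placeholders, not the values in my spreadsheet, and the names PRICES and summarize_sales are made up for illustration. The sale of 3 red roses, 2 tulips and 6 lilies is the one used later in this tutorial.

```python
# Hypothetical price table: flower -> (unit cost, unit selling price).
# These numbers are invented for illustration only.
PRICES = {
    "red rose": (2.0, 5.0),
    "tulip": (3.0, 6.0),
    "lily": (4.0, 8.0),
}


def summarize_sales(sales, prices=PRICES):
    """Return (total_cost, total_income, profit) for a {flower: quantity} dict."""
    total_cost = sum(prices[name][0] * qty for name, qty in sales.items())
    total_income = sum(prices[name][1] * qty for name, qty in sales.items())
    return total_cost, total_income, total_income - total_cost


# 3 red roses, 2 tulips and 6 lilies with the placeholder prices above
print(summarize_sales({"red rose": 3, "tulip": 2, "lily": 6}))  # (36.0, 75.0, 39.0)
```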
  Now we will officially start the development and use of our flower shop financial assistant.

Create assistant

  First we need to upload the price table (flower_prices.csv in the code below) to OpenAI as a file that the Assistant can use:

from openai import OpenAI
client = OpenAI(base_url='https://thales.xindoo.xyz/openai/v1/')
# Upload the file to OpenAI for storage
file = client.files.create(
  file=open("flower_prices.csv", "rb"),
  purpose='assistants'
)

  Next we define the function that saves the billing information. For details, see my previous blog post, "OpenAI's multiple function calls".

# Define the function that saves the bill
def save_bill(totalCost, totalIncome):
    '''Save the total cost and the total income.'''
    print(totalCost, totalIncome)
    return "success"

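# Tool definition that describes save_bill to the model
function = {
    "type": "function",
    "function": {
        "name": "save_bill",
        # "Save the total cost and the total income"
        "description": "保存总成本和总的收入",
        "parameters": {
            "type": "object",
            "properties": {
                "totalCost": {
                    "type": "number",
                    "description": "总成本",  # "total cost"
                },
                "totalIncome": {
                    "type": "number",
                    "description": "总收入",  # "total income"
                }
            },
            "required": ["totalCost", "totalIncome"],
        },
    }
}
# Map function names to the local Python functions that implement them
available_functions = {"save_bill": save_bill}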

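  The function-calling flow hands the arguments back as a JSON-encoded string, so the local side has to decode them before calling the Python function. Below is a minimal offline sketch of that lookup-and-call step. Note that dispatch is a hypothetical helper name of my own, and save_bill and available_functions are repeated only so that the snippet runs by itself.

```python
import json


def save_bill(totalCost, totalIncome):
    """Stand-in for the function above: print the totals and report success."""
    print(totalCost, totalIncome)
    return "success"


available_functions = {"save_bill": save_bill}


def dispatch(name, arguments_json):
    """Look up a registered function by name and call it with JSON-encoded arguments."""
    func = available_functions.get(name)
    if func is None:
        # Return an error string instead of raising, so the caller can pass it on.
        return f"error: unknown function {name!r}"
    kwargs = json.loads(arguments_json)
    return str(func(**kwargs))


print(dispatch("save_bill", '{"totalCost": 36, "totalIncome": 75}'))
```

  Returning a string from dispatch is a deliberate choice in this sketch, since the value is later submitted back as text.
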
Create assistant

  Here we call the API and pass the enabled tools, the file, and the function definition to OpenAI to create an assistant of our own.

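# Create the assistant with code_interpreter, retrieval and the function all enabled
assistant = client.beta.assistants.create(
  name="花店财务助手",  # "Flower shop financial assistant"
  # "Tally cost and income from the quantity of each flower sold, and compute the total profit"
  description="按照每种花的售出量,统计成本和收入,计算出总利润",
  model="gpt-4-1106-preview",
  tools=[{"type": "code_interpreter"}, {"type": "retrieval"}, function],
  file_ids=[file.id]
)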

Create Thread

  As mentioned above, a Thread holds the context of the conversation between the user and the Assistant, so a Thread has to be created for the user's first conversation with the Assistant. The code is very simple:

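# Create the conversation Thread
thread = client.beta.threads.create(
  messages=[
    {
      "role": "user",
      # "I sold 3 red roses, 2 tulips and 6 lilies; calculate the total cost
      # and total income, and show the calculation steps"
      "content": "我卖出去了红玫瑰3支、郁金香2支、百合6支,计算下总成本和总收入,给出具体的计算过程"
    }
  ]
)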

  Notice that the Thread is not associated with any Assistant at this point. The call does create the Thread on OpenAI's side and returns its ID (thread.id), but a Thread is independent of any particular Assistant; the two are only tied together when a Run is created.

Create Run

# Create the Run
run = client.beta.threads.runs.create(
  thread_id=thread.id,
  assistant_id=assistant.id
)

  Creating a Run is also simple: you only pass thread_id and assistant_id, both of which are strings. The assistant_id in particular can be looked up in the dashboard on the OpenAI website. As you may have guessed, this means an Assistant and a Thread do not have to be created anew each time; existing ones can be reused by ID.

# Retrieve the Run to get its latest state
run = client.beta.threads.runs.retrieve(
  thread_id=thread.id,
  run_id=run.id
)

Get run status

  After the Run is created, OpenAI executes it in the background. To see how it is going, call the retrieve method. If you print the run, you will see something like the following.

Run(id='run_A9phobcoIOG3euibElksTu8a', assistant_id='asst_hW7NrPZP8q8KvE9oiuceg5mM', cancelled_at=None, completed_at=None, created_at=1700400089, expires_at=1700400689, failed_at=None, file_ids=['file-uhMIBtm4BPXlJlY1UzGIPlGn'], instructions=None, last_error=None, metadata={}, model='gpt-4-1106-preview', object='thread.run', required_action=None, started_at=1700400089, status='in_progress', thread_id='thread_nvsTyK6DQdmKoVxOseSSKZF4', tools=[ToolAssistantToolsCode(type='code_interpreter'), ToolAssistantToolsRetrieval(type='retrieval'), ToolAssistantToolsFunction(function=FunctionDefinition(name='save_bill', parameters={'type': 'object', 'properties': {'totalCost': {'type': 'number', 'description': '总成本'}, 'totalIncome': {'type': 'number', 'description': '总收入'}}, 'required': ['totalCost', 'totalIncome']}, description='保存总成本和总的收入'), type='function')])

  What you get here is the latest status of the run. The run may not have finished yet, so you may need to keep polling until its status changes. A run can be in the following states.
The specific status and meaning are as follows:

State | Meaning
queued | When a Run is first created, or when you complete its required_action, it moves to queued and waits to run. Normally it changes to in_progress almost immediately.
in_progress | The Run is executing. At this point you can query its Run Steps to see the execution progress.
completed | The Run has finished. You can read the messages the Assistant added to the Thread and continue asking questions.
requires_action | The Assistant needs a function call. You must call the specified function with the given arguments and submit the output before the Run can continue.
expired | The function call output was not submitted before expires_at, or the Run itself took too long and ran past expires_at.
cancelling | After you call client.beta.threads.runs.cancel(run_id=run.id, thread_id=thread.id), the Run becomes cancelling. Once cancellation succeeds, it becomes cancelled.
cancelled | The Run was successfully cancelled.
failed | The Run failed. You can find the reason in the Run's last_error object.

  Pay special attention to the requires_action status. It means your code has to execute some function locally and return the result to the Assistant before the run can continue.
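
  Since the status has to be polled, a small helper keeps the waiting logic in one place. The following is only a sketch under my own assumptions: wait_for_run, STOP_STATUSES, and the interval and timeout defaults are names and values I chose, not part of the OpenAI SDK. The only SDK call it relies on is client.beta.threads.runs.retrieve, which is already used above.

```python
import time

# Statuses after which polling should stop: either the run is finished,
# or it is waiting for us to submit tool outputs.
STOP_STATUSES = {"completed", "requires_action", "expired", "cancelled", "failed"}


def wait_for_run(client, thread_id, run_id, interval=1.0, timeout=120.0):
    """Poll runs.retrieve until the run leaves queued/in_progress/cancelling."""
    deadline = time.monotonic() + timeout
    while True:
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        if run.status in STOP_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} still {run.status} after {timeout}s")
        time.sleep(interval)
```

  With the objects created earlier, it would be used as run = wait_for_run(client, thread.id, run.id), after which run.status tells you whether to read the messages or handle a function call.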

Run triggers a function call

  If run.status is requires_action, we need to call the local tool (for now, the only kind is a function call) and then return the result to the Assistant so that it can continue. The code is as follows:

import json

if run.status == 'requires_action':
    tool_outputs = []
    # Call every requested function and collect the results
    for call in run.required_action.submit_tool_outputs.tool_calls:
        if call.type != "function":
            continue
        # Look up the real Python function by name
        func = available_functions[call.function.name]
        # The arguments arrive as a JSON string, so decode them before calling
        arguments = json.loads(call.function.arguments)
        output = {
            "tool_call_id": call.id,
            "output": func(**arguments),
        }
        tool_outputs.append(output)
    # Send the function results back to the Assistant
    run = client.beta.threads.runs.submit_tool_outputs(
        thread_id=thread.id,
        run_id=run.id,
        tool_outputs=tool_outputs
    )

Get Assistant's message

  Next, we only need to poll the retrieve interface for the latest status of the run. Once the status is completed, we can read the Assistant's reply.

# Get the latest status of the run
run = client.beta.threads.runs.retrieve(
  thread_id=thread.id,
  run_id=run.id
)
if run.status == 'completed':
    messages = client.beta.threads.messages.list(
      thread_id=thread.id
    )
    print(messages)

  Note that the messages are returned newest first, so the latest message is at the top.
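
  Printing the whole list is noisy, so here is a small sketch that pulls out just the newest reply. latest_assistant_text is a hypothetical helper of mine, and it assumes the default newest-first ordering described above as well as the usual message layout, where each message has a role and a list of content parts whose text lives in text.value.

```python
def latest_assistant_text(messages):
    """Return the text of the newest assistant message, or None if there is none.

    Assumes `messages` is the object returned by client.beta.threads.messages.list
    with its default newest-first ordering.
    """
    for message in messages.data:
        if message.role != "assistant":
            continue
        # A message can hold several content parts; keep only the text ones.
        parts = [part.text.value for part in message.content if part.type == "text"]
        return "\n".join(parts)
    return None
```

  For example, print(latest_assistant_text(messages)) could replace print(messages) in the code above.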

Start new message

  The process above runs from creating the Assistant to sending the first message. To continue the conversation, we only need to add a new message to the Thread and then create and execute another Run. The code is as follows:

# Add a new message
message = client.beta.threads.messages.create(
  thread_id=thread.id,
  role="user",
  # "There are also 2 sunflowers; please add them to this bill"
  content="另外还有2支向日葵,补充下这份账单"
)
# Create the run
run = client.beta.threads.runs.create(
  thread_id=thread.id,
  assistant_id=assistant.id
)
# Get the run's latest state
run = client.beta.threads.runs.retrieve(
  thread_id=thread.id,
  run_id=run.id
)
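
  The three calls for a follow-up turn can be wrapped into one function. This is a rough sketch rather than a finished implementation: ask is a name I made up, the polling interval is arbitrary, and the function simply returns as soon as the run is no longer queued, in_progress or cancelling, so a requires_action result still has to be handled by the caller as shown earlier.

```python
import time


def ask(client, thread_id, assistant_id, content, interval=1.0):
    """Add a user message, start a run, and poll until it stops progressing."""
    client.beta.threads.messages.create(
        thread_id=thread_id, role="user", content=content
    )
    run = client.beta.threads.runs.create(
        thread_id=thread_id, assistant_id=assistant_id
    )
    # Keep retrieving the run while it is still waiting or executing.
    while run.status in ("queued", "in_progress", "cancelling"):
        time.sleep(interval)
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run.id)
    return run
```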

Conclusion

  That is the overall development process for the Assistants-API. Once you understand it, you can easily build a personal assistant similar to ChatGPT Plus. Of course, the Assistants-API is still in beta and has many rough edges: it does not support streaming responses, image generation, or plugin calls, and even the run status has to be obtained by polling. In addition, while writing the demo for this article I found that Retrieval often failed to recall the relevant text, so the bill was rarely calculated correctly (this may also be a problem with the file format I supplied). Code Interpreter also failed to run quite often. No wonder it is still a beta; we can only hope that OpenAI improves it in the future.

  I have also not covered the interfaces for viewing and managing assistants, threads, runs, and run steps; see the official documentation for details. If you want to try the Assistants-API, you can first experiment on the official site at https://platform.openai.com/assistants. Once you are happy with the result, you can translate the page configuration into code and embed it in your own application.

I have uploaded the complete code to GitHub at https://github.com/xindoo/openai-examples/blob/main/flower_assistant.ipynb, and I will upload usage examples for other OpenAI APIs to the same repository. If you are interested, feel free to follow it.


Origin blog.csdn.net/xindoo/article/details/134494270