Flask or FastAPI? A first experience building a Python server

1. Introduction

Recently, for work, I looked into building simple Python services, mainly to let people in my group use models and tools I had developed. Streamlit, the data-science-friendly front-end framework introduced earlier, is a sharp tool for this. However, with the popularity of ChatGPT, chat-style services are becoming more and more common. Streamlit has a chat derivative, streamlit-chat, but it only provides basic chat and lacks more advanced display features such as Markdown rendering and streaming output. FastChat, which is better suited as a front end for large models, may therefore be a better choice.

Having said that, the front end is just a display layer; the real service lives in the back end. Strictly speaking, a backend provides data services, that is, create, read, update, and delete operations on data. That is still essentially true here, except that it now connects to a model or model service instead of a database.

When I searched for Python back-end libraries, Flask, the long-established web framework for small and medium-sized applications, came up first. But it still felt a bit complicated, so I turned to FastAPI, which is built on top of Starlette and may be a better fit. (There are several comparative introductions between them: introduction 1, introduction 2, introduction 3, introduction 4.) Of course, some people point out that comparing Flask and FastAPI is unfair, like asking whether an apple or orange juice is sweeter: as a general web framework, Flask should be compared with Starlette. I agree with this point, but it only reinforces that FastAPI is a better choice when rapid development is required and the requirements are not heavy. (Of course, I also use Flask to develop services backed by multi-threading; each has its own advantages.)

2. Hello world

As with the first example in any language, we start with a very simple service built with FastAPI.

The first step is to prepare the code, app.py:

from fastapi import FastAPI
from pydantic import BaseModel
from gpt import GPT

app = FastAPI()
model = GPT()

class Message(BaseModel):
    new_message: str
    role: str = ""
    args: dict = {}

@app.post('/gpt')
def gpt_endpoint(message: Message):
    new_message = message.new_message
    role = message.role
    args = message.args

    response = model.call(new_message=new_message, role=role, args=args)
    return response

These fewer than 30 lines of code highlight most of FastAPI's core features: the main app object, your own resource (the model), a data class Message, and a POST endpoint (named /gpt) that performs some operations and returns a response.

GPT here is a simplified class whose implementation can live in another file.
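For completeness, here is a minimal sketch of what gpt.py could look like. This is purely a hypothetical placeholder: the real class would wrap your model, and the echo-style call below only stands in for actual inference.

```python
# gpt.py -- hypothetical stand-in for the real model wrapper
class GPT:
    def call(self, new_message, role="", args=None):
        # A real implementation would run model inference here;
        # this placeholder just echoes the input back.
        args = args or {}
        return {"role": role or "assistant", "content": f"echo: {new_message}"}
```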

The second step is to install the dependencies:

pip install fastapi
pip install uvicorn
pip install pydantic

The third step is to run the service:

uvicorn app:app --host 0.0.0.0 --port 8000

After the above three steps, a FastAPI service is up. The access address is, for example, http://localhost:8000/gpt. How to test it quickly? FastAPI ships with interactive documentation, accessible at http://localhost:8000/docs.

If the code is well commented, there is no need to write a separate manual.

But the code above only briefly hints at FastAPI's features. What if you need to build more complex services? Below, we introduce FastAPI more comprehensively through two practical examples.

3. Practical examples

3.1 A CRUD (create, read, update, delete) example with FastAPI

Let's say we're developing a simple to-do management application. We want to implement the following functionality:

  • Get a list of all todos
  • Create a new todo
  • Update the content of a specific todo
  • Mark specific todos as done
  • Delete a specific todo

We can use FastAPI to implement these features. The following sample code defines the API endpoints with different types of decorators:

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

# To-do item data model
class TodoItem(BaseModel):
    id: int
    title: str
    completed: bool = False

# In-memory to-do store (stands in for a database)
todo_items = []

# Get the list of all to-do items
@app.get("/todos")
def get_todo_list():
    return todo_items

# Create a new to-do item
@app.post("/todos")
def create_todo_item(item: TodoItem):
    todo_items.append(item)
    return item

# Update the content of a specific to-do item
@app.put("/todos/{item_id}")
def update_todo_item(item_id: int, item: TodoItem):
    for i in range(len(todo_items)):
        if todo_items[i].id == item_id:
            todo_items[i] = item
            return item
    raise HTTPException(status_code=404, detail="Item not found")

# Mark a specific to-do item as completed
@app.patch("/todos/{item_id}")
def complete_todo_item(item_id: int):
    for item in todo_items:
        if item.id == item_id:
            item.completed = True
            return item
    raise HTTPException(status_code=404, detail="Item not found")

# Delete a specific to-do item
@app.delete("/todos/{item_id}")
def delete_todo_item(item_id: int):
    for item in todo_items:
        if item.id == item_id:
            todo_items.remove(item)
            return {"message": "Item deleted"}
    raise HTTPException(status_code=404, detail="Item not found")


In the above example, we have used the following different types of decorators:

  1. @app.get(path: str): Defines the endpoint of the GET request. An endpoint to get a list of all todos is defined using the @app.get("/todos") decorator. This endpoint does not need to receive any parameters and returns the todo list directly.

  2. @app.post(path: str): Defines the endpoint of the POST request. The endpoint for creating new todos is defined using the @app.post("/todos") decorator. This endpoint receives a request body parameter item, which is an instance of the TodoItem model and contains the content of the to-do item. In the handler function, we add the new todo item to the todo_items list and return the added todo item.

  3. @app.put(path: str): Defines the endpoint of the PUT request. The endpoint for updating todos is defined using the @app.put("/todos/{item_id}") decorator. This endpoint receives a path parameter item_id to specify the ID of the to-do item, and a request body parameter item, which is an instance of the TodoItem model and contains the new content of the to-do item. In the processing function, we traverse the todo_items list to find the todo item corresponding to the ID, update it with new content, and return the updated todo item.

  4. @app.patch(path: str): Defines the endpoint of the PATCH request. Use the @app.patch("/todos/{item_id}") decorator to define an endpoint that marks a todo as done. This endpoint accepts a path parameter item_id specifying the ID of the todo item. In the handler function, we iterate through the todo_items list to find the todo item with the corresponding ID, mark it as completed (item.completed = True), and return the updated todo item.

  5. @app.delete(path: str): Defines the endpoint of the DELETE request. The endpoint for deleting todos is defined using the @app.delete("/todos/{item_id}") decorator. This endpoint accepts a path parameter item_id specifying the ID of the todo item. In the processing function, we traverse the todo_items list to find the to-do item with the corresponding ID, delete it from the list, and then return a simple message indicating that the deletion was successful.

Together, these decorators let us define endpoints for each HTTP method (GET, POST, PUT, PATCH, DELETE) and receive and process different data through path parameters and request body parameters. In this way, a single application handles all operations, and the API can be designed according to RESTful principles.

Here, we notice that the PUT and PATCH methods are very similar, both used for updating, but they have the following differences:

PUT requests are used to completely replace a resource or entity on the server:

  • The client must provide a complete representation of the resource, including all fields, even if only some of them have changed.
  • If the resource does not exist, a new resource is created; if it exists, its contents are completely replaced (overwritten).
  • PUT requests are idempotent: executing the same PUT request multiple times has no additional effect on the resource.

PATCH requests are used to make partial updates to a resource or entity on the server:

  • The client only needs to provide the fields or attributes to be updated, without sending a representation of the entire resource.
  • The server selectively updates the corresponding fields; fields not provided remain unchanged.
  • PATCH requests can be idempotent or not, depending on the specific implementation and use case.

In practical applications, you can choose to use PUT or PATCH requests according to specific business requirements and design criteria. If you want to update an entire resource or entity, you should use a PUT request. If only some fields or attributes need to be updated, a PATCH request should be used.
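One common way to implement these PATCH semantics with pydantic is a separate update model whose fields are all optional; `exclude_unset` then distinguishes "not sent" from "sent as null", so unsent fields stay unchanged. A sketch (the model and helper names are illustrative):

```python
from typing import Optional
from pydantic import BaseModel

class TodoItem(BaseModel):
    id: int
    title: str
    completed: bool = False

# For PATCH, an all-optional model lets the client send only
# the fields it wants to change.
class TodoUpdate(BaseModel):
    title: Optional[str] = None
    completed: Optional[bool] = None

def apply_patch(item: TodoItem, update: TodoUpdate) -> TodoItem:
    # exclude_unset keeps only fields the client actually sent,
    # so unsent fields remain unchanged -- the PATCH semantics above.
    changes = update.dict(exclude_unset=True)
    return item.copy(update=changes)

item = TodoItem(id=1, title="write docs")
patched = apply_patch(item, TodoUpdate(completed=True))
print(patched)
```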

3.2 Example of serving a GPU-loaded language model with FastAPI

import uvicorn
import torch
from fastapi import FastAPI
from pydantic import BaseModel

# Example GPT model class (placeholder)
class GPTModel:
    def __init__(self):
        # Your GPT model initialization code goes here
        ...

    def to(self, device):
        # Move the model weights to the given device
        ...

    def eval(self):
        # Switch the model to inference mode
        ...

    def generate_text(self, input_text):
        # Your text-generation code goes here
        ...

# Input schema for the request
class TextRequest(BaseModel):
    text: str

app = FastAPI()

# Load the model once at startup, not per request: a background task
# added inside the endpoint would race with the request that needs it.
@app.on_event("startup")
def load_model():
    model = GPTModel()
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    model.eval()

    # Store the model in the FastAPI application state
    app.state.model = model

@app.post("/generate")
def generate_text(request: TextRequest):
    # Fetch the model from the application state
    model = app.state.model

    # Run inference (on the GPU when available)
    generated_text = model.generate_text(request.text)

    # Return the generated text
    return {"generated_text": generated_text}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)

4. Summary

This article briefly introduced some simple uses of FastAPI, which are convenient for quickly building prototype services.

(As large models such as ChatGPT gradually penetrate our lives, blog posts that record technical details like this may become less and less useful, since ChatGPT can interactively walk you through anything you don't know, and there is less need to read posts like this one. Although this post also relied on ChatGPT's guidance, I still read some related blogs and tried the code given by ChatGPT myself to add some credibility.)


Origin blog.csdn.net/qq_35082030/article/details/130917423