LLM Engineering Series, Part 3: A FastAPI Service for the Ziya (姜子牙) Model

A FastAPI interface service for the Ziya large language model

The Ziya model performs reasonably well, but how do you deploy its weights as a service of your own? The tutorial code is below.

I. Environment setup

Python 3.7
transformers (latest version)

The server and client code below additionally import torch, fastapi, uvicorn, pydantic, and requests; all of them can be installed with pip.
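Before launching the service, it can help to sanity-check that the key packages import cleanly and that a GPU is visible. A minimal check script (the package list is inferred from the imports in the server code below, not stated in the original):

# Quick sanity check of the runtime environment.
import torch
import transformers
import fastapi
import uvicorn

print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
print("fastapi:", fastapi.__version__)
print("CUDA available:", torch.cuda.is_available())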

II. Ziya FastAPI service code

1. Server code

The code is as follows (example):

import uvicorn
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoTokenizer
from transformers import LlamaForCausalLM
import torch

app = FastAPI()


class Query(BaseModel):
    text: str


device = torch.device("cuda")

# Load the Ziya-LLaMA-13B weights; device_map="auto" lets accelerate
# place the layers across the available GPUs automatically.
model = LlamaForCausalLM.from_pretrained('IDEA-CCNL/Ziya-LLaMA-13B-v1', device_map="auto")
tokenizer = AutoTokenizer.from_pretrained('IDEA-CCNL/Ziya-LLaMA-13B-v1')


@app.post("/generate_travel_plan/")
async def generate_travel_plan(query: Query):
    # Ziya's conversational prompt template: a <human> turn followed by
    # an empty <bot> turn for the model to complete.
    inputs = '<human>:' + query.text.strip() + '\n<bot>:'

    input_ids = tokenizer(inputs, return_tensors="pt").input_ids.to(device)
    generate_ids = model.generate(
        input_ids,
        max_new_tokens=1024,
        do_sample=True,
        top_p=0.85,
        temperature=1.0,
        repetition_penalty=1.0,
        eos_token_id=2,
        bos_token_id=1,
        pad_token_id=0)

    # Decode the whole sequence (prompt included) back to text.
    output = tokenizer.batch_decode(generate_ids)[0]
    return {"result": output}


if __name__ == "__main__":
    # Bind to this machine's LAN address; clients must target the same host and port.
    uvicorn.run(app, host="192.168.138.218", port=7861)
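One caveat about the endpoint above: model.generate is a blocking, GPU-bound call, and because the route is declared async def, a long generation stalls the event loop and with it every concurrent request. Below is a minimal sketch of one common workaround, pushing the call onto a worker thread; it reuses app, model, tokenizer, device, and Query from the server code, while the route path /generate_travel_plan_async/ and the helper generate_sync are illustrative names of mine, not part of the original tutorial:

import asyncio
from concurrent.futures import ThreadPoolExecutor

# A single worker thread serializes access to the GPU while the event
# loop stays free to accept and queue incoming requests.
executor = ThreadPoolExecutor(max_workers=1)

def generate_sync(text: str) -> str:
    # Same prompt template and sampling parameters as the endpoint above.
    inputs = '<human>:' + text.strip() + '\n<bot>:'
    input_ids = tokenizer(inputs, return_tensors="pt").input_ids.to(device)
    generate_ids = model.generate(
        input_ids,
        max_new_tokens=1024,
        do_sample=True,
        top_p=0.85,
        temperature=1.0,
        repetition_penalty=1.0,
        eos_token_id=2,
        bos_token_id=1,
        pad_token_id=0)
    return tokenizer.batch_decode(generate_ids)[0]

@app.post("/generate_travel_plan_async/")
async def generate_travel_plan_async(query: Query):
    loop = asyncio.get_running_loop()
    output = await loop.run_in_executor(executor, generate_sync, query.text)
    return {"result": output}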

2. Client code

The code is as follows (example):

import requests

url = "http:/192.168.138.210:7861/generate_travel_plan/"
query = {
    
    "text": "帮我写一份去西安的旅游计划"}

response = requests.post(url, json=query)

if response.status_code == 200:
    result = response.json()
    print("Generated travel plan:", result["result"])
else:
    print("Error:", response.status_code, response.text)


3. Postman/curl call

curl --location 'http://192.168.138.218:7861/generate_travel_plan/' \
--header 'accept: application/json' \
--header 'Content-Type: application/json' \
--data '{"text":"帮我写一份去西安的旅游计划"}'


Summary

That wraps up today's content: this post showed how to stand up a FastAPI service for the Ziya large language model.
If you would like to join the discussion group on training vertical-domain LLMs, send me a private message.

Reposted from blog.csdn.net/weixin_43228814/article/details/131289462