Test the performance of Alibaba Tongyi Qianwen-7B-Chat

0. Background

To get a sense of how well Alibaba's Tongyi Qianwen-7B-Chat performs, this post runs a few test questions against it.

1. Test results (screenshots)

Sample code:

import os
import openai

from dotenv import load_dotenv, find_dotenv

_ = load_dotenv(find_dotenv())  # read local .env file

# Placeholder API key.
openai.api_key = 'sk-1234567890abcdefghijklmnopqrstuvwxyz1234567890DL'
# Send requests to the server listening on localhost instead of the default endpoint.
openai.api_base = 'http://localhost:8000/v1'


def get_completion(prompt, model="gpt-3.5-turbo"):
    """Send a single user message and return the text of the reply."""
    # Uses the pre-1.0 interface of the openai package (openai.ChatCompletion).
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,  # keep the output as deterministic as possible
    )
    return response.choices[0].message["content"]

Start testing:

# Prompt: "Who are you?"
get_completion("你是谁?")

The output is as follows:

[screenshot of the model's output]

Sample code:

# Prompt: "Which is the second-highest mountain in the world?"
get_completion("世界上第二高的山峰是哪座")

The output is as follows:

[screenshot of the model's output]

Sample code:

# Prompt: "What is the relationship between Lu Xun and Zhou Shuren?"
get_completion("鲁迅和周树人是什么关系?")

The output is as follows:

[screenshot of the model's output]

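Sample code:

# Prompt: "A ball and a bat cost 11 dollars in total, and the bat costs
# 10 dollars more than the ball. How much does the ball cost?"
get_completion("一个球和一个球棒的总价是11美元,球棒比球贵10美元,球的价格是多少?")

The output is as follows:

[screenshot of the model's output]

This answer is wrong; the correct answer is $0.50.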
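
The arithmetic can be checked directly. The following is a minimal sketch, with variable names chosen here for illustration, that solves the two conditions of the question (ball + bat = 11 and bat = ball + 10) using exact fractions:

```python
from fractions import Fraction

# Facts stated in the question (in US dollars).
total = Fraction(11)       # ball + bat = 11
difference = Fraction(10)  # bat - ball = 10

# Subtracting the second equation from the first gives
# 2 * ball = total - difference.
ball = (total - difference) / 2
bat = ball + difference

# Verify both conditions of the problem.
assert ball + bat == total
assert bat - ball == difference

print(float(ball), float(bat))  # 0.5 10.5
```

The tempting answer of $1 fails the first check: a $1 ball implies an $11 bat, for a total of $12.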

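Now add an instruction in front of the question, asking the model to think carefully, calculate step by step, and verify the result at the end. Sample code:

# Prompt: "Please think carefully, work through the following math problem
# step by step, and verify the result at the end." followed by the same question.
get_completion("请仔细思考,一步一步计算下面的数学题,最后在做验证。一个球和一个球棒的总价是11美元,球棒比球贵10美元,球的价格是多少?")

The output is as follows:

[screenshot of the model's output]

This time the answer is correct.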
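
The only difference between the last two runs is the instruction placed in front of the question. Below is a minimal sketch of that prompt construction. The helper `with_step_by_step` and the constant names are hypothetical and not part of the original code. The resulting string would then be passed to `get_completion`:

```python
# Instruction used in the test above. In English: "Please think carefully,
# work through the following math problem step by step, and verify the
# result at the end."
STEP_BY_STEP_PREFIX = "请仔细思考,一步一步计算下面的数学题,最后在做验证。"

# The ball-and-bat question from the previous test.
QUESTION = "一个球和一个球棒的总价是11美元,球棒比球贵10美元,球的价格是多少?"


def with_step_by_step(question: str, prefix: str = STEP_BY_STEP_PREFIX) -> str:
    """Return the question with the step-by-step instruction prepended."""
    return prefix + question


prompt = with_step_by_step(QUESTION)
print(prompt)
```

Keeping the instruction in one constant makes it easy to rerun the other questions with and without it and compare the answers.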

End!

Origin: blog.csdn.net/engchina/article/details/132504252