ChatGLM3: Introduction and Usage

Introduction

ChatGLM3 is a new generation of conversational pre-trained models jointly released by Zhipu AI and the KEG Lab of Tsinghua University. ChatGLM3-6B is the open-source model in the ChatGLM3 series. It retains the many strengths of the first two generations, such as fluent dialogue and a low deployment threshold, and introduces the following features:

  1. **A more powerful base model:** ChatGLM3-6B-Base delivers the strongest performance among base models under 10B parameters, as the benchmark tables below show.

    Base-model benchmarks (higher is better):

    | Model | GSM8K | MATH | BBH | MMLU | C-Eval | CMMLU | MBPP | AGIEval |
    |---|---|---|---|---|---|---|---|---|
    | ChatGLM2-6B-Base | 32.4 | 6.5 | 33.7 | 47.9 | 51.7 | 50.0 | - | - |
    | Best Baseline (under 10B) | 52.1 | 13.1 | 45.0 | 60.1 | 63.5 | 62.2 | 47.5 | 45.8 |
    | ChatGLM3-6B-Base | 72.3 | 25.7 | 66.1 | 61.4 | 69.0 | 67.5 | 52.4 | 53.7 |

    Long-context (32K) benchmarks:

    | Model | Average | Summary | Single-Doc QA | Multi-Doc QA | Code | Few-shot | Synthetic |
    |---|---|---|---|---|---|---|---|
    | ChatGLM2-6B-32K | 41.5 | 24.8 | 37.6 | 34.7 | 52.8 | 51.3 | 47.7 |
    | ChatGLM3-6B-32K | 50.2 | 26.6 | 45.8 | 46.1 | 56.2 | 61.2 | 65 |
  2. **More complete feature support:** ChatGLM3-6B adopts a newly designed prompt format that natively supports:

    • multi-turn dialogue
    • tool invocation (Function Call)
    • code execution (Code Interpreter)
    • Agent tasks

    A minimal tool-calling sketch follows this list.
  3. **A more comprehensive open-source lineup:**

    | Model | Seq Length | Download |
    |---|---|---|
    | ChatGLM3-6B | 8k | HuggingFace / ModelScope |
    | ChatGLM3-6B-Base | 8k | HuggingFace / ModelScope |
    | ChatGLM3-6B-32K | 32k | HuggingFace / ModelScope |
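The newly designed prompt format is what enables native tool calling: a system message carrying a tools list is placed at the head of the history, and the model replies with a structured call instead of plain text. The sketch below follows the conventions of the official ChatGLM3 repository's tool demo; the get_weather tool is a hypothetical example, and model/tokenizer are assumed to be loaded as in the inference code below.

# Hypothetical tool definition, for illustration only.
tools = [{
    "name": "get_weather",
    "description": "Query the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "city name"}},
        "required": ["city"],
    },
}]

# The tool list rides along in a system message at the head of the history.
system_info = {
    "role": "system",
    "content": "Answer the following questions as best as you can. "
               "You have access to the following tools:",
    "tools": tools,
}
history = [system_info]

# The model is expected to reply with a structured tool call, e.g.
# {"name": "get_weather", "parameters": {"city": "Beijing"}}.
response, history = model.chat(tokenizer, "What's the weather in Beijing?", history=history)
print(response)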

Inference Code

from modelscope import AutoTokenizer, AutoModel, snapshot_download

# Download the weights from ModelScope (cached locally after the first run)
model_dir = snapshot_download("ZhipuAI/chatglm3-6b", revision="master")
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
# Load in half precision and move the model to the GPU
model = AutoModel.from_pretrained(model_dir, trust_remote_code=True).half().cuda()
model = model.eval()
response, history = model.chat(tokenizer, "Hello", history=[])
print(response)
# Pass the returned history back in to continue the multi-turn conversation
response, history = model.chat(tokenizer, "What should I do if I can't sleep at night?", history=history)
print(response)
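FP16 inference as above needs roughly 13 GB of GPU memory. Two lighter-weight alternatives, sketched under the assumption that the quantize() helper shipped in the ChatGLM model code of earlier releases is still present:

# 4-bit quantized loading (assumes the model code's built-in quantize() helper);
# cuts GPU memory use substantially at some cost in output quality.
model = AutoModel.from_pretrained(model_dir, trust_remote_code=True).quantize(4).cuda()

# CPU-only inference: keep the weights in float32 (slow, needs ample RAM).
model = AutoModel.from_pretrained(model_dir, trust_remote_code=True).float()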

Command-Line Chat Demo

The script below wraps the model's stream_chat interface in a simple interactive loop: type to chat, enter "clear" to reset the conversation history, and "stop" to exit.

import os
import platform
from transformers import AutoTokenizer, AutoModel

model_path = "model/chatglm3_32k/"

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True).cuda()
# Multi-GPU support: use the two lines below instead of the line above,
# setting num_gpus to the number of GPUs you actually have.
# from utils import load_model_on_gpus
# model = load_model_on_gpus(model_path, num_gpus=2)
model = model.eval()

os_name = platform.system()
clear_command = 'cls' if os_name == 'Windows' else 'clear'
stop_stream = False

welcome_prompt = "Welcome to ChatGLM3-6B. Type to chat, 'clear' to clear the conversation history, 'stop' to exit."

def build_prompt(history):
    """Render the full conversation as a plain-text transcript."""
    prompt = welcome_prompt
    for query, response in history:
        prompt += f"\n\nUser: {query}"
        prompt += f"\n\nChatGLM3-6B: {response}"
    return prompt

def main():
    past_key_values, history = None, []
    global stop_stream
    print(welcome_prompt)
    while True:
        query = input("\nUser: ")
        if query.strip() == "stop":
            break
        if query.strip() == "clear":
            past_key_values, history = None, []
            os.system(clear_command)
            print(welcome_prompt)
            continue
        print("\nChatGLM:", end="")
        current_length = 0
        for response, history, past_key_values in model.stream_chat(tokenizer, query, history=history, temperature=1,
                                                                    past_key_values=past_key_values,
                                                                    return_past_key_values=True):
            if stop_stream:
                stop_stream = False
                break
            else:
                print(response[current_length:], end="", flush=True)
                current_length = len(response)
        # print(build_prompt(history))  # debug: dump the full transcript
        print("")
        # print(past_key_values)        # debug: inspect the KV cache


if __name__ == "__main__":
    main()
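If the repo's utils.py (with load_model_on_gpus) is not at hand, a hedged alternative is to let Hugging Face Accelerate shard the checkpoint across the visible devices via device_map="auto"; this sketch assumes the accelerate package is installed:

from transformers import AutoModel, AutoTokenizer

model_path = "model/chatglm3_32k/"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
# device_map="auto" spreads the layers over all visible GPUs (and CPU if
# needed), replacing the manual load_model_on_gpus helper.
model = AutoModel.from_pretrained(model_path, trust_remote_code=True,
                                  device_map="auto").eval()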


Reprinted from blog.csdn.net/qq128252/article/details/134673506