Install
Environment: Windows 11
Graphics card: NVIDIA GeForce RTX 3090 (driver version 526.98), 24 GB VRAM
Text generation web UI
https://github.com/oobabooga/text-generation-webui
Download oobabooga-windows.zip, unzip it to G:\llm\oobabooga_windows\, and run start_windows.bat directly.
Partway through I hit an error and, following the prompt, installed Visual C++ 14.0. The download address is given in the error message:
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
Download and run the Build Tools installer as shown in the figure below, making sure to select the C++ CMake tools component.
Alternative: instead of the method above, installing the following dependency packages may also solve the problem (this has not been verified):
conda install libpython m2w64-toolchain -c msys2
After fixing the problem, delete the text-generation-webui folder and run start_windows.bat again. At the end of this step you are prompted to choose a model to download; close the window here and download the model manually instead.
Load a different model
Hugging Face model search page: https://huggingface.co/models?sort=downloads
You can switch between models in the running web UI; the command-line output is shown in the figure below.
vicuna 7b
Vicuna's code is based on Stanford Alpaca, with added support for multi-turn dialogue, and it uses hyperparameters similar to Stanford Alpaca's.
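For reference, FastChat builds a multi-turn Vicuna v1.1 prompt by concatenating turns in a USER/ASSISTANT format roughly like the following. The exact wording comes from FastChat's conversation template and may differ between versions, so treat it as an illustration rather than the definitive format:

```text
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hello! ASSISTANT: Hi, how can I help you?</s>USER: Tell me a joke. ASSISTANT:
```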
LLaMA-7b
The LLaMA-7b model cannot be used directly; you need to go through the following steps:
1. Convert the LLaMA-7b model weights to the Hugging Face format
2. Merge vicuna-7b-delta-v1.1 with the converted model
Download the models
# Download these two models to any directory
# The already-converted llama weights
https://huggingface.co/decapoda-research/llama-7b-hf
# vicuna-7b delta
https://huggingface.co/lmsys/vicuna-7b-delta-v1.1
Merge the weights to generate the vicuna-7b-all-v1.1 folder, then copy it into the model directory text-generation-webui/models.
# Install fastchat (the PyPI package is named fschat)
pip install fschat
# Merge the vicuna weights
python -m fastchat.model.apply_delta \
--base llama-7b-hf \
--delta vicuna-7b-delta-v1.1 \
--target vicuna-7b-all-v1.1
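Conceptually, applying the delta just adds the delta weights to the base weights key by key (the Vicuna v1.1 deltas are additive). A minimal pure-Python sketch with dummy data; the real script operates on PyTorch state dicts, and the names below are only illustrative:

```python
# Sketch of additive delta merging. Plain lists stand in for the real
# PyTorch tensors found in the models' state dicts.
def apply_delta(base_weights, delta_weights):
    """Return merged weights: target[k] = base[k] + delta[k] for every key."""
    assert base_weights.keys() == delta_weights.keys()
    return {
        name: [b + d for b, d in zip(base_weights[name], delta_weights[name])]
        for name in base_weights
    }

# Illustrative dummy "state dicts"
base = {"layer0.weight": [0.1, -0.2, 0.3]}
delta = {"layer0.weight": [0.05, 0.1, -0.1]}
merged = apply_delta(base, delta)
print(merged)
```

This is why both the base llama-7b-hf weights and the delta must be downloaded: the delta alone is not a usable model.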
Run start_windows.bat in the oobabooga directory.
vicuna-13b-GPTQ-4bit-128g (with enough resources, you can use vicuna-13b directly)
The GPTQ version can be loaded and used directly, without the hassle of the steps above. Output quality with this quantized version is somewhat worse; weigh the trade-off for yourself.
Model address: https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-128g
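The model name encodes its quantization settings: 4-bit weights with a group size of 128, i.e. every 128 weights share one scale factor. GPTQ itself chooses the quantized values with an error-compensating algorithm, but the "4bit-128g" storage layout can be sketched with simple round-to-nearest per-group quantization (all names below are illustrative, not GPTQ's actual API):

```python
import random

# Illustrative round-to-nearest 4-bit quantization with per-group scales.
# GPTQ's real algorithm is smarter about which values to pick, but the
# "one scale per group of 128 weights" layout is the same idea.
def quantize_4bit(weights, group_size=128):
    """Quantize floats to signed 4-bit ints (-8..7), one scale per group."""
    qweights, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        scale = max(abs(w) for w in group) / 7 or 1.0  # avoid zero scale
        scales.append(scale)
        qweights.append([max(-8, min(7, round(w / scale))) for w in group])
    return qweights, scales

def dequantize_4bit(qweights, scales):
    # Rebuild approximate floats: each group's ints times its scale.
    return [q * s for qs, s in zip(qweights, scales) for q in qs]

random.seed(0)
w = [random.uniform(-1, 1) for _ in range(256)]
qw, sc = quantize_4bit(w)
w_hat = dequantize_4bit(qw, sc)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
print(f"max reconstruction error: {max_err:.3f}")
```

Storing 4-bit codes plus one scale per 128 weights is what shrinks a 13B model enough to fit on a single 24 GB card.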
Double-click to run the file cmd_windows.bat.
# Enter the text-generation-webui directory
cd text-generation-webui
# Download the model
python download-model.py
When prompted, enter anon8231489123/vicuna-13b-GPTQ-4bit-128g (no domain name needed); the model is downloaded automatically and saved into the models folder.
Wait patiently for the download to finish, close the window, then run start_windows.bat in the oobabooga directory. That completes this part.
chatglm-6b
The method is the same as above: after downloading the model, run start_windows.bat in the oobabooga directory. The steps are not repeated here.
https://huggingface.co/THUDM/chatglm-6b