ChatGLM-6B (alpha internal test version: QAGLM) is an open-source Chinese-English bilingual model from Tsinghua University with Q&A and dialogue capabilities. It is based on the General Language Model (GLM) architecture and has 6.2 billion parameters. Combined with model quantization, it can be deployed locally on consumer-grade graphics cards (only 6 GB of VRAM is required at the INT4 quantization level). ChatGLM-6B uses techniques similar to ChatGPT, optimized for Chinese Q&A and dialogue. After training on roughly 1T tokens of Chinese and English text, supplemented by supervised fine-tuning, feedback bootstrapping, reinforcement learning from human feedback, and other techniques, the 6.2-billion-parameter ChatGLM-6B can already generate answers that align well with human preferences.
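To see why INT4 quantization makes a 6 GB card sufficient, here is a back-of-envelope estimate of the memory needed just to store the weights at different precisions. This is illustrative arithmetic only; real usage also needs room for activations, the KV cache, and framework overhead, which is why the stated requirement is 6 GB rather than ~3 GB.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Gigabytes needed to store the model weights alone."""
    return n_params * bits_per_param / 8 / 1024**3

N = 6.2e9  # ChatGLM-6B parameter count

for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: ~{weight_memory_gb(N, bits):.1f} GB for weights")
```

At FP16 the weights alone are about 11.5 GB, so full precision does not fit on a 6 GB card; at INT4 they shrink to roughly 2.9 GB, leaving headroom for inference overhead.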
In other words, as long as you have a computer with a 6 GB GPU, you can run your own chatbot! The project also provides a Gradio-based web demo and a command-line demo.
Here is how to use it. First, clone the repository:
git clone https://github.com/THUDM/ChatGLM-6B
cd ChatGLM-6B
Next install Gradio with pip install gradio, then run web_demo.py from the repository:
python web_demo.py
The program starts a web server and prints its address; open that address in a browser to use the demo. The latest version of the demo adds a typewriter-style streaming effect, which greatly improves the perceived response speed.
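Besides the web demo, the model can also be called programmatically through the Hugging Face transformers API, following the usage shown in the repository's README. Note this sketch downloads several gigabytes of weights on first run and assumes a CUDA-capable GPU:

```python
from transformers import AutoTokenizer, AutoModel

# trust_remote_code is required because the model ships its own modeling code.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()

# chat() returns the reply plus the updated conversation history,
# which you pass back in to continue a multi-turn dialogue.
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
```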
If you don't have a GPU, you can also run inference on the CPU, though it will be slower. This requires about 32 GB of RAM.
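For CPU inference, the README's approach is to load the model in full precision with .float() instead of moving it to the GPU; a sketch of both loading variants (assuming the weights download succeeds) looks like this:

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)

# CPU inference: full-precision weights in RAM (about 32 GB needed, slower).
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).float()

# Alternatively, on a ~6 GB GPU, load with INT4 quantization instead:
# model = AutoModel.from_pretrained(
#     "THUDM/chatglm-6b", trust_remote_code=True).half().quantize(4).cuda()

model = model.eval()
```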
Next, let's look at some official examples:
Self-awareness
Outline writing
Copywriting
Travel guide