[ChatGLM] Notes on deploying the ChatGLM-6B model locally on Windows with a 4 GB graphics card

ChatGLM-6B is an open-source conversational model similar to ChatGPT, released by the Knowledge Engineering and Data Mining group at Tsinghua University (THUDM). The model was trained on about 1T tokens of Chinese and English text, most of it Chinese, so it is well suited to Chinese-language use.

Expected environment

Local computer: Windows 10 Professional, 32 GB of memory, 256 GB solid-state system disk, 1 TB mechanical hard drive, NVIDIA graphics card with 4 GB of video memory
Python version: 3.10.7


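Download the ChatGLM-6B code

The code is hosted on GitHub in the THUDM/ChatGLM-6B repository. Clone it or download it as an archive; I put it in E:\ikbp\ChatGLM-6B.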

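Python library dependencies and versions

The requirements.txt file in the repository records the Python libraries and versions that ChatGLM-6B depends on. Judging by the install log below, the top-level entries are protobuf, transformers==4.27.1, cpm_kernels, torch>=1.10, gradio, mdtex2html, sentencepiece, and accelerate.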

Run the install command

pip install -r requirements.txt


Lucky@Lucky MINGW64 /e/ikbp/ChatGLM-6B (main)
$ pip install -r requirements.txt
Collecting protobuf
  Downloading protobuf-4.23.4-cp310-abi3-win_amd64.whl (422 kB)
     -------------------------------------- 422.5/422.5 kB 2.6 MB/s eta 0:00:00
Collecting transformers==4.27.1
  Downloading transformers-4.27.1-py3-none-any.whl (6.7 MB)
     ---------------------------------------- 6.7/6.7 MB 10.0 MB/s eta 0:00:00
Collecting cpm_kernels
  Downloading cpm_kernels-1.0.11-py3-none-any.whl (416 kB)
     ------------------------------------- 416.6/416.6 kB 12.7 MB/s eta 0:00:00
Collecting torch>=1.10
  Downloading torch-2.0.1-cp310-cp310-win_amd64.whl (172.3 MB)
     -------------------------------------- 172.3/172.3 MB 3.3 MB/s eta 0:00:00
Collecting gradio
  Downloading gradio-3.36.1-py3-none-any.whl (19.8 MB)
     ---------------------------------------- 19.8/19.8 MB 4.0 MB/s eta 0:00:00
Collecting mdtex2html
  Downloading mdtex2html-1.2.0-py3-none-any.whl (13 kB)
Collecting sentencepiece
  Downloading sentencepiece-0.1.99-cp310-cp310-win_amd64.whl (977 kB)
     -------------------------------------- 977.5/977.5 kB 8.9 MB/s eta 0:00:00
Collecting accelerate
  Downloading accelerate-0.20.3-py3-none-any.whl (227 kB)
     ------------------------------------- 227.6/227.6 kB 14.5 MB/s eta 0:00:00
Collecting huggingface-hub<1.0,>=0.11.0
  Downloading huggingface_hub-0.16.4-py3-none-any.whl (268 kB)
     -------------------------------------- 268.8/268.8 kB 8.1 MB/s eta 0:00:00
Collecting tqdm>=4.27
  Downloading tqdm-4.65.0-py3-none-any.whl (77 kB)
     ---------------------------------------- 77.1/77.1 kB 4.2 MB/s eta 0:00:00
Collecting filelock
  Downloading filelock-3.12.2-py3-none-any.whl (10 kB)
Collecting numpy>=1.17
  Downloading numpy-1.25.0-cp310-cp310-win_amd64.whl (15.0 MB)
     ---------------------------------------- 15.0/15.0 MB 4.5 MB/s eta 0:00:00
Collecting requests
  Downloading requests-2.31.0-py3-none-any.whl (62 kB)
     ---------------------------------------- 62.6/62.6 kB ? eta 0:00:00
Collecting pyyaml>=5.1
  Downloading PyYAML-6.0-cp310-cp310-win_amd64.whl (151 kB)
     -------------------------------------- 151.7/151.7 kB 8.8 MB/s eta 0:00:00
Collecting packaging>=20.0
  Downloading packaging-23.1-py3-none-any.whl (48 kB)
     ---------------------------------------- 48.9/48.9 kB ? eta 0:00:00
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.3-cp310-cp310-win_amd64.whl (3.5 MB)
     ---------------------------------------- 3.5/3.5 MB 11.1 MB/s eta 0:00:00
Collecting regex!=2019.12.17
  Downloading regex-2023.6.3-cp310-cp310-win_amd64.whl (268 kB)
     -------------------------------------- 268.0/268.0 kB 8.3 MB/s eta 0:00:00
Collecting typing-extensions
  Downloading typing_extensions-4.7.1-py3-none-any.whl (33 kB)
Collecting sympy
  Downloading sympy-1.12-py3-none-any.whl (5.7 MB)
     ---------------------------------------- 5.7/5.7 MB 7.6 MB/s eta 0:00:00
Collecting networkx
  Downloading networkx-3.1-py3-none-any.whl (2.1 MB)
     ---------------------------------------- 2.1/2.1 MB 11.0 MB/s eta 0:00:00
Collecting jinja2
  Downloading Jinja2-3.1.2-py3-none-any.whl (133 kB)
     -------------------------------------- 133.1/133.1 kB 8.2 MB/s eta 0:00:00
Collecting altair>=4.2.0
  Downloading altair-5.0.1-py3-none-any.whl (471 kB)
     ------------------------------------- 471.5/471.5 kB 14.4 MB/s eta 0:00:00
Collecting aiohttp
  Downloading aiohttp-3.8.4-cp310-cp310-win_amd64.whl (319 kB)
     -------------------------------------- 319.8/319.8 kB 9.7 MB/s eta 0:00:00
Collecting fastapi
  Downloading fastapi-0.100.0-py3-none-any.whl (65 kB)
     ---------------------------------------- 65.7/65.7 kB ? eta 0:00:00
Collecting gradio-client>=0.2.7
  Downloading gradio_client-0.2.7-py3-none-any.whl (288 kB)
     -------------------------------------- 288.4/288.4 kB 8.7 MB/s eta 0:00:00
Collecting mdit-py-plugins<=0.3.3
  Downloading mdit_py_plugins-0.3.3-py3-none-any.whl (50 kB)
     ---------------------------------------- 50.5/50.5 kB ? eta 0:00:00
Collecting orjson
  Downloading orjson-3.9.2-cp310-none-win_amd64.whl (195 kB)
     ------------------------------------- 195.7/195.7 kB 11.6 MB/s eta 0:00:00
Collecting httpx
  Downloading httpx-0.24.1-py3-none-any.whl (75 kB)
     ---------------------------------------- 75.4/75.4 kB ? eta 0:00:00
Collecting markdown-it-py[linkify]>=2.0.0
  Downloading markdown_it_py-3.0.0-py3-none-any.whl (87 kB)
     ---------------------------------------- 87.5/87.5 kB ? eta 0:00:00
Collecting markupsafe
  Downloading MarkupSafe-2.1.3-cp310-cp310-win_amd64.whl (17 kB)
Collecting aiofiles
  Downloading aiofiles-23.1.0-py3-none-any.whl (14 kB)
Collecting semantic-version
  Downloading semantic_version-2.10.0-py2.py3-none-any.whl (15 kB)
Collecting uvicorn>=0.14.0
  Downloading uvicorn-0.22.0-py3-none-any.whl (58 kB)
     ---------------------------------------- 58.3/58.3 kB 3.2 MB/s eta 0:00:00
Collecting matplotlib
  Downloading matplotlib-3.7.2-cp310-cp310-win_amd64.whl (7.5 MB)
     ---------------------------------------- 7.5/7.5 MB 8.3 MB/s eta 0:00:00
Collecting pydub
  Downloading pydub-0.25.1-py2.py3-none-any.whl (32 kB)
Collecting pillow
  Downloading Pillow-10.0.0-cp310-cp310-win_amd64.whl (2.5 MB)
     ---------------------------------------- 2.5/2.5 MB 11.4 MB/s eta 0:00:00
Collecting ffmpy
  Downloading ffmpy-0.3.0.tar.gz (4.8 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Collecting pandas
  Downloading pandas-2.0.3-cp310-cp310-win_amd64.whl (10.7 MB)
     ---------------------------------------- 10.7/10.7 MB 7.2 MB/s eta 0:00:00
Collecting pydantic
  Downloading pydantic-2.0.2-py3-none-any.whl (359 kB)
     ------------------------------------- 359.1/359.1 kB 11.3 MB/s eta 0:00:00
Collecting websockets>=10.0
  Downloading websockets-11.0.3-cp310-cp310-win_amd64.whl (124 kB)
     ---------------------------------------- 124.7/124.7 kB ? eta 0:00:00
Collecting pygments>=2.12.0
  Downloading Pygments-2.15.1-py3-none-any.whl (1.1 MB)
     ---------------------------------------- 1.1/1.1 MB 10.3 MB/s eta 0:00:00
Collecting python-multipart
  Downloading python_multipart-0.0.6-py3-none-any.whl (45 kB)
     ---------------------------------------- 45.7/45.7 kB 2.4 MB/s eta 0:00:00
Collecting markdown
  Downloading Markdown-3.4.3-py3-none-any.whl (93 kB)
     ---------------------------------------- 93.9/93.9 kB ? eta 0:00:00
Collecting latex2mathml
  Downloading latex2mathml-3.76.0-py3-none-any.whl (73 kB)
     ---------------------------------------- 73.4/73.4 kB 3.9 MB/s eta 0:00:00
Collecting psutil
  Downloading psutil-5.9.5-cp36-abi3-win_amd64.whl (255 kB)
     ------------------------------------- 255.1/255.1 kB 16.3 MB/s eta 0:00:00
Collecting jsonschema>=3.0
  Downloading jsonschema-4.18.0-py3-none-any.whl (81 kB)
     ---------------------------------------- 81.5/81.5 kB ? eta 0:00:00
Collecting toolz
  Downloading toolz-0.12.0-py3-none-any.whl (55 kB)
     ---------------------------------------- 55.8/55.8 kB 2.8 MB/s eta 0:00:00
Collecting fsspec
  Downloading fsspec-2023.6.0-py3-none-any.whl (163 kB)
     -------------------------------------- 163.8/163.8 kB 9.6 MB/s eta 0:00:00
Collecting mdurl~=0.1
  Downloading mdurl-0.1.2-py3-none-any.whl (10.0 kB)
Collecting linkify-it-py<3,>=1
  Downloading linkify_it_py-2.0.2-py3-none-any.whl (19 kB)
Collecting mdit-py-plugins<=0.3.3
  Downloading mdit_py_plugins-0.3.2-py3-none-any.whl (50 kB)
     ---------------------------------------- 50.4/50.4 kB ? eta 0:00:00
  Downloading mdit_py_plugins-0.3.1-py3-none-any.whl (46 kB)
     ---------------------------------------- 46.5/46.5 kB 2.3 MB/s eta 0:00:00
  Downloading mdit_py_plugins-0.3.0-py3-none-any.whl (43 kB)
     ---------------------------------------- 43.7/43.7 kB ? eta 0:00:00
  Downloading mdit_py_plugins-0.2.8-py3-none-any.whl (41 kB)
     ---------------------------------------- 41.0/41.0 kB ? eta 0:00:00
  Downloading mdit_py_plugins-0.2.7-py3-none-any.whl (41 kB)
     ---------------------------------------- 41.0/41.0 kB ? eta 0:00:00
  Downloading mdit_py_plugins-0.2.6-py3-none-any.whl (39 kB)
  Downloading mdit_py_plugins-0.2.5-py3-none-any.whl (39 kB)
  Downloading mdit_py_plugins-0.2.4-py3-none-any.whl (39 kB)
  Downloading mdit_py_plugins-0.2.3-py3-none-any.whl (39 kB)
  Downloading mdit_py_plugins-0.2.2-py3-none-any.whl (39 kB)
  Downloading mdit_py_plugins-0.2.1-py3-none-any.whl (38 kB)
  Downloading mdit_py_plugins-0.2.0-py3-none-any.whl (38 kB)
  Downloading mdit_py_plugins-0.1.0-py3-none-any.whl (37 kB)
INFO: pip is looking at multiple versions of markdown-it-py[linkify] to determine which version is compatible with other requirements. This could take a while.
Collecting markdown-it-py[linkify]>=2.0.0
  Downloading markdown_it_py-2.2.0-py3-none-any.whl (84 kB)
     ---------------------------------------- 84.5/84.5 kB ? eta 0:00:00
Collecting pytz>=2020.1
  Downloading pytz-2023.3-py2.py3-none-any.whl (502 kB)
     ------------------------------------- 502.3/502.3 kB 10.5 MB/s eta 0:00:00
Collecting python-dateutil>=2.8.2
  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
     -------------------------------------- 247.7/247.7 kB 7.4 MB/s eta 0:00:00
Collecting tzdata>=2022.1
  Downloading tzdata-2023.3-py2.py3-none-any.whl (341 kB)
     ------------------------------------- 341.8/341.8 kB 10.7 MB/s eta 0:00:00
Collecting colorama
  Downloading colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Collecting click>=7.0
  Downloading click-8.1.4-py3-none-any.whl (98 kB)
     ---------------------------------------- 98.2/98.2 kB 5.5 MB/s eta 0:00:00
Collecting h11>=0.8
  Downloading h11-0.14.0-py3-none-any.whl (58 kB)
     ---------------------------------------- 58.3/58.3 kB 3.0 MB/s eta 0:00:00
Collecting multidict<7.0,>=4.5
  Downloading multidict-6.0.4-cp310-cp310-win_amd64.whl (28 kB)
Collecting charset-normalizer<4.0,>=2.0
  Downloading charset_normalizer-3.2.0-cp310-cp310-win_amd64.whl (96 kB)
     ---------------------------------------- 96.9/96.9 kB 5.4 MB/s eta 0:00:00
Collecting aiosignal>=1.1.2
  Downloading aiosignal-1.3.1-py3-none-any.whl (7.6 kB)
Collecting yarl<2.0,>=1.0
  Downloading yarl-1.9.2-cp310-cp310-win_amd64.whl (61 kB)
     ---------------------------------------- 61.0/61.0 kB 3.2 MB/s eta 0:00:00
Collecting async-timeout<5.0,>=4.0.0a3
  Downloading async_timeout-4.0.2-py3-none-any.whl (5.8 kB)
Collecting attrs>=17.3.0
  Downloading attrs-23.1.0-py3-none-any.whl (61 kB)
     ---------------------------------------- 61.2/61.2 kB ? eta 0:00:00
Collecting frozenlist>=1.1.1
  Downloading frozenlist-1.3.3-cp310-cp310-win_amd64.whl (33 kB)
Collecting starlette<0.28.0,>=0.27.0
  Downloading starlette-0.27.0-py3-none-any.whl (66 kB)
     ---------------------------------------- 67.0/67.0 kB ? eta 0:00:00
Collecting pydantic-core==2.1.2
  Downloading pydantic_core-2.1.2-cp310-none-win_amd64.whl (1.5 MB)
     ---------------------------------------- 1.5/1.5 MB 9.5 MB/s eta 0:00:00
Collecting annotated-types>=0.4.0
  Downloading annotated_types-0.5.0-py3-none-any.whl (11 kB)
Collecting certifi
  Downloading certifi-2023.5.7-py3-none-any.whl (156 kB)
     -------------------------------------- 157.0/157.0 kB 9.2 MB/s eta 0:00:00
Collecting httpcore<0.18.0,>=0.15.0
  Downloading httpcore-0.17.3-py3-none-any.whl (74 kB)
     ---------------------------------------- 74.5/74.5 kB 4.0 MB/s eta 0:00:00
Collecting idna
  Downloading idna-3.4-py3-none-any.whl (61 kB)
     ---------------------------------------- 61.5/61.5 kB ? eta 0:00:00
Collecting sniffio
  Downloading sniffio-1.3.0-py3-none-any.whl (10 kB)
Collecting fonttools>=4.22.0
  Downloading fonttools-4.40.0-cp310-cp310-win_amd64.whl (1.9 MB)
     ---------------------------------------- 1.9/1.9 MB 10.4 MB/s eta 0:00:00
Collecting pyparsing<3.1,>=2.3.1
  Downloading pyparsing-3.0.9-py3-none-any.whl (98 kB)
     ---------------------------------------- 98.3/98.3 kB 5.5 MB/s eta 0:00:00
Collecting cycler>=0.10
  Downloading cycler-0.11.0-py3-none-any.whl (6.4 kB)
Collecting kiwisolver>=1.0.1
  Downloading kiwisolver-1.4.4-cp310-cp310-win_amd64.whl (55 kB)
     ---------------------------------------- 55.3/55.3 kB 3.0 MB/s eta 0:00:00
Collecting contourpy>=1.0.1
  Downloading contourpy-1.1.0-cp310-cp310-win_amd64.whl (470 kB)
     -------------------------------------- 470.4/470.4 kB 9.8 MB/s eta 0:00:00
Collecting urllib3<3,>=1.21.1
  Downloading urllib3-2.0.3-py3-none-any.whl (123 kB)
     -------------------------------------- 123.6/123.6 kB 7.1 MB/s eta 0:00:00
Collecting mpmath>=0.19
  Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)
     ------------------------------------- 536.2/536.2 kB 11.2 MB/s eta 0:00:00
Collecting anyio<5.0,>=3.0
  Downloading anyio-3.7.1-py3-none-any.whl (80 kB)
     ---------------------------------------- 80.9/80.9 kB ? eta 0:00:00
Collecting jsonschema-specifications>=2023.03.6
  Downloading jsonschema_specifications-2023.6.1-py3-none-any.whl (17 kB)
Collecting referencing>=0.28.4
  Downloading referencing-0.29.1-py3-none-any.whl (25 kB)
Collecting rpds-py>=0.7.1
  Downloading rpds_py-0.8.8-cp310-none-win_amd64.whl (180 kB)
     ------------------------------------- 180.3/180.3 kB 11.3 MB/s eta 0:00:00
Collecting uc-micro-py
  Downloading uc_micro_py-1.0.2-py3-none-any.whl (6.2 kB)
Collecting six>=1.5
  Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting exceptiongroup
  Downloading exceptiongroup-1.1.2-py3-none-any.whl (14 kB)
Using legacy 'setup.py install' for ffmpy, since package 'wheel' is not installed.
Installing collected packages: tokenizers, sentencepiece, pytz, pydub, mpmath, ffmpy, cpm_kernels, websockets, urllib3, uc-micro-py, tzdata, typing-extensions, toolz, sympy, sniffio, six, semantic-version, rpds-py, regex, pyyaml, python-multipart, pyparsing, pygments, psutil, protobuf, pillow, packaging, orjson, numpy, networkx, multidict, mdurl, markupsafe, markdown, latex2mathml, kiwisolver, idna, h11, fsspec, frozenlist, fonttools, filelock, exceptiongroup, cycler, colorama, charset-normalizer, certifi, attrs, async-timeout, annotated-types, aiofiles, yarl, tqdm, requests, referencing, python-dateutil, pydantic-core, mdtex2html, markdown-it-py, linkify-it-py, jinja2, contourpy, click, anyio, aiosignal, uvicorn, torch, starlette, pydantic, pandas, mdit-py-plugins, matplotlib, jsonschema-specifications, huggingface-hub, httpcore, aiohttp, transformers, jsonschema, httpx, fastapi, accelerate, gradio-client, altair, gradio
  Running setup.py install for ffmpy: started
  Running setup.py install for ffmpy: finished with status 'done'
Successfully installed accelerate-0.20.3 aiofiles-23.1.0 aiohttp-3.8.4 aiosignal-1.3.1 altair-5.0.1 annotated-types-0.5.0 anyio-3.7.1 async-timeout-4.0.2 attrs-23.1.0 certifi-2023.5.7 charset-normalizer-3.2.0 click-8.1.4 colorama-0.4.6 contourpy-1.1.0 cpm_kernels-1.0.11 cycler-0.11.0 exceptiongroup-1.1.2 fastapi-0.100.0 ffmpy-0.3.0 filelock-3.12.2 fonttools-4.40.0 frozenlist-1.3.3 fsspec-2023.6.0 gradio-3.36.1 gradio-client-0.2.7 h11-0.14.0 httpcore-0.17.3 httpx-0.24.1 huggingface-hub-0.16.4 idna-3.4 jinja2-3.1.2 jsonschema-4.18.0 jsonschema-specifications-2023.6.1 kiwisolver-1.4.4 latex2mathml-3.76.0 linkify-it-py-2.0.2 markdown-3.4.3 markdown-it-py-2.2.0 markupsafe-2.1.3 matplotlib-3.7.2 mdit-py-plugins-0.3.3 mdtex2html-1.2.0 mdurl-0.1.2 mpmath-1.3.0 multidict-6.0.4 networkx-3.1 numpy-1.25.0 orjson-3.9.2 packaging-23.1 pandas-2.0.3 pillow-10.0.0 protobuf-4.23.4 psutil-5.9.5 pydantic-2.0.2 pydantic-core-2.1.2 pydub-0.25.1 pygments-2.15.1 pyparsing-3.0.9 python-dateutil-2.8.2 python-multipart-0.0.6 pytz-2023.3 pyyaml-6.0 referencing-0.29.1 regex-2023.6.3 requests-2.31.0 rpds-py-0.8.8 semantic-version-2.10.0 sentencepiece-0.1.99 six-1.16.0 sniffio-1.3.0 starlette-0.27.0 sympy-1.12 tokenizers-0.13.3 toolz-0.12.0 torch-2.0.1 tqdm-4.65.0 transformers-4.27.1 typing-extensions-4.7.1 tzdata-2023.3 uc-micro-py-1.0.2 urllib3-2.0.3 uvicorn-0.22.0 websockets-11.0.3 yarl-1.9.2

[notice] A new release of pip available: 22.2.2 -> 23.1.2
[notice] To update, run: python.exe -m pip install --upgrade pip
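
Once the install finishes, it can be worth confirming what actually ended up in the environment. The following is a minimal sketch using only the standard library; the package list is taken from the top-level "Collecting" lines of the log above, and the helper name installed_versions is my own, not part of ChatGLM-6B.

```python
from importlib import metadata

# Top-level packages as they appear at the start of the pip log above.
REQUIRED = [
    "protobuf", "transformers", "cpm_kernels", "torch",
    "gradio", "mdtex2html", "sentencepiece", "accelerate",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED).items():
    print(f"{name:15} {version or 'NOT INSTALLED'}")
```

On the machine used for this post, transformers should show 4.27.1, since requirements.txt pins that version exactly.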

Pre-trained model files

Download the INT4-quantized pre-trained model files from: https://huggingface.co/THUDM/chatglm-6b-int4/tree/main . Note that the GitHub page also gives an official Tsinghua Cloud download address for the model, but that location only contains the pre-trained weights, that is, the .bin file. Running ChatGLM-6B also requires the model's configuration files, such as config.json, so use the HuggingFace page, which has all of them.
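
To see why the INT4 variant is the one to pick for a small graphics card, a rough back-of-the-envelope estimate helps. The sketch below only counts the memory needed to hold the weights. The figure of about 6.2 billion parameters is an assumption taken from the project README, not from this post, and the function name is my own.

```python
def weight_memory_gib(n_params, bits_per_weight):
    """Approximate memory needed to hold the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumption: about 6.2 billion parameters, the figure given in the project README.
N_PARAMS = 6.2e9

for label, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{label}: about {weight_memory_gib(N_PARAMS, bits):.1f} GiB for weights")
```

Under these assumptions the INT4 weights come to just under 3 GiB. Activations, the conversation history cache, and any layers left unquantized add to that, so treat the result as a lower bound rather than a guarantee that the model fits.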

Download all of the files from HuggingFace and save them to a local directory. I saved them in: E:\ikbp\chatglm-6b-int4
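
Since a missing config.json is the failure mode described above, a quick check of the download directory before loading can save a confusing error later. This is a minimal sketch: missing_model_files and load_chatglm are hypothetical helper names of my own, the check only covers the two kinds of file this post mentions, and the loading call follows the usage pattern shown in the ChatGLM-6B README. It assumes a working CUDA build of torch.

```python
from pathlib import Path


def missing_model_files(model_dir):
    """Return a list of problems found in a local model directory."""
    root = Path(model_dir)
    if not root.is_dir():
        return [f"directory not found: {root}"]
    problems = []
    if not (root / "config.json").is_file():
        problems.append("config.json is missing")
    if not any(root.glob("*.bin")):
        problems.append("no .bin weight file found")
    return problems


def load_chatglm(model_dir):
    """Load tokenizer and model from a local directory (needs a CUDA GPU)."""
    # Imported lazily so the file check above works without transformers.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_dir, trust_remote_code=True).half().cuda()
    return tokenizer, model.eval()


# Example usage, commented out because it needs the downloaded files and a GPU:
# model_dir = r"E:\ikbp\chatglm-6b-int4"
# if not missing_model_files(model_dir):
#     tokenizer, model = load_chatglm(model_dir)
#     response, history = model.chat(tokenizer, "你好", history=[])
#     print(response)
```

Passing the local directory instead of the hub name THUDM/chatglm-6b-int4 makes transformers read the files already on disk rather than downloading them again.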

Win+GPU deployment solution

Prerequisites for the Win+GPU solution

To deploy the GPU version of ChatGLM-6B, you need a CUDA-enabled build of torch. If torch is not installed yet, install it with:

pip install torch

Then check from Python whether your torch build is the right one, for example by importing torch and printing torch.cuda.is_available().

If the output is True, then congratulations, you have installed the CUDA version of torch. Note that with an NVIDIA graphics card you also need to download and install CUDA and cuDNN for this to succeed; tutorials for that part are easy to find online.
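
The torch check can be sketched in Python as follows. The function name torch_cuda_report is my own; torch.version.cuda and torch.cuda.is_available() are standard torch attributes. The sketch also handles the case where torch is not installed at all, so it runs anywhere.

```python
import importlib.util


def torch_cuda_report():
    """Summarize whether the installed torch build can use CUDA."""
    if importlib.util.find_spec("torch") is None:
        return {
            "torch_installed": False,
            "torch_version": None,
            "built_with_cuda": None,
            "cuda_available": False,
        }
    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None when the wheel is a CPU-only build.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


print(torch_cuda_report())
```

If cuda_available is False and built_with_cuda is None, the installed wheel is a CPU-only build, and no amount of driver setup will change that; a CUDA-enabled torch build has to be installed instead.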

Download and install CUDA and cuDNN

Detailed explanation of installing CUDA and cuDNN on Windows 10

Download and install CUDA

Download and install cuDNN

  • Download cuDNN
    Download address: https://developer.nvidia.com/rdp/cudnn-download

Choose the appropriate version to download. Note: be sure to choose the version that matches the CUDA version you installed.
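
To find out which CUDA release is installed before picking a cuDNN download, one option is to read it from the CUDA compiler. This is a minimal sketch that assumes the CUDA Toolkit put nvcc on the PATH; the helper names are my own. It returns None when nvcc cannot be found.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from the 'release X.Y' part of nvcc --version output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def installed_cuda_release():
    """Return the installed CUDA release as (major, minor), or None if unknown."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


print(installed_cuda_release())
```

The major version printed here is the one to match against the CUDA version listed next to each cuDNN download.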


Appendix

https://online2023.worldaic.com.cn/exhibition
ChatGLM-6B model - Windows+6GB graphics card local deployment
ChatGLM-6B/blob/main/requirements.txt
THUDM chatglm-6b-int4
Detailed explanation of installing CUDA and cuDNN on Windows 10
CUDA Toolkit 12.2 Downloads
cuDNN Archive


Origin blog.csdn.net/u010638673/article/details/131610607