AI Technology Newsletter: Tsinghua Open-Sources ChatGLM2, a Bilingual Dialogue Language Model


ChatGLM2-6B is an open-source project that provides the code and resources for the ChatGLM2-6B model. Here is an introduction to the project:

Paper (GLM): https://arxiv.org/pdf/2103.10360.pdf

ChatGLM2-6B is an open-source bilingual (Chinese-English) dialogue language model and the second-generation version of ChatGLM-6B. It retains the smooth conversation flow and low deployment barrier of the original model while introducing new features and improvements.

ChatGLM2-6B has the following features:

Stronger performance: ChatGLM2-6B uses the hybrid objective function of GLM and has undergone large-scale pre-training followed by human preference alignment training. Evaluation results show that, on multiple datasets, ChatGLM2-6B significantly outperforms the original model and is strongly competitive.

Longer context: By introducing FlashAttention, ChatGLM2-6B extends the context length of the base model from ChatGLM-6B's 2K tokens to 32K, and uses an 8K context length during dialogue-stage training. This enables ChatGLM2-6B to process much longer contextual information.

More efficient inference: Based on Multi-Query Attention, ChatGLM2-6B achieves faster inference and lower memory usage. With the official model implementation, inference speed is 42% faster than the original model, and under INT4 quantization, the dialogue length supported by 6 GB of GPU memory grows from 1K to 8K tokens (a loading sketch follows the feature list below).

Open license: The ChatGLM2-6B weights are fully open for academic research, and free commercial use is also permitted after registering by filling out a questionnaire.
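
As a rough sketch of the deployment options mentioned above, loading the model might look like this. It follows the usage documented in the official repository's README; note that the `quantize` helper is defined in the model's own remote code (hence `trust_remote_code=True`), not in the core transformers API:

```python
from transformers import AutoModel, AutoTokenizer

# Load the tokenizer and model from the Hugging Face Hub.
# trust_remote_code=True is required because ChatGLM2-6B ships its own
# modeling code alongside the weights.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)

# Full-precision (FP16) load on the GPU.
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).half().cuda()

# Alternatively, INT4 quantization -- the configuration behind the
# "8K dialogue on 6 GB of GPU memory" figure above:
# model = AutoModel.from_pretrained(
#     "THUDM/chatglm2-6b", trust_remote_code=True
# ).quantize(4).cuda()

model = model.eval()
```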

Source code: https://github.com/THUDM/ChatGLM2-6B
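
Once loaded, multi-turn dialogue goes through the model's `chat()` helper (also defined in the repository's remote modeling code). A minimal usage sketch, continuing from the loading code above; the prompts are illustrative:

```python
# First turn: empty history.
response, history = model.chat(tokenizer, "Hello, please introduce yourself.", history=[])
print(response)

# Follow-up turn: pass the returned history so the model keeps context.
response, history = model.chat(tokenizer, "What tasks are you good at?", history=history)
print(response)
```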


Original article: https://blog.csdn.net/weixin_41194129/article/details/132031361