Zhipu AI official announcement: ChatGLM2-6B is free for commercial use

The development team behind the Chinese-English bilingual large model ChatGLM2-6B, Zhipu AI and Tsinghua KEG, announced last night that the weights of ChatGLM-6B and ChatGLM2-6B are now fully open for academic research, and that free commercial use is permitted after completing enterprise registration and obtaining authorization.

ChatGLM2-6B is the second-generation version of the open-source Chinese-English bilingual dialogue model ChatGLM-6B. While retaining many strengths of the first-generation model, such as fluent dialogue and a low deployment threshold, ChatGLM2-6B introduces the following new features:

  • More powerful performance: Building on the development experience of the first-generation ChatGLM model, the base model of ChatGLM2-6B has been fully upgraded. ChatGLM2-6B uses GLM's hybrid objective function and was pre-trained on 1.4T Chinese and English tokens, followed by human preference alignment training. Evaluation results show that, compared with the first-generation model, ChatGLM2-6B improves substantially on datasets such as MMLU (+23%), CEval (+33%), GSM8K (+571%), and BBH (+60%), making it highly competitive among open-source models of the same size.
  • Longer context: Based on FlashAttention, the project team extended the context length of the base model from 2K in ChatGLM-6B to 32K, and trained with an 8K context length in the dialogue stage, allowing more rounds of dialogue. However, the current version of ChatGLM2-6B has limited ability to understand very long documents in a single round, and the team says it will focus on optimizing this in subsequent iterations.
  • More efficient inference: Based on Multi-Query Attention, ChatGLM2-6B offers faster inference and lower GPU memory usage. With the official model implementation, inference is 42% faster than the first generation, and under INT4 quantization the dialogue length supported by 6 GB of GPU memory increases from 1K to 8K.
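
To give a rough sense of why Multi-Query Attention lowers memory usage, the sketch below estimates the size of the key/value cache for one sequence under standard multi-head attention and under strict multi-query attention (a single shared key/value head). The layer count, head count, head dimension, and FP16 storage are illustrative assumptions and are not taken from the announcement. Real implementations may share keys and values across a small group of heads rather than exactly one, so the actual saving can differ.

```python
def kv_cache_bytes(seq_len, num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Bytes needed to cache keys and values for a single sequence.

    The leading factor 2 accounts for storing one key tensor and one
    value tensor per layer. bytes_per_value=2 assumes FP16 storage.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model shape, chosen only for illustration.
NUM_LAYERS = 28
NUM_QUERY_HEADS = 32
HEAD_DIM = 128
SEQ_LEN = 8192  # an 8K-token dialogue

# Multi-head attention: every query head has its own key/value head.
mha_bytes = kv_cache_bytes(SEQ_LEN, NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM)

# Strict multi-query attention: all query heads share one key/value head.
mqa_bytes = kv_cache_bytes(SEQ_LEN, NUM_LAYERS, 1, HEAD_DIM)

GIB = 2 ** 30
print(f"multi-head KV cache:  {mha_bytes / GIB:.2f} GiB")
print(f"multi-query KV cache: {mqa_bytes / GIB:.4f} GiB")
print(f"reduction factor:     {mha_bytes // mqa_bytes}x")
```

Under these assumptions the cache shrinks in proportion to the number of key/value heads. That is the kind of saving that lets a longer dialogue fit in the same amount of GPU memory, although the 1K to 8K figure quoted above also depends on quantization and other implementation details.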

Example comparison

Compared with the first-generation model, ChatGLM2-6B has improved along several dimensions. The following are some comparison examples.

[Comparison figures not reproduced here: mathematical logic, knowledge reasoning, long document comprehension]

Origin www.oschina.net/news/249475