A ChatGPT replacement on the way? This article rounds up the open-source alternatives to ChatGPT


In 2023, there seem to be only two camps left in the field of chatbots: "OpenAI's ChatGPT" and "others".

With the recent release of GPT-4, ChatGPT has gained even stronger reasoning and multimodal capabilities, and it is all but certain that OpenAI will not open source it.

OpenAI shared a large number of GPT-4 benchmark and test results, but disclosed essentially nothing about the training data, the cost, or the method used to build the model.


The "others" camp, though lagging in performance, has been pushing hard on open source.

Some large models currently available for download in the industry:

  • GLM-10B/130B: bilingual (Chinese-English) bidirectional dense models.
  • OPT-2.7B/13B/30B/66B: open-source pretrained language models from Meta.
  • LLaMA-7B/13B/30B/65B: open-source foundation large language models from Meta.
  • Alpaca (LLaMA-7B): a strong, reproducible instruction-following model. Its seed tasks and collected data are all in English, so the trained model is not optimized for Chinese.
  • Alpaca-LoRA: Stanford Alpaca fine-tunes the entire LLaMA model, whereas Alpaca-LoRA uses LoRA to freeze the original LLaMA parameters, add extra adapter layers, and train only those newly added parameters. Because the new parameters are few, the cost of fine-tuning drops sharply while the result stays close to full-model fine-tuning.
  • BELLE (BLOOMZ-7B/LLaMA-7B): based on Stanford Alpaca and optimized for Chinese; model tuning uses only data produced by ChatGPT (no other data).
  • ChatGLM-6B: an open bilingual (Chinese-English) conversational language model.


1. Stanford releases Alpaca 7B, performance comparable to GPT-3.5

Facing the dominance of large models such as ChatGPT, open-source replacements are a good option.

At the end of February, Meta "open sourced" a new large model series - LLaMA (Large Language Model Meta AI), with parameters ranging from 7 billion to 65 billion. The 13 billion parameter LLaMA model outperforms the 175 billion parameter GPT-3 "on most benchmarks" and can run on a single V100 GPU.

Portal: Meta releases its latest large model LLaMA: smaller in parameter scale, yet a single-card model can outperform GPT-3


Figure: a timeline of LLaMA milestones

Translation of the figure:
- February 24: LLaMA is released under a non-commercial license to researchers and entities in government, the community, and academia;
- March 2: a 4chan user leaks all of the LLaMA models;
- March 10: Georgi Gerganov creates the llama.cpp tool, which can run LLaMA on Macs with M1/M2 chips;
- March 11: via llama.cpp, the 7B model runs on a 4GB Raspberry Pi, though slowly, at about 10 seconds per token;
- March 12: LLaMA 7B runs successfully on NPX, a node.js execution tool;
- March 13: llama.cpp runs on a Pixel 6 phone;
- March 14: Stanford's Alpaca is released.

On March 14, Stanford released Alpaca, a new 7-billion-parameter model fine-tuned from LLaMA 7B. The team used the technique introduced in the Self-Instruct paper, with some modifications, to generate 52K instruction-following examples. In a preliminary human evaluation, the Alpaca 7B model performed similarly to text-davinci-003 (GPT-3.5) on the Self-Instruct evaluation set.
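For readers curious what Self-Instruct-style data generation looks like in practice, here is a minimal, purely illustrative sketch using the legacy OpenAI completion API of that era (text-davinci-003); the seed tasks, prompt template, and helper names are made up for illustration and are not Stanford's actual code.

```python
# Illustrative sketch of Self-Instruct-style data generation (not Stanford's actual code).
# Assumes the legacy openai-python 0.x Completion API and a text-davinci-003-style model.
import json
import openai

openai.api_key = "sk-..."  # your API key

# A few human-written seed tasks bootstrap the generation (Alpaca started from 175 seed tasks).
seed_tasks = [
    {"instruction": "Give three tips for staying healthy.", "input": "", "output": "..."},
    {"instruction": "Rewrite the sentence in a formal tone.", "input": "hey, what's up?", "output": "..."},
]

def build_prompt(examples):
    """Ask the model to continue the list with a new instruction/input/output triple."""
    header = "Come up with a new task following the format of the examples.\n\n"
    body = "\n\n".join(
        f"Instruction: {e['instruction']}\nInput: {e['input']}\nOutput: {e['output']}"
        for e in examples
    )
    return header + body + "\n\nInstruction:"

new_examples = []
for _ in range(10):  # Alpaca generated ~52K examples; 10 here just for illustration
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=build_prompt(seed_tasks),
        max_tokens=512,
        temperature=1.0,
    )
    new_examples.append(resp["choices"][0]["text"])

with open("generated_tasks.json", "w") as f:
    json.dump(new_examples, f, ensure_ascii=False, indent=2)
```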

In terms of parameter scale, Alpaca is far smaller than text-davinci-003; a lightweight 7B language model can even run on mobile devices.

In terms of performance, the research team's tests showed that Alpaca's outputs were solid and reflected the general style of the instruction-tuning dataset. For example, Alpaca usually gives more concise answers than ChatGPT, similar to text-davinci-003. Still, Alpaca exhibits several common failure modes of language models, including hallucination, toxicity, and stereotyping.

In terms of cost, the Stanford team generated the dataset with OpenAI's API for less than $500. Fine-tuning a 7B LLaMA model takes about 3 hours on eight 80GB A100s, which costs less than $100 at most cloud providers. In total, the team reproduced an AI model with performance comparable to GPT-3.5 for under $600.

In terms of deployment, Alpaca, fine-tuned from LLaMA, is easy to deploy locally. No discrete graphics card? Apple laptops, Raspberry Pis, and even mobile phones can run it.

Finally, the team open sourced both the dataset (generated for less than $500 and with higher data diversity than the original Self-Instruct paper) and the code, so anyone can now fine-tune a capable dialogue AI: replicating a GPT-3.5-level model is cheap, easy, and small.


Project address:
https://github.com/tatsu-lab/stanford_alpaca

Trial address:
https://alpaca-ai-custom6.ngrok.io/

Demo address:
https://crfm.stanford.edu/alpaca/

2. To make up for Stanford Alpaca's shortcomings in Chinese, the Chinese large model BELLE is open sourced

Alpaca's seed tasks and collected data are all in English, so the trained model is not optimized for Chinese. To improve dialogue quality in Chinese, BELLE (Bloom-Enhanced Large Language model Engine), an open-source 7-billion-parameter Chinese dialogue model, has arrived.

BELLE is built on Stanford Alpaca, optimized for Chinese, with some modifications to the data-generation code. Model tuning uses only data produced by ChatGPT (no other data).


Project address:
https://github.com/LianjiaTech/BELLE

In terms of data, the project has open sourced its Alpaca-based data-collection code. Using this code, it generated about 1 million Chinese examples which, combined with 50,000 English examples from Alpaca, were used to train a BLOOMZ-7B model whose checkpoint has been uploaded to Hugging Face.

Hugging Face address:
https://huggingface.co/BelleGroup
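As a rough idea of how to try these checkpoints, here is a minimal sketch of loading a BELLE checkpoint with Hugging Face transformers; the model id BelleGroup/BELLE-7B-2M and the prompt template are assumptions, so check the organization page above for the actual released names and usage.

```python
# Minimal sketch: load a BELLE checkpoint from the BelleGroup Hugging Face organization.
# The model id "BelleGroup/BELLE-7B-2M" is an assumption; check the org page for released names.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "BelleGroup/BELLE-7B-2M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# BELLE follows an Alpaca-style Human/Assistant prompt; the exact template may differ per checkpoint.
prompt = "Human: 请介绍一下北京的三个景点。\n\nAssistant: "
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```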

The project also trains the model on instruction-learning datasets of different sizes (200,000, 600,000, 1 million, and 2 million samples) and releases the resulting model versions.

The project also trains and tunes a LLaMA-7B-based model on the corresponding datasets, which is now open as well.

For the specific model training method, refer to the Hugging Face Transformers examples; for the SFT method, refer to Alpaca's training code.
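For reference, a minimal sketch of what such SFT training could look like with the Hugging Face Trainer is shown below; the base model, data file, and hyperparameters are illustrative assumptions rather than the project's actual configuration.

```python
# Minimal sketch of causal-LM supervised fine-tuning (SFT) with the Hugging Face Trainer.
# Base model, dataset path, and hyperparameters are illustrative, not the project's real config.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "bigscience/bloomz-7b1"  # assumption: a BLOOMZ base, as BELLE uses BLOOMZ-7B
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Instruction data stored as JSON records with "instruction" and "output" fields (assumed schema).
ds = load_dataset("json", data_files="belle_train.json")["train"]

def tokenize(example):
    text = f"Human: {example['instruction']}\n\nAssistant: {example['output']}"
    return tokenizer(text, truncation=True, max_length=1024)

ds = ds.map(tokenize, remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="belle-sft", per_device_train_batch_size=2,
                           num_train_epochs=2, learning_rate=2e-5, bf16=True),
    train_dataset=ds,
    # mlm=False makes the collator build causal-LM labels from the input ids.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```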

3. China's homegrown large model ChatGLM-6B opens for internal testing

On March 14, Zhipu AI, a company built on Tsinghua University research achievements, open sourced a new member of the GLM model series: the Chinese-English bilingual dialogue model ChatGLM-6B, and at the same time opened an internal test (test application website: http://chatglm.cn). Note that ChatGLM currently supports at most 5 question-answer exchanges per conversation, with at most 1,000 words of input each time.

ChatGLM-6B is based on the General Language Model (GLM) architecture and has 6.2 billion parameters. Combined with model quantization, users can deploy it locally on a consumer-grade graphics card.
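As a rough idea of what local deployment looks like, here is a minimal sketch following the usage pattern documented in the project's README (INT4 quantization on a consumer GPU); treat the exact calls as a sketch and verify against the repository.

```python
# Minimal sketch: run ChatGLM-6B locally with INT4 quantization on a consumer GPU.
# Follows the usage pattern described in the THUDM/ChatGLM-6B README; verify against current docs.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = (
    AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
    .half()       # FP16 weights
    .quantize(4)  # INT4 quantization, roughly 6 GB of GPU memory
    .cuda()
    .eval()
)

# The repository exposes a chat() helper that keeps multi-turn history.
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
response, history = model.chat(tokenizer, "如何提高睡眠质量?", history=history)
print(response)
```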

Specifically, ChatGLM-6B has the following characteristics:

  • Sufficient bilingual pretraining in Chinese and English: ChatGLM-6B is pretrained on 1T tokens of Chinese and English corpora at a 1:1 ratio, giving it bilingual ability.
  • Optimized model architecture and size: with 6.2 billion parameters, it can be deployed on consumer graphics cards.
  • Longer sequence length: compared with GLM-10B (sequence length 1024), ChatGLM-6B has a sequence length of 2048, supporting longer conversations and applications.
  • Human intent alignment training: supervised fine-tuning, feedback bootstrapping, reinforcement learning from human feedback (RLHF), and other methods give the model an initial ability to understand human instruction intent.

Therefore, ChatGLM-6B has better dialogue and question answering ability under certain conditions.

At the same time, ChatGLM-6B also has shortcomings: limited model capacity, the potential for harmful or biased content, weak multi-turn dialogue ability, limited English proficiency, and susceptibility to being misled.

However, some people online have speculated that the model was trained on ChatGPT-generated data with some Chinese corpora added.

Project address:
https://github.com/THUDM/ChatGLM-6B

Official blog:
https://chatglm.cn/blog

4. The Chinese Alpaca model Luotuo is open source

Alpaca is a model fine-tuned by the Stanford team from LLaMA 7B on 52K instruction-following examples, and it adapts well to a variety of natural language applications.

Stanford Alpaca is a practical and cheap fine-tuning recipe for ordinary researchers, but it still requires a considerable amount of compute. Moreover, Alpaca's seed tasks and collected data are all in English, so the trained model is not optimized for Chinese.

To further reduce the cost of fine-tuning, another Stanford researcher, Eric J. Wang, used LoRA (low-rank adaptation) to reproduce Alpaca's results. Specifically, he trained a model on par with Alpaca in only 5 hours on a single RTX 4090 graphics card, bringing the compute requirements of such models down to the consumer level. The model can also run on a Raspberry Pi (for research purposes).
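To make the idea concrete, here is a minimal sketch of LoRA fine-tuning with the Hugging Face peft library; the base checkpoint id, target modules, and hyperparameters are illustrative assumptions, not necessarily what Alpaca-LoRA uses.

```python
# Minimal sketch: LoRA fine-tuning with Hugging Face peft.
# The frozen base weights stay untouched; only small low-rank adapter matrices are trained.
import torch
from peft import LoraConfig, get_peft_model
from transformers import LlamaForCausalLM, LlamaTokenizer

base = "decapoda-research/llama-7b-hf"  # assumption: a converted LLaMA-7B checkpoint
tokenizer = LlamaTokenizer.from_pretrained(base)
model = LlamaForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto")

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # attach adapters to the attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the 7B parameters

# From here, training proceeds with a standard Trainer loop over the 52K Alpaca instructions.
```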


Figure: the Alpaca-LoRA project released by Eric J. Wang

Project address:
https://github.com/tloen/alpaca-lora

This is well suited to researchers who want to train their own ChatGPT-style models (including Chinese ones) but lack top-tier computing resources.

After Alpaca-LoRA appeared, and to improve dialogue quality in Chinese, researchers from SenseTime and Huazhong University of Science and Technology open sourced the Chinese language model Luotuo ("camel"), built on LLaMA, Stanford Alpaca, Alpaca-LoRA, Japanese-Alpaca-LoRA, and related work; training and deployment can be completed on a single card.

A small aside: the model is named "camel" because both the llama (LLaMA) and the alpaca (Alpaca) belong to the camel family, Camelidae, in the order Artiodactyla.


Project address:
https://github.com/LC1332/Chinese-alpaca-lora

At present, the project has released its training corpus and model weight files (two models: luotuo-lora-7b-0.1 and luotuo-lora-7b-0.3), so developers can train their own language models on corpora of various sizes and apply them to their own vertical domains.
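A minimal sketch of how such LoRA weights might be applied on top of a LLaMA-7B base model for inference with peft is shown below; the adapter id silk-road/luotuo-lora-7b-0.1 and the base checkpoint id are assumptions, so check the project README for the actual weight locations.

```python
# Minimal sketch: apply released Luotuo LoRA weights on top of a LLaMA-7B base model.
# The adapter id "silk-road/luotuo-lora-7b-0.1" is an assumption; see the project README.
import torch
from peft import PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer

base = "decapoda-research/llama-7b-hf"  # assumption: a converted LLaMA-7B checkpoint
tokenizer = LlamaTokenizer.from_pretrained(base)
model = LlamaForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, "silk-road/luotuo-lora-7b-0.1")

prompt = "中国的首都是哪里?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```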


The current Chinese Alpaca-LoRA model, Luotuo, can handle simple Chinese dialogue and question answering, and developers can improve it by introducing data from other domains.

However, there is still a gap between luotuo-lora-7b-0.1 and luotuo-lora-7b-0.3: when asked for the address of Central China Normal University, version 0.1 gave a wrong answer.


5. Claude, ChatGPT's strongest competitor, opens its API

Anthropic, an AI startup founded by former OpenAI employees (including core members of the GPT-3 team), has released "Claude", an AI assistant similar to ChatGPT. Anthropic says it aims for "safer" and "less harmful" AI, though at a higher cost. Two versions are currently available: Claude and Claude Instant. The initial version of Claude, however, cannot access the Internet.

After OpenAI's first deal with Microsoft in 2019 and its growing focus on commercial applications, 11 employees who left OpenAI "after disagreeing with the company's direction" founded Anthropic.

Shortly after publishing the paper "Constitutional AI: Harmlessness from AI Feedback" in December last year, Anthropic launched its own chatbot Claude, but it had no open interface and could only be experienced inside applications built by its partners.

Claude is regarded as ChatGPT's strongest competitor. Like ChatGPT, it has excellent conversational skills and can handle tasks such as summarization, search, creative and collaborative writing, question answering, and code writing.

On March 14, Anthropic opened the Claude API with two versions of the model: Claude and Claude Instant. Claude Instant has lower latency and slightly weaker performance, and is cheaper than the full Claude-v1. Both models have a context window of 9,000 tokens (about 5,000 words, or 15 pages).

  • Claude Instant: a faster, cheaper model that handles a range of tasks including casual conversation, text analysis, summarization, and document question answering. Pricing: input $0.43 per million characters, output $1.45 per million characters.
  • Claude-v1: Anthropic's best model so far, which excels at complex dialogue and creative content generation. Pricing: input $2.90 per million characters, output $8.60 per million characters.

Friendly reminder: The price of ChatGPT 3.5 (gpt-3.5-turbo interface) is $2.7 per million tokens, which is about the same price as the Claude-v1 version.
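For developers, a minimal sketch of calling the newly opened Claude API with the early anthropic Python client (the completion-style interface of that time) might look like the following; treat the exact client calls as assumptions and consult Anthropic's documentation, since the client has since moved on.

```python
# Minimal sketch: call the Claude API with the early anthropic Python client (completion-style).
# Later versions of the client replaced this with a Messages API; check Anthropic's current docs.
import anthropic

client = anthropic.Client(api_key="sk-ant-...")  # your Anthropic API key

resp = client.completion(
    model="claude-v1",  # or "claude-instant-v1" for the faster, cheaper model
    prompt=(
        f"{anthropic.HUMAN_PROMPT} Summarize the difference between Claude and Claude Instant."
        f"{anthropic.AI_PROMPT}"
    ),
    stop_sequences=[anthropic.HUMAN_PROMPT],
    max_tokens_to_sample=300,
)
print(resp["completion"])
```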


Both Claude and ChatGPT rely on reinforcement learning (RL) to train a preference model, and the preferred replies are used for subsequent fine-tuning, but the specific development methods differ. Claude uses a training technique Anthropic calls Constitutional AI (CAI).

CAI builds on RLHF, except that CAI's ranking process uses models (rather than humans) to provide an initial ranking for all generated outputs.
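To illustrate the difference, here is a highly simplified, purely conceptual sketch of the AI-feedback ranking step: a feedback model (represented here by stub functions, all of them hypothetical placeholders rather than Anthropic's implementation) scores candidate replies against constitutional principles and produces the initial ranking that would otherwise come from human labelers.

```python
# Purely illustrative sketch of AI-feedback ranking in Constitutional AI (RLAIF).
# All functions are hypothetical stand-ins, not Anthropic's actual implementation.
from typing import List, Tuple

CONSTITUTION = [
    "Choose the response that is most helpful to the user.",
    "Choose the response that is least harmful or toxic.",
]

def generate_candidates(prompt: str, n: int = 4) -> List[str]:
    """Stand-in for sampling n candidate replies from the policy model."""
    return [f"candidate reply {i} to: {prompt}" for i in range(n)]

def ai_preference_score(prompt: str, reply: str, principles: List[str]) -> float:
    """Stand-in for asking a feedback model how well a reply satisfies the principles.
    In CAI this judgment comes from a language model, not a human labeler."""
    return float(len(reply) % 7)  # placeholder scoring

def rank_with_ai_feedback(prompt: str) -> List[Tuple[str, float]]:
    """Produce an initial ranking of candidates using AI feedback instead of human labels.
    The ranked pairs would then train a preference model for the RL fine-tuning stage."""
    candidates = generate_candidates(prompt)
    scored = [(c, ai_preference_score(prompt, c, CONSTITUTION)) for c in candidates]
    return sorted(scored, key=lambda x: x[1], reverse=True)

if __name__ == "__main__":
    for reply, score in rank_with_ai_feedback("How do I pick a strong password?"):
        print(f"{score:.1f}  {reply}")
```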


Experience address: https://www.anthropic.com/product

Since late last year, Claude has been in quiet closed beta with launch partners including AssemblyAI, DuckDuckGo, Notion, Quora, and Robin AI, powering products such as DuckDuckGo's DuckAssist tool, Quora's AI chat app Poe, and the AI writing assistant Notion AI.

Claude is also working closely with external partners across Q&A communities, online education, the legal field, digital media, and more.

- Quora: offers Claude's conversational ability to users through the AI chat app Poe.
- Juni Learning: an online education provider that uses Claude to enhance its Discord Juni Tutor Bot, improving students' academic performance through online tutoring.
- Notion: partners with Claude to boost productivity in work and study scenarios.
- DuckDuckGo: a privacy-focused browser that partnered with Anthropic to launch DuckAssist, a new tool that automatically extracts and summarizes information from Wikipedia to answer users' questions.
- Robin AI: a legal software provider that uses Claude to evaluate specific parts of contracts and suggest new, more client-friendly alternative language.
- AssemblyAI: a digital media company; Claude helps its API platform transcribe and understand audio data at scale.

Claude's main advantages: it is more controllable, more honest, and less harmful than ChatGPT.

Customers in the closed beta reported that Claude produces little to no harmful output, is easier to converse with, is easier to steer, does not require elaborately worded prompts, and can be prompt-engineered for persona, tone, and behavior.

You are welcome to follow my personal WeChat official account, HsuDan, where I share more of my learning experience, pitfalls to avoid, interview experience, and the latest AI news.

Reference:
https://crfm.stanford.edu/2023/03/13/alpaca.html
https://www.thepaper.cn/newsDetail_forward_22302856
https://www.ithome.com/0/681/614.htm
https://replicate.com/blog/fine-tune-alpaca-with-lora?continueFlag=4ecae39885197a5c008faabbefb5c824
https://www.thepaper.cn/newsDetail_forward_22455425
https://zhuanlan.zhihu.com/p/616929229
