FreeWilly2: an open-source language model fine-tuned from LLaMA 2

FreeWilly2, released by Stability AI, is fine-tuned from the Llama 2 70B language model. Part of its reasoning ability surpasses OpenAI's GPT-3.5, and as of press time the model sits at the top of the HuggingFace open-source language model leaderboard, with language-model loading tools actively adding support for it.

It looks like the open-source language model landscape is finally about to change; after all, new techniques keep emerging one after another. As the site owner put it, it is only a matter of time before open models surpass OpenAI, which builds behind closed doors and is no longer open.

Model description

FreeWilly2 is a Llama 2 70B model fine-tuned on an Orca-style dataset. Stability AI and its CarperAI lab are proud to announce FreeWilly1 and its successor FreeWilly2, two powerful new open-access large language models (LLMs). Both models demonstrate excellent reasoning capabilities across a variety of benchmarks. FreeWilly1 leverages the original LLaMA 65B base model and fine-tunes it on a new synthetic dataset in the standard Alpaca format. Similarly, FreeWilly2 leverages the LLaMA 2 70B base model, and its performance compares favorably with GPT-3.5 on some tasks.
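
The weights are published on Hugging Face, so the model can be loaded with the transformers library. Below is a minimal inference sketch; the repository id stabilityai/FreeWilly2 and the "### System / ### User / ### Assistant" prompt layout are assumptions based on the model card at release and should be checked against the current card.

```python
# Minimal inference sketch for FreeWilly2 (repo id and prompt format are assumptions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/FreeWilly2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # the 70B model needs multiple GPUs or CPU offloading
    device_map="auto",
)

system = "You are FreeWilly, a helpful and harmless assistant."
user = "Explain the difference between supervised fine-tuning and RLHF in two sentences."
prompt = f"### System:\n{system}\n\n### User:\n{user}\n\n### Assistant:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs, max_new_tokens=256, do_sample=True, top_p=0.95, temperature=0.7
)
# Decode only the newly generated tokens after the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```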

Data generation and collection

Training of the FreeWilly models was directly inspired by the methodology pioneered by Microsoft in its paper "Orca: Progressive Learning from Complex Explanation Traces of GPT-4". While our data generation process is similar, our data sources differ.

Our dataset variant contains 600,000 data points (approximately 10% of the size of the dataset used in the original Orca paper) and was generated by prompting language models with instructions from the following high-quality instruction datasets created by Enrico Shippole (a sketch of this generation step follows the list):

  1. COT Submix Original dataset
  2. NIV2 Submix Original dataset
  3. FLAN 2021 Submix Original dataset
  4. T0 Submix Original dataset

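To make the Orca-style generation step concrete, here is a rough sketch of the idea (not Stability AI's actual pipeline): instructions from a source dataset are paired with reasoning-oriented system prompts and sent to a teacher model, and the responses are collected as training examples. The `query_teacher` helper and the `"inputs"` field name are hypothetical placeholders.

```python
# Sketch of Orca-style data generation: a teacher LLM answers instructions from an
# existing instruction dataset while following a reasoning-oriented system prompt.
import json
from datasets import load_dataset

SYSTEM_PROMPTS = [
    "You are a helpful assistant. Think step by step and explain your reasoning.",
    "You are a teacher. Answer the question and justify every step of your answer.",
]

def query_teacher(system_prompt: str, instruction: str) -> str:
    """Placeholder for a call to the teacher model (any chat-completion API)."""
    return "<teacher model response goes here>"

def build_orca_style_examples(dataset_name: str, split: str = "train", limit: int = 1000):
    source = load_dataset(dataset_name, split=split)  # e.g. one of the FLAN submixes
    examples = []
    for i, row in enumerate(source):
        if i >= limit:
            break
        system_prompt = SYSTEM_PROMPTS[i % len(SYSTEM_PROMPTS)]
        examples.append({
            "system_prompt": system_prompt,
            "instruction": row["inputs"],  # field name depends on the source dataset
            "response": query_teacher(system_prompt, row["inputs"]),
        })
    return examples

def save_jsonl(examples, path: str):
    # Write one JSON object per line, the usual format for fine-tuning data.
    with open(path, "w") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")
```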
With this approach, we generated 500,000 examples using a simpler LLM and an additional 100,000 examples using a more capable LLM. To ensure a fair comparison, we carefully screened these datasets and removed examples that appeared in the evaluation benchmarks. Although the training sample size is only one tenth of that used in the original Orca paper (significantly reducing the cost and carbon footprint of training the model compared to the original paper), the resulting FreeWilly models demonstrate excellent performance across a variety of benchmarks, validating this approach to synthetic data generation.
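
One common way to implement the benchmark screening mentioned above is n-gram overlap filtering: any training example that shares a long enough n-gram with an evaluation prompt is dropped. The sketch below illustrates that idea; it is not necessarily the exact procedure used for FreeWilly.

```python
# Sketch of benchmark decontamination via n-gram overlap: drop any training example
# whose instruction shares an n-gram with a benchmark prompt.
def ngrams(text: str, n: int = 8):
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def decontaminate(train_examples, benchmark_prompts, n: int = 8):
    # Collect every n-gram that appears in any benchmark prompt.
    benchmark_ngrams = set()
    for prompt in benchmark_prompts:
        benchmark_ngrams |= ngrams(prompt, n)

    kept = []
    for example in train_examples:
        if ngrams(example["instruction"], n) & benchmark_ngrams:
            continue  # overlaps with an evaluation prompt, drop it
        kept.append(example)
    return kept
```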

Source: https://blog.csdn.net/u010291330/article/details/132580807