Microsoft wins again! Meta and Microsoft jointly release Llama 2, an open-source AI model free for commercial use


Compiled by | Tu Min

Produced by | CSDN (ID: CSDNnews)

  • Former rivals are now allies;

  • As if a spring breeze swept in overnight, open-source large models have suddenly entered a new phase;

  • Today marks a victory for open-source AI.

With Meta's latest release of a new open-source AI model, Llama 2, the internet has been flooded with praise. Even Yann LeCun, Turing Award winner, father of convolutional networks, and Meta's chief AI scientist, said bluntly, "This is going to change the landscape of the LLM market."


Llama 2 has drawn such a strong response not only because it is open source, but also because it can be used freely for both research and commercial purposes. At the same time, Meta has teamed up with Microsoft to bring it to market, positioning Llama 2 to power applications in the same vein as OpenAI's ChatGPT, Bing Chat, and other modern chatbots.

In Meta's view, “an open approach is the right one for the development of today's AI models, especially in the rapidly advancing generative space. By making AI models available openly, they can benefit everyone. Giving businesses, start-ups, entrepreneurs, and researchers access to tools developed at a scale they would have struggled to build themselves, backed by computing power they might not otherwise have, will let them experiment and innovate in exciting ways.”

This alone is something that many companies currently focused on large-model development cannot offer, and as netizens have commented, it blows the field wide open in one stroke.


Predecessor to Llama 2

Llama 2, released today, is the successor to LLaMA (named after the llama).

In February of this year, Meta publicly released LLaMA for the first time as an open-source model under a non-commercial license. It is an advanced foundational large language model designed to help researchers advance their work in this subfield of AI. Smaller, high-performing models such as LLaMA allow members of the research community who lack access to extensive infrastructure to study these models, further democratizing access to this important and fast-moving field.

At the time, Meta offered LLaMA in several sizes (7B, 13B, 33B, and 65B parameters). In terms of functionality, LLaMA can generate text and code from prompts, roughly on par with other chatbot-style systems.

However, fearing misuse, Meta decided to restrict access to the model: it was open only to researchers with certain qualifications, and an application form was required.

Unexpectedly, someone soon leaked the LLaMA weights (the parameter files of the trained neural network) to torrent sites, and the not-fully-open LLaMA model spread widely through the AI community within a short time.

Soon, fine-tuned LLaMA variants sprang up, and for a while the "camelid" family became quite crowded: Stanford released Alpaca, UC Berkeley open-sourced Vicuna, the University of Washington proposed QLoRA and open-sourced Guanaco, and in China, Harbin Institute of Technology instruction-tuned "HuaTuo" on LLaMA using Chinese medical knowledge.

Today, the release of Llama 2 takes the open-source large model to a new level. Compared with the previous-generation LLaMA, Llama 2 was trained on a new mix of publicly available data and its performance has improved significantly.


Llama 2: versions of varying size

To accompany the release, Meta published a 76-page paper, "Llama 2: Open Foundation and Fine-Tuned Chat Models," detailing the pre-training, fine-tuning, safety work, and more behind the Llama 2 model.


  • Paper address: https://scontent-lax3-2.xx.fbcdn.net/v/t39.2365-6/10000000_663429262362723_1696968207443577320_n.pdf?_nc_cat=101&ccb=1-7&_nc_sid=3c67a6&_nc_ohc=5ol-jUSglG4AX_EKgWk&_nc_ht=scontent-lax3-2.xx&oh=00_AfC4pQWErthyr1jwgSScKeyjXW3wwEUnqvIh7MNeb-Et3g&oe=64BBB691

According to the paper, Llama 2 comes in two versions: Llama 2 and Llama 2-Chat, the latter fine-tuned for two-way conversation. Each is further offered at several scales: 7 billion, 13 billion, and 70 billion parameters.


Meta increased the size of the Llama 2 pre-training corpus by 40%. The base model was trained on 2 trillion tokens, and the context window has been doubled to 4,096 tokens compared with the previous generation; the context window determines how much text the model can process at once. On the hardware side, Meta trained on NVIDIA A100 GPUs.
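To make the context window concrete, here is a minimal sketch of counting how many of those 4,096 tokens a prompt consumes, using the Hugging Face transformers tokenizer. It is illustrative only: it assumes access to the gated meta-llama/Llama-2-7b-hf checkpoint has been granted, and the prompt is a placeholder.

```python
# Illustrative sketch: checking a prompt against Llama 2's 4,096-token context window.
# Assumes access to the gated "meta-llama/Llama-2-7b-hf" repo has been granted on Hugging Face;
# the window size is taken from the Llama 2 paper rather than queried from the model.
from transformers import AutoTokenizer

CONTEXT_WINDOW = 4096  # tokens the model can attend to at once (2x the original LLaMA)

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

prompt = "Summarize the key differences between LLaMA and Llama 2."  # placeholder prompt
n_tokens = len(tokenizer.encode(prompt))

print(f"Prompt uses {n_tokens} of {CONTEXT_WINDOW} tokens; "
      f"{CONTEXT_WINDOW - n_tokens} remain for the model's reply.")
```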

Meta also said that the fine-tuned Llama 2-Chat model was developed for ChatGPT-style chat applications and was trained on "over 1 million human annotations."


However, Meta did not disclose the specific sources of the training data in the paper, saying only that it came from the internet and does not include data from Meta's own products or services.

According to Meta's own benchmarks, Llama 2 leads the open-source field, with the Llama 2 70B model outperforming all other open-source models.


Against closed-source large models, Llama 2 70B comes close to GPT-3.5 on reasoning tasks but shows a significant gap on coding benchmarks. Its performance still does not match OpenAI's GPT-4 or Google's PaLM-2-L, and Llama 2 lags well behind GPT-4 on computer programming in particular.


Speaking to the real strengths of this Llama 2 release, NVIDIA senior AI scientist Jim Fan praised it highly:

  • Llama-2 could cost more than $20 million to train. Meta does an incredible service to the community by releasing models with a commercially friendly license. AI researchers at big companies were wary of Llama-1 due to licensing issues, but now I think many of them will jump in and contribute.

  • Meta's team ran a human evaluation on 4K prompts to assess Llama-2's helpfulness. They use "win rate" as the metric for comparing models, similar in spirit to the Vicuna benchmark. The 70B model is roughly on par with GPT-3.5-0301 and significantly outperforms Falcon, MPT, and Vicuna.

    I trust these real human ratings more than academic benchmarks.

  • Llama-2 has not yet reached GPT-3.5 level, mainly because of its weak coding ability. On HumanEval (the standard coding benchmark), it is not as good as StarCoder or many other models designed specifically for coding. Still, I have no doubt that Llama-2 will improve significantly thanks to its open weights.

  • The Meta team has left no stone unturned on AI safety. In fact, almost half of the paper is devoted to safety, red-teaming, and evaluation. We applaud this responsible effort!

    In previous studies, there was a tricky trade-off between helpfulness and safety. Meta alleviates this problem by training two independent reward models. These models are not yet open source, but are very valuable to the community.

  • I think Llama-2 will greatly advance multimodal AI and robotics research. These areas require more than black box access to APIs.

    So far, we have had to convert complex sensory information (video, audio, 3D perception) into textual descriptions before feeding it into LLMs, which is clumsy and loses a lot of information. It would be far more effective to graft sensory modules directly onto a powerful LLM.

  • The Llama 2 paper is a masterpiece in itself. Unlike GPT-4's technical report, which shared very little information, Llama-2 spells everything out, including model details, training stages, hardware, the data pipeline, and the annotation process. For example, the paper provides a systematic analysis of the impact of RLHF, with nice visualizations.

  • Quoting Section 5.1: "We posit that the superior writing abilities of LLMs, as manifested in surpassing human annotators in certain tasks, are fundamentally driven by RLHF."


Source: https://twitter.com/DrJimFan/status/1681372700881854465

However, it is worth noting that although Llama 2 permits commercial use, its community license agreement also adds an extra commercial clause:

If, on the Llama 2 release date, the monthly active users of the products or services offered by the Licensee or the Licensee's affiliates exceeded 700 million in the preceding calendar month, you must request a license from Meta, which Meta may grant at its sole discretion, and you are not entitled to exercise any rights under the agreement unless or until Meta expressly grants you such rights.


This means that big players such as Amazon and Google still face certain restrictions if they want to use Llama 2.


Meta teams up with Microsoft

Of course, Meta has not shut out every major player. In its official announcement, Meta revealed a deep partnership with Microsoft.

Microsoft is the preferred partner for Llama 2. Meta said that, starting today, Llama 2 is available in the Azure AI model catalog, so developers using Microsoft Azure can build with Llama 2 and take advantage of Azure's cloud-native tools for content filtering and safety.


At the same time, Llama 2 has been optimized to run natively on Windows, giving developers a seamless workflow and bringing generative AI experiences to customers across platforms. Llama 2 is also available through Amazon Web Services (AWS), Hugging Face, and other providers.
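For developers who want to try it through Hugging Face, the sketch below shows one common way to run the model with the transformers library. It is a minimal, illustrative example: it assumes access to the gated meta-llama/Llama-2-7b-chat-hf checkpoint has been granted and a GPU with enough memory is available; the prompt and sampling parameters are placeholders.

```python
# Minimal sketch of generating text with Llama 2-Chat via Hugging Face transformers.
# Assumes you have accepted Meta's license and been granted access to the gated repo,
# and that you are authenticated (e.g. via `huggingface-cli login`).
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",  # 7B chat variant; 13B/70B follow the same naming scheme
    torch_dtype=torch.float16,              # half precision so the 7B model fits on a single modern GPU
    device_map="auto",                      # place model weights on available devices automatically
)

output = generator(
    "Explain in one sentence why openly licensed model weights matter.",  # placeholder prompt
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
)
print(output[0]["generated_text"])
```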

Some netizens commented that Microsoft has come out a winner once again!


Beyond Microsoft, Meta is also partnering with Qualcomm, which announced that it "plans to support Llama 2-based AI deployments on flagship smartphones and PCs starting in 2024, enabling developers to use the AI capabilities of the Snapdragon platform to launch exciting new generative AI applications."


No model is 100% perfect

That said, Meta admits that Llama 2 is not perfect: its tests cannot capture every real-world scenario, and its benchmarks may lack diversity; in other words, they do not adequately cover areas such as coding and human reasoning.

Meta also acknowledges that Llama 2, like all generative AI models, carries some degree of bias. For example, because of imbalances and "toxic" text in the training data, it may hallucinate and generate toxic content.

To that end, Meta's partnership with Microsoft also covers the use of Azure AI Content Safety, a service designed to detect "inappropriate" content in AI-generated images and text, to reduce toxic Llama 2 output on Azure.

At the same time, Meta stresses in the paper that, in addition to following the "safe development and use" guidelines, Llama 2 users must comply with Meta's license terms and acceptable use policy, which helps limit harmful content to some extent.


The Future of Open-Source Models

Finally, if OpenAI leads the race in large models, then Meta has opened a new door for open-source large models.

Open source pools more innovation, and the open-sourcing of Llama 2 adds weight to the prediction that "in the future, open-source large models will set the direction for the development of large models as a whole."

This echoes the conclusion drawn by Ars Technica: the arrival of open-source AI models not only encourages transparency (about the training data used to build the models), but also promotes economic competition (not locking the technology up with large companies), encourages free speech (no censorship), and democratizes access to AI (no paywall restrictions).

At the same time, to head off potential controversy over open-sourcing Llama 2, Meta also released a statement of support for its open approach to today's AI. It reads:

“We support an open innovation approach to AI. Responsible and open innovation gives us all a stake in the AI development process, bringing visibility, scrutiny, and trust to these technologies. Opening today's Llama models will let everyone benefit from this technology.”

So far, nearly a hundred AI experts have signed it, including Drew Houston (CEO of Dropbox), Matt Bornstein (partner at Andreessen Horowitz), Julien Chaumond (CTO of Hugging Face), Lex Fridman (research scientist at MIT), and Paul Graham (founding partner of Y Combinator).

Of course, it cannot be ignored that both open-source and closed-source large models face complex legal questions, since it must be determined whether copyrighted material is present in the data pools used for training. How to handle these issues effectively is the next problem these model developers will have to solve.

Currently, anyone can request to download Llama 2 by filling out the form on Meta's website (https://ai.meta.com/resources/models-and-libraries/llama-downloads/). If you want to give it a try, go ahead!

For more information see:

  • Paper address: https://scontent-lax3-2.xx.fbcdn.net/v/t39.2365-6/10000000_663429262362723_1696968207443577320_n.pdf?_nc_cat=101&ccb=1-7&_nc_sid=3c67a6&_nc_ohc=5ol-jUSglG4AX_EKgWk&_nc_ht=scontent-lax3-2.xx&oh=00_AfC4pQWErthyr1jwgSScKeyjXW3wwEUnqvIh7MNeb-Et3g&oe=64BBB691

  • Llama 2: https://ai.meta.com/llama/

  • Llama 2 application address: https://ai.meta.com/resources/models-and-libraries/llama-downloads/

  • Meta official announcement: https://about.fb.com/news/2023/07/llama-2/

  • Open Letter: https://about.fb.com/news/2023/07/llama-2-statement-of-support/
