With only 2.7 billion parameters, Microsoft releases its new Phi-2 model!


Compiled by | Su Mi

Produced by | CSDN (ID: CSDNnews)

Microsoft, which has partnered with OpenAI and Meta to advance large models, is also accelerating the iteration of its own small models. Today, Microsoft officially released Phi-2, a 2.7 billion parameter language model. It is a text-to-text AI model with strong reasoning and language understanding capabilities.


At the same time, Microsoft Research posted on its official X account: "Phi-2 outperforms other existing small language models, yet it is small enough to run on your laptop or mobile device."
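The "runs on your laptop" claim is easy to sanity-check once the weights are available. Below is a minimal inference sketch, assuming the checkpoint is published on Hugging Face under the microsoft/phi-2 identifier (check the actual model card for the license and recommended prompt format):

```python
# Minimal local inference sketch for Phi-2 via Hugging Face transformers.
# Assumes the checkpoint is available as "microsoft/phi-2"; verify the model
# card, since the Microsoft Research License restricts use to research.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"  # assumed Hugging Face identifier
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # ~5.4 GB of weights for 2.7B params in fp16
    device_map="auto",           # falls back to CPU if no GPU is present
    trust_remote_code=True,      # may be needed if the repo ships custom code
)

prompt = "Instruct: Explain why the sky is blue.\nOutput:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```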


Can Phi-2 really outperform a model 25 times larger?

Regarding the release of Phi-2, Microsoft Research stated bluntly at the top of its official announcement that Phi-2's performance can match or exceed that of models 25 times larger.

This puts Google in a somewhat awkward position. Many netizens commented: doesn't this easily surpass the smallest version of Gemini that Google just released?


So what do the results actually show?

Microsoft compared Phi-2 against Mistral 7B and Llama-2 at 7B and 13B parameters on a range of current benchmarks: Big Bench Hard (BBH), commonsense reasoning (PIQA, WinoGrande, ARC easy and challenge, SIQA), language understanding (HellaSwag, OpenBookQA, MMLU (5-shot), SQuADv2, BoolQ), mathematics (GSM8k), and coding (HumanEval).

The result: Phi-2, with only 2.7 billion parameters, surpasses the Mistral 7B, Llama-2 7B, and Llama-2 13B models. Notably, on multi-step reasoning tasks (i.e., coding and mathematics), Phi-2 even outperforms the Llama-2 70B model, which is 25 times larger.
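Scores like these are typically produced with a standard evaluation harness. Here is a hedged sketch of how one might reproduce a few of the listed numbers with EleutherAI's lm-evaluation-harness; the task names and the simple_evaluate call reflect recent versions of the harness and may differ in yours:

```python
# Hedged sketch: reproducing a few of the listed benchmarks with
# EleutherAI's lm-evaluation-harness (pip install lm-eval). Task names
# and the simple_evaluate signature are assumptions; consult the docs
# for the exact API of your installed version.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=microsoft/phi-2,dtype=float16",
    tasks=["hellaswag", "winogrande", "piqa", "arc_challenge"],
    num_fewshot=0,
)
for task, metrics in results["results"].items():
    print(task, metrics)
```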


In addition, as mentioned above, Microsoft researchers included a head-to-head comparison with Google's newly released Gemini Nano 2 in the benchmark results. As expected, despite its smaller size, Phi-2 still surpasses Gemini Nano 2.


Beyond these benchmarks, the researchers also seemed to be alluding to the accusations from a few days earlier that Google had staged its Gemini demonstration video. At the time, Google claimed that its upcoming largest and most powerful model, Gemini Ultra, could solve fairly complex physics problems and even correct students' mistakes.

It turns out that even though Phi-2 is a fraction of the size of Gemini Ultra, it can also answer the questions correctly and correct the student's mistakes using the same prompts.
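The demo reduces to prompting the base model with the problem plus the student's flawed work. A hypothetical sketch of that prompt pattern, reusing the model and tokenizer loaded earlier (the physics problem here is illustrative, not the one from Microsoft's post):

```python
# Hypothetical example of the "correct the student" prompt pattern.
# The problem text is illustrative, not the one Microsoft actually used.
prompt = (
    "A student solves this problem:\n"
    "Q: A ball is dropped from a 20 m tower. How long does it take to hit "
    "the ground? (g = 10 m/s^2)\n"
    "Student's answer: t = 20 / 10 = 2 s\n"
    "Is the student's reasoning correct? If not, show the correct solution.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```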


Microsoft's improvements

In a blog post, Microsoft Research explained why the small Phi-2 model achieves such outstanding results.

The first is improved training data quality. Phi-2 is a Transformer-based model with a next-word prediction objective, trained on 1.4T tokens from a mixture of synthetic and web datasets for NLP and coding, including content on science, daily activities, and theory of mind, used to teach the model common sense and reasoning. Training Phi-2 took 14 days on 96 A100 GPUs.
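"Predicting the next word" refers to the standard causal language modeling objective: shift the token sequence by one position and minimize cross-entropy. Here is a minimal PyTorch sketch of that loss, independent of Microsoft's unreleased training code:

```python
# Minimal sketch of the next-token-prediction (causal LM) objective the
# blog post describes; the generic formulation, not Microsoft's code.
import torch
import torch.nn.functional as F

def causal_lm_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """logits: (batch, seq_len, vocab); tokens: (batch, seq_len)."""
    # Each position t predicts token t+1, so drop the last logit and
    # the first token before computing cross-entropy.
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = tokens[:, 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
    )

# Toy usage: random "model output" over a 100-token vocabulary.
logits = torch.randn(2, 16, 100)
tokens = torch.randint(0, 100, (2, 16))
print(causal_lm_loss(logits, tokens))
```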

Second, Microsoft used innovative techniques to scale up, starting from its 1.3 billion parameter Phi-1.5 model and embedding its knowledge into the 2.7 billion parameter Phi-2.
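Microsoft has not detailed this scaling technique. One plausible illustration of "embedding a smaller model's knowledge" in a larger one is warm-starting the bigger network from the smaller checkpoint before continued training; the sketch below is purely illustrative and not Microsoft's published method:

```python
# Purely illustrative: warm-starting a larger transformer from a smaller
# one by copying each parameter into the overlapping block of its larger
# counterpart. NOT Microsoft's method, which the blog post does not detail.
import torch

@torch.no_grad()
def warm_start(small_state: dict, large_model: torch.nn.Module) -> None:
    for name, large_param in large_model.named_parameters():
        small_param = small_state.get(name)
        if small_param is None:
            continue  # layer exists only in the larger model
        # Copy into the overlapping slice along each dimension.
        slices = tuple(slice(0, min(s, l))
                       for s, l in zip(small_param.shape, large_param.shape))
        large_param[slices].copy_(small_param[slices])
```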

Microsoft notes that Phi-2 is a base model that has not been aligned through reinforcement learning from human feedback (RLHF) or instruction fine-tuning. Nonetheless, Microsoft observed that Phi-2 behaves better in terms of toxicity and bias than existing aligned open-source models.


Final thoughts

The release of Phi-2 does mark a breakthrough in small-model performance, but some media outlets have found that it still has significant limitations.

The Microsoft Research License stipulates that Phi-2 may only be used for "non-commercial, non-revenue-generating, research purposes", not for commercial ones. As a result, businesses hoping to build products on top of it are out of luck.

Source: https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/

Original article: blog.csdn.net/dQCFKyQDXYm3F8rB0/article/details/135007434