Huawei’s large model is finally here, and my evaluation is: quite shocking


Huawei, long said to be lagging behind in the large-model race, has finally made its big move.


Sure enough, at yesterday's Huawei Developer Conference 2023, Huawei showed what it can do.


The nearly three-hour press conference kept Huawei's usual everything-in-one-pot style, and it left Shichao dazzled.


However, in summary, one theme actually stands out: Pangu Large Model 3.0.


In fact, just a few days ago, while other large models were still competing on various leaderboards, Pangu entered everyone's field of vision in a unique way: with the gold-plated endorsement of a paper in Nature, one of the world's top journals.


It is said that with the addition of the Pangu model, the speed of weather prediction has increased by more than 10,000 times, and the results can be obtained in a few seconds. It can clearly predict where the typhoon will hit, what time it will come, and when it will leave.


Most importantly, its prediction accuracy even exceeds that of the IFS system of the European Centre for Medium-Range Weather Forecasts (ECMWF), which is known as the world's strongest. It is described as the first AI forecaster to beat traditional numerical prediction.


You know, in the past, most AI weather prediction models were built on 2D neural networks, but weather is too complicated, and 2D really cannot cope with it.


Moreover, previous AI models keep accumulating iteration errors during the prediction process, which can easily hurt the accuracy of the results.


As a result, AI prediction methods have never been well regarded.


The Pangu weather model stands out. It uses a three-dimensional neural network called 3DEST to process weather data: if 2D cannot do it, use 3D.


Network training and inference strategies for 3DEST


To tackle the iteration error problem, the model also uses a "hierarchical temporal aggregation strategy" to reduce the number of iterations and thereby improve forecast accuracy.


The term sounds intimidating, but it is actually easy to understand.


For example, suppose the earlier AI weather prediction model FourCastNet starts forecasting 6 hours before a typhoon arrives. Over those 6 hours, the model estimates the typhoon's arrival time many times.


One run may say 5 hours, another 4 and a half hours. As these results are chained together, the error becomes large.


The Pangu weather model found a way around this: it trains four models with different forecast intervals, namely 1 hour, 3 hours, 6 hours, and 24 hours per iteration.


Then, depending on the specific forecast required, the corresponding models are selected for iteration.


For example, to predict the weather for the next 7 days, the 24-hour model iterates 7 times; predicting 20 hours ahead takes 3 iterations of the 6-hour model plus 2 iterations of the 1-hour model.


The fewer iterations, the smaller the error.
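
The selection rule can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not Huawei's code: the function names plan_iterations and total_iterations and the "largest interval first" rule are assumptions made here. The article itself only gives the four intervals and the two worked examples.

```python
# Forecast intervals (in hours) of the four models mentioned in the article,
# ordered from largest to smallest.
INTERVALS_HOURS = (24, 6, 3, 1)


def plan_iterations(lead_time_hours: int) -> dict[int, int]:
    """Cover the lead time using the largest intervals first.

    Returns {interval_hours: number_of_iterations}; unused intervals
    are omitted. Hypothetical helper, written for illustration only.
    """
    if lead_time_hours < 0:
        raise ValueError("lead time must be non-negative")
    plan: dict[int, int] = {}
    remaining = lead_time_hours
    for interval in INTERVALS_HOURS:
        # How many whole steps of this interval fit into what is left?
        count, remaining = divmod(remaining, interval)
        if count:
            plan[interval] = count
    return plan


def total_iterations(plan: dict[int, int]) -> int:
    """Total number of model calls in a plan."""
    return sum(plan.values())


if __name__ == "__main__":
    print(plan_iterations(7 * 24))  # {24: 7}
    print(plan_iterations(20))      # {6: 3, 1: 2}
    # A single 1-hour model would need 20 calls for the same 20 hours.
    print(total_iterations(plan_iterations(20)))  # 5
```

Because each interval divides the next larger one evenly (1, 3, 6, 24), taking the largest interval first also gives the smallest possible number of iterations.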


This approach has taken weather forecasting to a new level.


However, some readers may have begun to mutter: other companies' large models generate images and text, so how come Huawei's does weather forecasting?


Let’s be clear: the Pangu model is indeed different from the ChatGPT and Midjourney we have come across before. It is built for industry use.


To put it simply, ordinary users like us generally will not get to use the Pangu model.


It is not the "nemesis" of ChatGPT that everyone has been looking forward to; it is aimed at the To B (enterprise) market, which ordinary consumers rarely come into contact with.


Leaving aside how hard that market is, the corporate customer base Huawei has accumulated over the years is certainly easy to monetize.


Moreover, Huawei’s press conference brought more than just this formidable weather prediction model.


No new antibiotics have been discovered in more than 40 years. The super antibacterial drug Drug


The Pangu Mine model can also reach into more than 1,000 processes in coal mining. In clean coal selection alone, it can raise the clean coal recovery rate by 0.1% to 0.2%.


You know, for a coal preparation plant with an annual output of 10 million tons of coking coal, every 0.1% improvement in clean coal yield can generate an additional 10 million yuan in profits per year.


This is all free money...
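
As a back-of-the-envelope check of the coal figures above, the short Python sketch below works out what they imply. It assumes the 0.1% gain applies directly to the 10-million-ton annual figure, which the article does not state explicitly; the function names are made up for this illustration.

```python
from fractions import Fraction

ANNUAL_OUTPUT_TONS = 10_000_000        # from the article
YIELD_GAIN = Fraction(1, 1000)         # 0.1%, from the article
EXTRA_PROFIT_YUAN = 10_000_000         # per 0.1% gain, from the article


def extra_clean_coal_tons(output_tons: int, yield_gain: Fraction) -> Fraction:
    """Extra clean coal per year, assuming the gain applies to output_tons."""
    return output_tons * yield_gain


def implied_profit_per_ton(extra_profit_yuan: int, extra_tons: Fraction) -> Fraction:
    """Profit per extra ton that the article's two figures imply together."""
    return Fraction(extra_profit_yuan) / extra_tons


if __name__ == "__main__":
    tons = extra_clean_coal_tons(ANNUAL_OUTPUT_TONS, YIELD_GAIN)
    print(int(tons))                                             # 10000
    print(int(implied_profit_per_ton(EXTRA_PROFIT_YUAN, tons)))  # 1000
    # The quoted 0.1% to 0.2% range then corresponds to 10 to 20 million yuan.
    print(int(2 * EXTRA_PROFIT_YUAN))                            # 20000000
```

Under that assumption, the two numbers are consistent with roughly 1,000 yuan of extra profit per additional ton of clean coal.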


In fact, in addition to the weather prediction, drug development and coal preparation mentioned above, the Pangu model has been used in many industries.


At the press conference, Tian Qi, chief scientist of Huawei Cloud Artificial Intelligence, said that Huawei Cloud's AI has been applied in more than 1,000 projects, 30% of which run in customers' core production systems, driving an average 18% improvement in customer profitability.


Huawei's ability to mass-produce these large models for different industries is due to the 5+N+X three-layer architecture of Huawei Pangu Model 3.0.


It is this structure that allows Pangu to quickly enter various industries.


Why do you say that?


Because when AI is deployed in industry, data is a major difficulty.


Zhang Ping'an said at the press conference, "Due to the difficulty in obtaining industry data and the difficulty in integrating technology and industry, the implementation of large models in the industry has been slow."


Pangu is very clever. Through the three-layer architecture of 5+N+X, it directly breaks this big problem into three small problems to solve.


First, the five large models of Pangu's L0 layer learned hundreds of terabytes of text data such as encyclopedia knowledge, literary works, program codes, and billions of Internet images with text labels.


We can think of it this way: first, the L0 layer (five basic large models: natural language, vision, multimodal, prediction, and scientific computing) builds up basic knowledge, a bit like the general education stage before university.


Then, the model in the second layer L1 is formed by letting a basic large model in L0 learn data from N related industries. This is like the undergraduate stage of university, where you need to choose various majors to study.


For example, CT image inspection in hospitals and image quality inspection in factories both use large visual models.


But after all, one is a hospital and the other is a factory, and the usage scenarios are completely different. The basic large model alone will definitely not be enough, but once industry data is added, there may be surprises.


The final layer, L2, is like graduate school: it narrows down to a particular scenario within a specific industry. For example, in the warehousing and logistics industry, different deployed models may be required for the transportation, inbound, and outbound handling of goods.


At the same time, Huawei has also added a feedback link, which is a bit like an internship in the company.
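
The layering can be pictured as a small tree of models. The Python sketch below is purely illustrative: the ModelNode class, its derive and lineage methods, and the example industry and scenario names are hypothetical and are not part of any Huawei API. It only mirrors the L0, L1, L2 relationship described above.

```python
from dataclasses import dataclass, field
from typing import Optional

# The five L0 base models named in the article.
L0_BASE_MODELS = (
    "natural language",
    "vision",
    "multimodal",
    "prediction",
    "scientific computing",
)

# Which layer a derived model lands in (L2 is the last layer).
NEXT_LAYER = {"L0": "L1", "L1": "L2"}


@dataclass
class ModelNode:
    """One model in the 5+N+X hierarchy (hypothetical illustration)."""

    name: str
    layer: str  # "L0", "L1" or "L2"
    parent: Optional["ModelNode"] = None
    training_data: list[str] = field(default_factory=list)

    def derive(self, name: str, data: str) -> "ModelNode":
        """Create the next-layer model by adding more specific data."""
        if self.layer not in NEXT_LAYER:
            raise ValueError(f"cannot derive a model below {self.layer}")
        return ModelNode(name, NEXT_LAYER[self.layer], self, [data])

    def lineage(self) -> list[str]:
        """Path from the L0 base model down to this model."""
        chain = []
        node: Optional[ModelNode] = self
        while node is not None:
            chain.append(f"{node.layer}:{node.name}")
            node = node.parent
        return chain[::-1]


if __name__ == "__main__":
    vision = ModelNode("vision", "L0", training_data=["labelled images"])
    medical = vision.derive("medical imaging", "hospital image data")
    ct_check = medical.derive("CT image inspection", "CT scans")
    print(ct_check.lineage())
    # ['L0:vision', 'L1:medical imaging', 'L2:CT image inspection']
```

The same vision base model could just as well be derived into a factory quality-inspection model, which is the point of sharing the L0 layer across industries.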


According to them, in the past, it usually took 5 months to develop a large industry model on the scale of GPT-3; with this set of tools, the development cycle can be shortened to 1/5 of the original time.


At the same time, the small-dataset limitation that many industries face can also be overcome. For example, even a highly specialized industry such as building large aircraft can have its own large model.


In addition to this large set of models, Huawei also raised a very interesting topic this time: domestically produced computing power.


As we all know, we are in a rather awkward position when it comes to AI computing power.


Firstly, we cannot buy NVIDIA's H100/A100, the core equipment of the AI industry. Secondly, even though NVIDIA has "thoughtfully" offered the H800 as a replacement, it is a held-back part. For example, its transmission rate has been cut considerably.


Given that large models often require several months of training, it is easy to be overtaken by foreign counterparts with stronger computing power.


But this time, Huawei did come up with something real to deal with this problem.


For example, on paper, Huawei's Ascend 910 processor is already comparable to NVIDIA's A100.


In practical use, however, there is still a gap. And the A100 is not NVIDIA's ultimate weapon.


Even so, Ascend has won recognition from many peers. Huawei even stated directly at the press conference that "half of the computing power of China's major models is provided by them."


Of course, Huawei's current strength in computing power comes more from its entire software ecosystem.


For example, according to the press conference, with components such as the Ascend AI cloud computing base and the CANN computing framework, Huawei's efficiency in training large models is 1.1 times that of mainstream GPUs in the industry.


In addition, they have developed a complete set of application packages for users.


For example, Meitu migrated 70 models to the Huawei ecosystem in just 30 days. At the same time, Huawei also said that with the efforts of both parties, AI performance has improved by 30% compared with the original solution.


Still quite impressive.


Moreover, Huawei also said that it now has nearly 4 million developers, which is on par with the NVIDIA CUDA ecosystem.


This series of actions can be regarded as making up for part of the shortcomings.


In general, after watching this Huawei press conference, my feeling is that Huawei's AI strategy runs deep. They have already begun to think about the question of "what AI can really bring us."


In the past six months, the AI industry has received thunderous applause, yet its results have been somewhat underwhelming when it comes to actual deployment in industry.


And Huawei’s action just confirms what Ren Zhengfei said:


"In the future, there will be a surge of activity in AI large models, and it will not be just Microsoft. The direct contribution of artificial intelligence software platform companies to human society may be less than 2%, and 98% is the promotion of industrial society and agricultural society."


In the field of AI, the real big era is yet to come.

Reprinted from: https://www.toutiao.com/article/7253266503217218082

Origin blog.csdn.net/davidwkx/article/details/131633026