Nearly 6 times the parameters of ChatGPT! Intel announces Aurora genAI, a large AI model with 1 trillion parameters

   Compiled by | Ling Min, Nuka-Cola

   Source | AI Frontline (ID: ai-front)

Are more model parameters always better?

Intel announces AI model Aurora genAI with 1 trillion parameters

According to a report from wccftech, Intel recently announced its generative AI model, Aurora genAI.

Aurora genAI reportedly has as many as 1 trillion parameters, and its development relies on the Megatron and DeepSpeed frameworks, which give the model its scale and capacity. The model behind ChatGPT has 175 billion parameters, which means Aurora genAI has nearly 6 times as many parameters as ChatGPT.
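The article credits the Megatron and DeepSpeed frameworks for the model's scale. For readers unfamiliar with them, the sketch below shows roughly what a DeepSpeed training setup looks like; the toy model, config values, and hyperparameters are illustrative assumptions, not Aurora genAI's actual configuration:

```python
import torch
import torch.nn as nn
import deepspeed  # pip install deepspeed; launch with: deepspeed train.py

# Toy stand-in for a large transformer; real trillion-parameter models
# also layer Megatron-style tensor/pipeline parallelism on top of this.
model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024))

# ZeRO stage 3 partitions parameters, gradients, and optimizer state
# across workers, which is what makes extreme parameter counts feasible.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 3},
}

engine, _, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

# One illustrative training step with a dummy reconstruction loss.
x = torch.randn(4, 1024, device=engine.device, dtype=torch.half)
loss = (engine(x).float() ** 2).mean()
engine.backward(loss)
engine.step()
```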

The Aurora genAI model was reportedly developed by Intel in cooperation with Argonne National Laboratory and HPE. It is a purely science-centric generative AI model intended for a range of scientific applications, including molecular and materials design. Trained on a comprehensive body of knowledge spanning millions of sources, it is expected to suggest experimental designs worth exploring in systems biology, polymer chemistry, energy materials, climate science, and cosmology. These models will also be used to accelerate the identification of biological processes relevant to cancer and other diseases, and to propose targets for drug design.

In addition to scientific research, Aurora genAI also has application potential in commercial fields such as natural language processing, machine translation, image recognition, speech recognition, and financial modeling.

Rick Stevens, associate laboratory director at Argonne National Laboratory, said: "This project aims to harness the full potential of the Aurora supercomputer to provide a resource for downstream scientific research and other inter-agency collaboration programs across the Department of Energy's laboratories."

According to the announcement, the Aurora genAI model will be trained on general text, code, scientific texts, and structured data from biology, chemistry, materials science, physics, medicine, and other disciplines. Argonne is spearheading an international collaboration to advance the project, with participants including Intel, HPE, Department of Energy laboratories, U.S. and international universities, nonprofit organizations, and international partners such as RIKEN.

The Aurora genAI model will run on the Aurora supercomputer, which Intel developed for Argonne National Laboratory. Aurora targets a peak performance of 2 exaFLOPS (2 quintillion floating-point operations per second), roughly twice that of Frontier, the current TOP500 supercomputing champion. Recently, Intel and Argonne National Laboratory also announced Aurora's installation progress, system specifications, and early performance test results (a rough cross-check of these figures follows the list below):

  • Intel has completed the delivery of more than 10,000 blade servers for the Aurora supercomputer.

  • Aurora's complete system uses the HPE Cray EX supercomputing architecture and will have 63,744 GPUs and 21,248 CPUs, supplemented by 1,024 DAOS storage nodes. Aurora will also feature HPE Slingshot high-performance Ethernet networking.

  • Early performance results show that the Aurora system leads on real-world scientific and engineering workloads, delivering up to 2 times the performance of the AMD MI250 GPU, 20% better performance than the H100 on the QMCPACK quantum mechanics application, and near-linear scaling out to hundreds of nodes.

As a strong competitor to ChatGPT, the announcement of Aurora genAI signals that a heavyweight new player has entered the large-model race, one likely to have a significant impact on many scientific fields. For now, though, Aurora genAI is still closer to a concept than a finished system: Intel aims to complete the model's construction by 2024.
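As a quick sanity check, the headline compute figure and the GPU count above are roughly consistent with each other. A back-of-envelope calculation in Python (the per-GPU number is an inference from the article's figures, not an official Intel specification):

```python
# All inputs are the figures quoted above; the per-GPU math is inferred.
system_peak_flops = 2e18   # ~2 exaFLOPS peak claimed for Aurora
num_gpus = 63_744          # GPU count from the system specifications

per_gpu_flops = system_peak_flops / num_gpus
print(f"Implied peak per GPU: {per_gpu_flops / 1e12:.1f} TFLOPS")
# -> ~31.4 TFLOPS per GPU, a plausible FP64 ballpark for a high-end
#    data-center GPU, so the two published figures hang together.
```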

Regarding Intel's trillion-parameter AI model Aurora genAI, one netizen commented: "I don't believe that simply increasing the number of parameters improves a model, and I don't think we should issue press releases chasing parameter counts. Larger models don't generally perform better, but irresponsible marketing is making that increasingly hard to explain to non-technical people. If we let this marketing go unchecked, we'll end up disappointing a lot of people and eroding everyone's confidence in AI's future growth potential; we don't want another AI winter. Training these large models carries a huge environmental cost, and the models themselves become harder to understand, use, and control (even for a researcher)."

Has the AI arms race entered the era of the "trillion-parameter model"?

In recent years, as the race to build large AI models has heated up, more and more technology giants have joined in, repeatedly breaking parameter-count records.

In January 2021, the Google Brain team launched Switch Transformer, a super-scale language model with 1.6 trillion parameters, the largest NLP model at the time. In June of the same year, the Beijing Academy of Artificial Intelligence (BAAI, also known as Zhiyuan) released WuDao 2.0, whose parameter count exceeded 1.75 trillion, making it the world's largest intelligent model system at the time. In November of the same year, Alibaba's DAMO Academy released the multimodal model M6, whose parameter count jumped from the trillions to 10 trillion, making it the world's largest AI pre-training model at the time.

Some analysts have pointed out that the core battlefield of the China-US AI arms race is the trillion-parameter pre-training model. Building a pre-trained model at the trillion-parameter scale is a mega-project for humanity, one that could have a major impact on nations and even on human society.

So, is a model better simply because it has more parameters?

Xiang Yang, deputy director of the Cloud Computing Institute in the Network Intelligence Department of Peng Cheng Laboratory, once made this point in an interview with InfoQ:

The earliest models we saw had tens of thousands of parameters; later they reached hundreds of millions, then billions, tens of billions, hundreds of billions, and now possibly trillions. So far, the facts do show that the larger the model, the more data, and the better its quality, the higher the performance. But I personally think this improvement curve may have a bottleneck: once it reaches a plateau, the rate of improvement may slow down or level off entirely. For now, we may not have reached that plateau yet. So the claim that "the more parameters, the better the model" is true only to a certain extent.
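The "improvement, then plateau" intuition in this quote is consistent with the empirical scaling-law literature. Kaplan et al. (2020), for example, found that test loss falls as a power law in parameter count N, with diminishing absolute returns as N grows, until data or compute becomes the binding constraint; their published fit (quoted here purely as illustration) is:

$$
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N},
\qquad \alpha_N \approx 0.076,\quad N_c \approx 8.8 \times 10^{13}
$$

Under this fit, each 10x increase in parameters cuts loss by a fixed factor of about $10^{0.076} \approx 1.19$, which is why the curve keeps improving yet flattens visibly on a linear scale.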

However, judging whether a large model is good should depend not only on its parameter count but on its actual performance. If a model performs its tasks well, we can consider it a good model; the parameter count itself is not the issue. And when machines are strong enough in both storage and computing power, a large model can also be turned into a small one.
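The remark about turning a large model into a small one is not elaborated in the interview; knowledge distillation is one common technique that matches the description. A minimal PyTorch sketch, with toy models and hyperparameters assumed purely for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins: a large "teacher" and a much smaller "student".
teacher = nn.Sequential(nn.Linear(128, 1024), nn.ReLU(), nn.Linear(1024, 10))
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Soften both output distributions with temperature T and match them
    # with KL divergence, so the student mimics the teacher's behavior.
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

opt = torch.optim.AdamW(student.parameters(), lr=1e-3)
x = torch.randn(32, 128)      # one illustrative batch of inputs
with torch.no_grad():
    t_logits = teacher(x)     # the teacher stays frozen during distillation
loss = distillation_loss(student(x), t_logits)
loss.backward()
opt.step()
```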

In addition, a model's interpretability and its robustness to noise must also be considered. If a model offers some explanatory power, it is a good model; likewise, if it is not easily thrown off by noisy data or other disturbances, it is also a good model.

Reference links:

https://www.businesswire.com/news/home/20230522005289/en/Intel%E2%80%99s-Broad-Open-HPCAI-Portfolio-Powers-Performance-Generative-AI-for-Science

https://www.infoq.cn/article/EDSy8OCKbCRc9TiB48PI

https://www.infoq.cn/article/LBhYJYr63GysNIKZCgzI

 
