Baidu CTO Wang Haifeng: Large language models bring the dawn of general artificial intelligence

On August 16, the WAVE SUMMIT Deep Learning Developer Conference 2023, hosted by the National Engineering Research Center for Deep Learning Technology and Applications, was held in Beijing. Wang Haifeng, chief technology officer of Baidu and director of the center, gave a keynote speech. He stated for the first time that large language models possess the core basic capabilities of artificial intelligence (understanding, generation, logic, and memory) and so bring the dawn of general artificial intelligence.

Fei Paddle now has 8 million developers and more than 800,000 models

The WAVE SUMMIT Deep Learning Developer Conference was first held in April 2019. At that first conference, Wang Haifeng proposed that deep learning is highly versatile and has the standardization, automation, and modularization that characterize industrial mass production, and that it is therefore pushing artificial intelligence into the stage of industrial mass production. Over the past four years, the development of deep learning technology and its applications has borne out this view. Deep learning technology has become increasingly versatile, and the standardized, automated, and modular character of deep learning platforms has become increasingly prominent. The rise of large pre-trained models has further expanded the depth and breadth of artificial intelligence applications. Artificial intelligence has entered the stage of industrial mass production.

In terms of standardization, frameworks and models are jointly optimized, multiple kinds of hardware are adapted in a unified way, and application patterns are simple and efficient, which greatly lowers the barrier to applying artificial intelligence. In terms of automation, the whole path from training and adaptation to inference deployment is automated, which improves the efficiency of the entire artificial intelligence research and development process. In terms of modularization, a rich industrial-grade model library makes it convenient to apply artificial intelligence across a wide range of scenarios.

Thanks to the way Fei Paddle (PaddlePaddle), an industrial-grade open-source deep learning platform, and the Wenxin (ERNIE) large models reinforce each other, the Fei Paddle ecosystem has grown increasingly prosperous. It has gathered 8 million developers and served 220,000 enterprises and institutions, and 800,000 models have been created on Fei Paddle. Wang Haifeng explained the meaning behind "Galaxy Community", the Chinese name of AI Studio, the Fei Paddle developer community: "With a sincere heart and a flying paddle, we can sail to the stars." Together with all developers, and with the support of Fei Paddle and Wenxin, Baidu will build the Galaxy Community and head for the starry sea of general artificial intelligence.

Large language models bring the dawn of general artificial intelligence

Wang Haifeng said that artificial intelligence has a variety of typical abilities, of which understanding, generation, logic, and memory are the core basic abilities. The stronger these four abilities are, the closer artificial intelligence comes to general artificial intelligence. Large language models have all four abilities and thus bring the dawn of general artificial intelligence.

Specifically, the typical abilities of artificial intelligence, such as creation, programming, problem solving, and planning, all rely on the core basic abilities of understanding, generation, logic, and memory, each to a different degree. Take problem solving as an example: reading the question, working through it, and finally writing out the answer requires the combined use of understanding, memory, logic, and generation.

How are these abilities obtained? Taking Wenxin Yiyan (ERNIE Bot) as an example, a pre-trained large model is first obtained through fusion learning over trillion-scale data and hundred-billion-scale knowledge. On this basis, techniques such as supervised fine-tuning, reinforcement learning from human feedback, and prompting are applied, and the model also has the technical advantages of knowledge enhancement, retrieval enhancement, and dialogue enhancement.

Furthermore, the model's basic general capabilities are comprehensively improved through technological innovations such as multi-strategy optimization of data sources and data distribution, long-text modeling in the base model, multi-type and multi-stage supervised fine-tuning, multi-task adaptive supervised fine-tuning, and multi-level, multi-granularity reward models. On top of retrieval enhancement and knowledge enhancement, knowledge point enhancement improves the mastery and application of world knowledge. Logical capabilities are improved through large-scale construction of logical data, logical knowledge modeling, multi-granularity combination of semantic knowledge, and symbolic neural networks. The security of large models is ensured by building a comprehensive security system that covers data, content, model, and system security.

In terms of efficiency, through Fei Paddle's end-to-end adaptive hybrid parallel training technology and the collaborative optimization of compression, inference, and service deployment, the training speed of the Wenxin large model has reached 3 times its previous level, and its inference speed has reached more than 30 times its previous level.

In terms of application, data-driven methods, prompt construction, and plug-in enhancement are used for scenario adaptation and collaborative optimization. Wenxin Yiyan has launched five major plug-ins: Baidu Search, Browsing Documents, E-Word Easy Pictures, Shuo Tu Jie Hua, and Yijing Liuying. These give the model the ability to generate real-time and accurate information, to summarize long texts and answer questions about them, to produce data insights and charts, to create and answer questions based on images, and to create videos. The plug-in mechanism expands the capabilities of large models and makes them better suited to the needs of specific scenarios. Wang Haifeng said that in the future, Baidu will build a plug-in ecosystem together with developers and share the results of technological innovation.

Artificial intelligence represented by large language models is penetrating thousands of industries, accelerating industrial upgrading and economic growth. In this process, technological innovation and real-world application form a virtuous cycle: capabilities such as understanding, generation, logic, and memory continue to improve, and the breadth and depth of industrial applications continue to expand. In this way, large language models bring the dawn of general artificial intelligence.

Origin blog.csdn.net/ZabeNbRdit36243qNJX1/article/details/132331507