Speech Record丨Wang Haifeng: AI new infrastructure accelerates industrial intelligence

 2020-09-04 21:51:34

On July 25-26, the 2020 Global Artificial Intelligence Technology Conference was successfully held in Hangzhou, the "Digital Capital". The conference was held under the guidance of the China Association for Science and Technology, the Chinese Academy of Sciences, the Chinese Academy of Engineering, the People's Government of Zhejiang Province, the People's Government of Hangzhou, and the Expert Committee on Artificial Intelligence Development of Zhejiang Province. It was hosted by the Chinese Association for Artificial Intelligence and the People's Government of Yuhang District, Hangzhou, and undertaken by the Management Committee of Hangzhou Future Science and Technology City in Zhejiang Province. In the keynote session on July 25, Wang Haifeng, Baidu's chief technology officer and ACL/CAAI Fellow, gave a speech entitled "AI new infrastructure accelerates industrial intelligence".

Wang Haifeng, Baidu Chief Technology Officer, ACL/CAAI Fellow

The following is a record of Wang Haifeng’s speech:

The topic I share with you today is "New AI Infrastructure Accelerating Industrial Intelligence".

The "new infrastructure" initiative promotes high-quality economic development by building new types of infrastructure. Specifically, it covers information infrastructure, convergence infrastructure, and innovation infrastructure. These new types of infrastructure are guided by new development concepts, driven by technological innovation, built on information networks, and oriented toward high-quality development; they provide services such as digital transformation, intelligent upgrading, and integrated innovation. AI is not only one of the new types of infrastructure that this effort focuses on, but also has important synergies with the other types.

AI has become an important driving force for a new round of technological revolution and industrial transformation, and is leading human society into the fourth industrial revolution. Looking back at the previous industrial revolutions, their core technologies were all highly general-purpose: mechanical technology in the first, electrical technology in the second, and information technology in the third. As the core driving force of the fourth industrial revolution, artificial intelligence is also highly general-purpose. It has already played an important role in many industries, and it shows the hallmarks of industrialized production: standardization, automation, and modularity. AI has entered the stage of industrial-scale production.

According to an IDC report, global data is expected to grow from 33 ZB in 2018 to 175 ZB in 2025, a compound annual growth rate of about 27%. From 2006 to 2020, chip computing performance increased by more than 600 times. The explosion of data, breakthroughs in computing power, and innovations in deep learning algorithms have together driven the rapid development of AI. A complete AI infrastructure must therefore include algorithms, computing power, and data. The AI infrastructure built by Baidu is Baidu Brain. After years of construction, Baidu Brain has developed into a large-scale AI production platform that integrates software and hardware, is highly general-purpose, and has the characteristics of standardization, automation, and modularity.
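The growth figure quoted from the IDC report can be checked with a quick calculation. The sketch below assumes the 27% figure is a compound annual growth rate over the seven years from 2018 to 2025; the function name is illustrative and not taken from any report or library.

```python
def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns `start` into `end`."""
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return (end / start) ** (1 / years) - 1


# Figures quoted in the speech: 33 ZB in 2018, 175 ZB expected in 2025.
rate = compound_annual_growth_rate(33.0, 175.0, 2025 - 2018)
print(f"Implied compound annual growth rate: {rate:.1%}")
```

Under that assumption the implied rate rounds to 27%, which is consistent with the number quoted in the speech.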

At the same time, we deliver the basic AI capabilities of Baidu Brain to all walks of life through Baidu Smart Cloud, helping industries upgrade intelligently. On the basis of Baidu Brain, Baidu Smart Cloud provides multi-level platforms for industrial applications, as well as intelligent industry applications and industry solutions.

At its base layer, Baidu Brain includes computing power, data technology, and an algorithm platform with PaddlePaddle (Flying Paddle) at its core. The perception layer includes speech, vision, augmented reality, and virtual reality technologies. The cognitive layer includes language and knowledge technologies. In addition, a complete security system runs through all levels.

In terms of computing power, Baidu's data centers cover more than 10 countries and regions around the world, forming a strong computing foundation. At the same time, to meet the needs of building AI infrastructure, Baidu has developed its own general-purpose AI processor, the Baidu Kunlun chip. The Kunlun chip is deeply integrated with the PaddlePaddle deep learning platform and optimized for AI models in areas such as speech, image, and natural language processing, which has greatly improved performance.

The deep learning platform connects to chips below and supports applications above. It is the operating system of the AI era and the core foundation of AI infrastructure. PaddlePaddle (Flying Paddle) is China's first fully functional, industrial-grade open source deep learning platform. It includes not only a core framework with leading technology and complete capabilities, but also a large number of fully verified industrial-grade model libraries, as well as a wealth of development kits and tool components. The leading technologies in the PaddlePaddle platform can be summarized in four aspects: a deep learning framework that is convenient to develop with, ultra-large-scale deep learning model training technology, a high-performance inference engine that can be deployed on multiple devices and platforms, and a rich industrial-grade open source model library.

First, let us look at the industrial-grade deep learning framework, which is designed to be convenient to develop with. The PaddlePaddle framework supports both static and dynamic graphs, and can also design network structures automatically. PaddlePaddle's ultra-large-scale training platform supports distributed training on tens of billions of training samples, hundreds of billions of features, and trillions of parameters, as well as distributed training of large-scale classification tasks with tens of millions of categories.

All trained models are eventually deployed on various devices for applications to use. PaddlePaddle's high-performance inference engine supports deployment across multiple devices and platforms, and can also integrate seamlessly with other deep learning frameworks. The supported platforms include almost all mainstream CPU and GPU platforms as well as FPGAs, across multiple operating systems. Among general-purpose frameworks, PaddlePaddle's inference speed leads across the board.

The deep learning framework is the core foundation of the deep learning platform. When developers actually use the platform, they often need model libraries suited to their application scenarios. PaddlePaddle therefore also provides a very rich, industry-proven model library that covers almost every important application direction of AI, such as computer vision, natural language processing, speech, and recommendation. These libraries are organized into different levels, including an algorithm layer and a task layer, and also include end-to-end development kits. PaddlePaddle now has more than 140 models and more than 200 industrial-grade pre-trained models.

Next, I will introduce the core technical capabilities of Baidu Brain, covering the perception layer and the cognitive layer. Let us start with speech technology. Baidu began developing speech recognition technology in 2010 and began using deep learning for speech recognition systems in 2012. We have developed a streaming multi-layer truncated attention model, which greatly improved recognition accuracy, and an end-to-end model that integrates acoustic enhancement with acoustic modeling. This speech recognition system is used not only in common scenarios such as computers and mobile phones, but increasingly also in far-field scenarios such as homes and cars. Far-field speech has to be processed with a microphone array, so we developed end-to-end recognition that integrates microphone-array sound enhancement with acoustic modeling. It reduces the word error rate of speech recognition by 40% to 50%, and the interaction success rate of online products has also risen substantially.

Speech synthesis embraced deep learning somewhat later than speech recognition, but in recent years we have applied it at large scale and made great progress, for example moving from Parallel WaveRNN to GAN-based ultra-high-definition WaveRNN speech synthesis. Traditional speech synthesis can offer only a small number of voices; although these voices are clear and fluent, their style is rather uniform. With Baidu Brain's latest speech synthesis technology, the timbre and style of a voice can be combined dynamically, so that a voice recorded in a single style can be transferred to multiple styles. Please listen to a few samples. The first is the voice of a real person. From such a voice, we can extract its characteristics and synthesize various other voices, for example in English. You can hear that although the language has changed from Chinese to English, this person's vocal characteristics are well preserved. Next, let us listen to two more styles, storytelling and talk show. All of the audio synthesized by Baidu Brain preserves the original speaker's vocal characteristics well.

With this kind of personalized speech synthesis, we can synthesize the voices needed for all kinds of application scenarios. For example, Baidu Maps lets users navigate with their own voice. Many users synthesize their children's voices with Baidu Brain and listen to them every day as they navigate to and from work.
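For readers unfamiliar with the word error rate mentioned in the speech recognition discussion above, here is a minimal sketch of how the metric is commonly computed: the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. The sample sentences and the baseline numbers are hypothetical, and the sketch assumes the 40% to 50% figure is a relative reduction; none of this is Baidu's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which an error rate dropped relative to its old value."""
    return (before - after) / before


# Hypothetical far-field example: one dropped word and one wrong word.
wer = word_error_rate("turn on the living room light",
                      "turn on living room lights")
print(f"WER: {wer:.2f}")

# Hypothetical numbers: a WER falling from 10% to 5.5% is a 45% relative drop.
print(f"Relative reduction: {relative_reduction(0.10, 0.055):.0%}")
```

The second function shows why a 40% to 50% improvement does not mean the error rate falls by 40 to 50 percentage points; under the relative reading, it means roughly half of the previous errors disappear.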

Baidu's portfolio of vision technology is also very comprehensive, covering images, video, augmented reality, and virtual reality, as well as various vertical categories such as the recognition of vehicles, characters, and human bodies. Vision technology also requires software and hardware integration tailored to different application scenarios, so we have developed many vision AI SDKs and 3D structured light modules. All of these capabilities are offered externally through the open platform. Here we can see more vision technologies: for video, semantic analysis, comparison and retrieval, recognition of long-form film and television video, and cover selection; for images, similarity retrieval and general object and scene recognition; and for people, liveness detection, key point localization, card and certificate recognition, and so on.

Now let us look at the cognitive layer. Language and knowledge are closely tied to human cognition. Knowledge is the crystallization of human wisdom, and its accumulation and inheritance support the continuous progress of humankind. Language is not only a tool for human communication, but also the carrier through which knowledge is accumulated and passed on. From the perspective of AI technology, language technology and knowledge technology are likewise closely related. Baidu's language and knowledge technologies include knowledge graphs, language understanding, and language generation, along with applications such as intelligent search, deep question answering, machine translation, dialogue systems, and intelligent writing.

The most fundamental of these is the knowledge graph. To mine knowledge from massive data, including Internet big data and data from various industries, we need large-scale knowledge mining as well as knowledge integration and knowledge completion. On this basis we have built a huge knowledge graph, and on top of the graph we can further perform knowledge inference, computation, and retrieval. Based on multi-source heterogeneous big data, we have built the world's largest knowledge graph. Baidu's knowledge graph now has more than 5 billion entities and more than 550 billion facts. It includes not only a basic entity graph composed of entities and their attributes, but also the graphs required by various applications, such as POI graphs, industry graphs, and video understanding graphs.
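To make the terms "entities" and "facts" above concrete, here is a toy sketch of a knowledge graph stored as subject-predicate-object triples. It is only an illustration under simplifying assumptions: the class, its methods, and the predicate names are hypothetical, the three facts are taken from this speech, and nothing here reflects how Baidu's knowledge graph is actually built or stored.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    subject: str
    predicate: str
    obj: str


class TinyKnowledgeGraph:
    """Toy in-memory triple store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._facts: set[Fact] = set()
        self._by_subject: dict[str, list[Fact]] = defaultdict(list)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        fact = Fact(subject, predicate, obj)
        if fact not in self._facts:  # ignore duplicate facts
            self._facts.add(fact)
            self._by_subject[subject].append(fact)

    def objects(self, subject: str, predicate: Optional[str] = None) -> list[str]:
        """Retrieve what is known about a subject, optionally by predicate."""
        return [f.obj for f in self._by_subject.get(subject, [])
                if predicate is None or f.predicate == predicate]

    def subjects(self, predicate: str, obj: str) -> list[str]:
        """Inverse lookup: which subjects have this predicate and object?"""
        return sorted(f.subject for f in self._facts
                      if f.predicate == predicate and f.obj == obj)

    @property
    def fact_count(self) -> int:
        return len(self._facts)

    @property
    def entity_count(self) -> int:
        # Simplification: every subject and every object counts as an entity.
        return len({f.subject for f in self._facts} | {f.obj for f in self._facts})


graph = TinyKnowledgeGraph()
graph.add("Kunlun", "developed_by", "Baidu")
graph.add("ERNIE", "developed_by", "Baidu")
graph.add("Wang Haifeng", "chief_technology_officer_of", "Baidu")

print(graph.entity_count, graph.fact_count)
print(graph.subjects("developed_by", "Baidu"))
```

Even in this toy form, the two operations mirror the retrieval step described above: a forward lookup from an entity to its attributes, and an inverse lookup from an attribute value back to the entities that share it.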

Language is the basic tool of human communication, so one of the important tasks of AI is to understand language. After decades of development, natural language understanding has gradually moved from morphology and syntax to the stage of semantic understanding. Wenxin (ERNIE), the knowledge-enhanced semantic understanding framework developed by Baidu, has two very important capabilities. First, it combines deep learning with knowledge to create a knowledge-enhanced semantic understanding framework. Second, the framework can learn continually, so its language understanding ability keeps improving. The light blue line in this picture is the previous best result. Applying the knowledge-enhanced semantic understanding framework on top of that improves the results significantly. If we then keep adding knowledge, such as dialogue knowledge, text structure knowledge, web page knowledge, and semantic relationship knowledge, we can see that ERNIE's semantic understanding ability continues to strengthen. This is a very important feature: it lets us keep improving AI's language understanding and keep enriching the data in practical applications, with better and better results. The application results below show large improvements over the original results across different applications. ERNIE not only understands natural language, but can also be combined with perception technologies to form cross-modal semantic understanding. For example, by combining ERNIE with computer vision technology, we released ERNIE-ViL, the industry's first cross-modal pre-training model that integrates scene graph knowledge, which ranked first on the Visual Commonsense Reasoning (VCR) task, an authoritative leaderboard in the cross-modal field.

What I have introduced so far is the core technology of Baidu Brain, the AI infrastructure created by Baidu. On the basis of the Baidu Brain technology platform, we use Baidu Smart Cloud to deliver Baidu's AI capabilities to all walks of life, support their intelligent upgrading, and accelerate industrial intelligence. This is a panorama of Baidu Smart Cloud, including platforms, intelligent industry applications, and industry solutions.

The basic cloud platform is the foundation of Baidu Smart Cloud. It draws on Baidu's technical strengths and accumulated practice to form a complete system of products and services that supports the various upper-level applications. First comes an elastic infrastructure, including large-scale data centers, high-performance computing and storage, self-developed chips, and high-performance networks. On top of this, a full range of products has been built, including cloud hosting, cloud storage, and cloud security, to give customers solutions and services that cover all scenarios and are cost-effective, easy to operate and maintain, secure and compliant, and highly flexible.

Next, I will talk about the AI middle platform. The intelligent upgrade of enterprises requires AI, and AI in turn needs to be deeply integrated with enterprise application scenarios, but many enterprises lack basic AI capabilities and platforms. To address this, we have built an enterprise-specific AI middle platform based on Baidu Brain that manages a company's AI capabilities centrally and coordinates its intelligent upgrade. It includes an AI capability engine, an AI development platform, and basic management capabilities such as data management, service management, permission management, resource management, and operation and maintenance management. For example, the AI middle platform that Baidu Smart Cloud built for State Grid Shandong Electric Power supports construction machinery detection, heavy smoke and wildfire detection, detection of foreign objects on wires, and VIP customer identification, providing good support for the intelligent upgrade of Shandong Electric Power's business.

When I introduced Baidu Brain earlier, I mentioned knowledge. We all know the saying that books are the ladder of human progress; here I want to say that knowledge is the ladder of AI progress. Every company has its own unique knowledge, yet many companies lack the ability to build and apply it. Baidu Smart Cloud has therefore also tailored a knowledge middle platform for enterprises. The knowledge middle platform helps companies consolidate knowledge and empower their business, and in turn helps them upgrade intelligently. It is based on various core AI technologies, and its core capabilities include knowledge production, knowledge processing, and knowledge application. On this basis, a product matrix around enterprise knowledge can be formed, such as enterprise search, intelligent recommendation, intelligent knowledge bases, industry knowledge graphs, and decision engines.

The AI middle platform and knowledge middle platform tailored for enterprises can support intelligent upgrades in all aspects of enterprise operations, including the office. As we all know, earlier industrial revolutions brought humankind the assembly line, which greatly improved the efficiency of industrial production. In the AI era, the AI middle platform and knowledge middle platform make it possible to create an assembly line for office work, greatly improving office efficiency. In my view, a smart office platform should include the following flows: a communication flow, covering communication between colleagues, group chats, and the corporate address book; a workflow, covering scheduling, meetings, organization, schedule reminders, project setup, progress tracking, and collaboration; and a knowledge flow, covering retrieval, recommendation, and Q&A over the rich knowledge inside the enterprise. Baidu's new-generation smart office platform is Ruliu, and we hope that with Ruliu's support, office work in the AI era will be as smooth as drifting clouds and flowing water. Shown here are the workflow, communication flow, and knowledge flow of the Ruliu product. With the support of these flows, our work efficiency is improving rapidly.

The cities we live in are also an important scenario for AI empowerment. The smart city solution provided by Baidu Smart Cloud includes a city perception middle platform, which collects multi-source data and maps all elements of the city; a city AI middle platform, which is responsible for city algorithms, computing power scheduling, and operation management; and a city data middle platform, which is responsible for the integration, governance, and analysis of urban data. There is also an important city interaction platform, which includes a unified spatio-temporal map and smart one-click search. On this basis, the solution supports various smart application scenarios in the city, such as emergency management, city management, public safety, smart transportation, smart parks, and smart education. We hope that the smart city solutions provided by Baidu Smart Cloud can make cities safer, calmer, smoother, and more livable.

These are some examples of Baidu's city brain applications. By combining Baidu Maps road network data with construction site and satellite data, the driving trajectories of muck trucks can be tracked, making their management more efficient. Another very important application scenario in the city is transportation. Baidu's smart city solutions also include smart transportation solutions, such as smart signal control and vehicle-road collaboration. These solutions are likewise built on Baidu Brain technologies such as PaddlePaddle and on data from various sources; processing this data ensures efficient operation at the business level. Let me give some examples from other application scenarios. In meteorology, based on a meteorological big data platform and the BML machine learning platform provided by Baidu Brain, we have created a meteorological brain dedicated to that field. It supports applications such as weather forecasting, flood forecasting, and fire point monitoring with meteorological satellites.

In finance, we also provide a basic financial platform, which includes the AI middle platform and knowledge middle platform mentioned earlier, as well as the financial cloud, distributed financial database, financial graph database, blockchain, and other components that the financial field requires. This platform supports various smart financial applications, such as smart risk control, smart marketing, digital employees, risk identification, risk prediction, and risk pricing, forming a complete smart finance solution that serves customer acquisition, risk control, and operations. Baidu Smart Cloud's smart finance offering has served about 200 financial customers, has been applied in more than 10 financial scenarios, and, together with more than 30 partners, is committed to providing the best service to financial customers.

Medical care is also a field that matters to every one of us, and improving the level of medical care undoubtedly benefits the whole of society. The smart medical system created by Baidu Smart Cloud includes a medical AI middle platform and a medical knowledge middle platform. The medical AI middle platform includes core technologies such as medical imaging screening, assisted diagnosis, treatment plan recommendation, medical record structuring, medical speech recognition, and quality control of medical orders, while the medical knowledge middle platform includes medical knowledge graphs and medical knowledge bases. Based on these two platforms, we can provide a variety of intelligent medical applications, such as fundus screening, COVID-19 pneumonia screening, clinical decision support, rational drug use, medical record quality control, and chronic disease management, covering screening, diagnosis, management, and other stages. We hope to use evidence-based AI to bring new vitality to medical care. Baidu Smart Healthcare has moved from technology to large-scale application, covering 27 provinces, municipalities, and autonomous regions across the country and more than 1,500 primary medical institutions, and serving more than 25 million people.

Baidu Smart Cloud also plays a role in industrial manufacturing. Through the AI and data middle platforms, it supports the intelligent upgrade of manufacturing and provides intelligent quality inspection, process optimization, and production scheduling. The industries covered by these AI applications include steel, water, electricity, 3C, and automobiles, helping them innovate, improve quality, and reduce costs. For example, industrial quality inspection can inspect small 3C parts and components. These inspection capabilities, including the inspection of notebook casings, have been deployed at large scale in many factories. Another example is the inspection of vehicle lights in final assembly. The inspection must avoid interference from external light sources, and the inspection cycle is short, so the demands on detection speed are very high. AI capabilities play a very important role here. A further example is industrial safety inspection. Whether workers in a factory comply with safety regulations, such as whether they wear helmets or enter areas they should not, can be detected accurately. Intelligent detection across many types of complex scenes, such as cranes, foreign objects on wires, smoke and fire, tower cranes, and construction machinery, is highly accurate and has been widely applied.

In response to the security threats and challenges that AI industrialization faces, Baidu continues to pursue security technology innovation and engineering practice, and keeps improving its integrated security system. The complete security system created by Baidu includes AI model security, cloud-native security, integrated device-edge-cloud security, data security and privacy protection, security solutions for industry applications, and industry ecosystem security. These security capabilities are also delivered through Baidu Smart Cloud, safeguarding the intelligent upgrading of industry.

Build AI infrastructure and accelerate industrial intelligence. Baidu joins hands with all walks of life to promote "new infrastructure" and create a new future together!

Origin blog.csdn.net/weixin_42137700/article/details/108640051