Comprehensive intelligence, Huawei's "hard kung fu" | Huawei Connect 2023


Text丨 Yao Yue Hao Xin, editor丨Wang Yisu

On September 20, Huawei announced its "All Intelligence" strategy, marking a new stage in the technology giant's strategic path.

Meng Wanzhou, Huawei's Vice Chairman, Rotating Chairman, and CFO, said at Huawei Connect 2023 that under the All Intelligence strategy Huawei will keep cultivating AI root technology and build a solid computing power base, giving the world a second choice, enabling hundreds of models in thousands of forms, and empowering thousands of industries.

The world is accelerating into the fast lane of digital intelligence. How is Huawei's comprehensive intelligence strategy different?

  • This intelligence strategy looks up at the stars while keeping its feet on the ground: building basic technologies and facilities such as AI and computing power is given the same weight as accelerating "industry implementation";
  • The core intelligence business can be roughly divided into "hardware" and "software". The ICT infrastructure and Enterprise BG units led by Wang Tao and the Huawei Cloud unit led by Zhang Ping'an are responsible for implementation;
  • In research and accumulation of basic technologies, Huawei has the most comprehensive accumulation of any company in China, covering almost the entire intelligence industry chain from perception and connection to computing and AI. This year, on core technologies, it places special emphasis on building underlying computing power, improving AI root technology, and improving Huawei Cloud's system-level capabilities;
  • In industry implementation, Huawei has released intelligent solutions for nine major industries. Its industry legions focus on "blasting" open individual industries, while Huawei Cloud uses the Pangu model to drive industry adoption. The two lines run in parallel to deepen industry intelligence.

Looking back over Huawei's overall strategy, the emergence of Kirin chips, Ascend computing clusters, and the Pangu large models is no accident. At the same time, Huawei is rallying "all forces that can be united" with a very open attitude, and its platform ambitions and grand plan for intelligence are on full display.

In 2016, at the first Huawei Connect, Huawei comprehensively set out the strategic positioning of Huawei Cloud for the first time and started down the road of digital transformation. In 2018, it officially released its AI strategy and its full-stack, all-scenario AI solutions. By 2020, as industry digitalization and intelligence took off, Huawei was connecting people, things, and information in all dimensions and at multiple levels through the collaboration of "5 machines": connection, cloud, computing, AI, and applications.

Huawei entered from the communications side and is a "latecomer" in AI and cloud computing technology and industry layout, but it is arriving with great momentum. Light Cone Intelligence will interpret Huawei Connect 2023 in two parts, "Hard Kung Fu" and "Soft Power". If you understand Huawei's intelligence strategy, you will also understand the future of the intelligence industry.

It doesn't matter where you come from, it's where you go that matters.

(Huawei Industry Intelligent Reference Architecture)

01 How to consolidate the computing power base without blindly stacking cards?

With the explosion of large models, intelligence worldwide is being bottlenecked by computing power. The importance of computing power is self-evident, and Huawei, which has felt that pain first-hand, understands this well.

In the comprehensive intelligence strategy Huawei released this time, the most important step is to strengthen the computing base. The Ascend AI computing cluster has been upgraded again with the release of the Atlas 900 SuperCluster, which can support training of large models with over one trillion parameters. Judging from this upgrade, Huawei is clearly gearing up for the "10,000-card era" of computing clusters.

In 2019, Huawei released the Atlas 900 AI training cluster, which at that time consisted of thousands of Huawei's self-developed Ascend 910 (mainly used for training) AI chips. In July this year, Huawei announced that it plans to have Atlas 900 reach a cluster of more than 16,000 cards by the end of this year or early next year. (In the era of "violent computing" in large models, how can Ascend break through the computing power dilemma? | WAIC2023)

Architecturally, Atlas 900 SuperCluster changes the server stacking model of traditional computing clusters and achieves an integrated design of computing power, transportation capacity, and storage capacity.

In the von Neumann architecture, data storage and computing units are separated, which limits the computing power of the cluster and increases power consumption. If the computing cluster is compared to a person, then for every instruction he receives he has to go back and read the instruction manual (memory). As model parameters, complexity, and count soar, these round trips to the manual become extremely frequent. The obvious result is "twice the effort for half the result", and the system may even "collapse".

In the TV drama "The Three-Body Problem", Qin Shi Huang trained a large computer made up of real people, but it collapsed because the signal speed was too fast.

Therefore, the integrated architecture design avoids frequent data transfer between storage units and computing units, reduces power consumption caused by unnecessary data movement, improves computing efficiency, and makes it easier to break through computing power bottlenecks.

A computing cluster is a combination of multiple servers, and the communication efficiency between servers is also critical.

However, Ethernet, which was previously widely used for data communication between servers, has some significant flaws. The biggest is that it cannot guarantee that data is delivered within a required time.

Ethernet uses carrier sense multiple access with collision detection (CSMA/CD). This mechanism behaves like an "obsessive-compulsive" sender: it transmits only when the shared channel is idle, keeps monitoring during transmission for collisions with other senders, and stops sending immediately if it detects one.

This seemingly "responsible" protocol mechanism has two major weaknesses.

First, collisions can only be detected, not avoided, so transmission stalls when collisions become too frequent.

Second, when one sender transmits a large amount of data, the others have to wait and their capacity sits idle. This is clearly unsuitable for scenarios that require real-time response, such as industrial control and online games.
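The contention behaviour described above can be illustrated with a toy slotted simulation. This is a minimal sketch rather than a model of real Ethernet: the function name `simulate_csma_cd`, the slot model, and the choice of binary exponential backoff as the retry rule are all assumptions made for illustration, and timing details such as carrier-sense delay and frame length are ignored.

```python
import random


def simulate_csma_cd(num_stations, seed=0, max_slots=100_000):
    """Toy slotted model in which every station has exactly one frame to send.

    Returns (slots_used, collisions, delivered). Illustrative only.
    """
    rng = random.Random(seed)
    backoff = [0] * num_stations    # slots each station still has to wait
    attempts = [0] * num_stations   # collisions seen so far, per station
    pending = set(range(num_stations))
    collisions = 0
    slots = 0

    while pending and slots < max_slots:
        slots += 1
        # Stations whose backoff has expired see an idle channel and transmit.
        ready = [s for s in pending if backoff[s] == 0]
        for s in pending:
            if backoff[s] > 0:
                backoff[s] -= 1
        if len(ready) == 1:
            pending.discard(ready[0])   # sole sender: frame delivered
        elif len(ready) > 1:
            collisions += 1             # collision detected: every sender stops
            for s in ready:
                attempts[s] += 1
                window = 2 ** min(attempts[s], 10)
                backoff[s] = rng.randrange(window)  # wait 0..window-1 slots
    return slots, collisions, num_stations - len(pending)


if __name__ == "__main__":
    for n in (1, 10, 50):
        print(n, simulate_csma_cd(n))
```

Running it with more stations shows the number of slots lost to collisions growing with contention, which is the first weakness the text points out.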

Moreover, despite the continuous evolution of Ethernet technology, the bandwidth and transmission distance of copper cables have become bottlenecks that are difficult to break through.

Atlas 900 SuperCluster uses Huawei's star AI intelligent computing switch CloudEngine XH16800, which has high-density 800GE port capabilities. The 800GE port is a new data center switch interface launched by Huawei. Its purpose is to achieve higher data transmission efficiency and lower latency.

With CloudEngine XH16800 in a two-layer switching network, a cluster can scale to an ultra-large network of 2,250 nodes (one node is one server), equivalent to 18,000 cards, without having to deliberately reduce the number of nodes or the data traffic in the network through algorithms or other techniques to maintain efficiency.
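The two figures quoted above imply a fixed number of cards per server. A quick check using only those two numbers (the variable names are illustrative):

```python
nodes = 2_250     # servers in the two-layer network, as quoted in the text
cards = 18_000    # total cards, as quoted in the text

cards_per_node, remainder = divmod(cards, nodes)
print(cards_per_node, remainder)  # 8 0
```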

In addition, Wang Tao, Huawei's Managing Director, Director of the ICT Infrastructure Business Management Committee, and President of Enterprise BG, said that, building on Huawei's comprehensive strengths in computing, networking, storage, energy, and other fields, the Atlas 900 SuperCluster improves system reliability at the device, node, cluster, and business levels, raising the stability of large model training from the "day" level to the "month" level.

Moreover, it takes software to unlock the performance of the hardware.

Huawei has simultaneously released the CANN 7.0 heterogeneous computing architecture, which aims to be "more compatible and open". It is compatible with the industry's AI frameworks, acceleration libraries, and mainstream large models, and it also opens up underlying capabilities so that AI frameworks and acceleration libraries can call and manage computing resources more directly, allowing developers to customize high-performance operators.

At the same time, Huawei has also upgraded the Ascend C programming language to simplify operator implementation logic, shorten the development cycle of fusion operators, and make AI model and application development faster.

Faced with the rise of large models, Huawei is taking a relatively comprehensive approach to building the computing power base. Rather than blindly stacking cards, it is optimizing the speed and stability of computing power across clusters, networks, software, and more.

02 Perception and connection: how does Huawei's expertise support intelligence?

As we all know, Huawei started in the communications industry, and its communications capabilities accumulated over the years have become the core "data highway." Not only that, Huawei has also accumulated "data acquisition" perception capabilities in recent years.

The Huawei Industry Intelligent Reference Architecture shown at the beginning makes clear that these two basic capabilities, together with the computing power base, form the foundation of intelligence.

Perception is the prerequisite for the intelligence of all physical entities, and it is also the basis for the intelligent transformation of industries.

Huawei's layout in intelligent sensing first started with "intelligent security", focusing on the public safety field. Later, the technology was upgraded to "AI + vision" to expand the management and operation fields, and the department name was also changed to "machine vision".

At the beginning of this year, Huawei once again changed its business unit "Huawei Machine Vision" to "Huawei Industry Perception" in order to leverage the advantages of multiple product portfolios such as terminals, edges, clouds, and networks, and to integrate multiple technologies such as visual perception and light perception.

Over the years, Huawei's sensing technology has covered various types of sensing equipment such as radar, vision, temperature sensing, air pressure sensing, and optical fiber sensing, and can obtain data online in real time.

However, achieving "ubiquitous" sensing faces a big obstacle: there are many types of sensing terminals and their protocols are a patchwork of incompatible standards, which makes it difficult to exchange data and to support complex business scenarios.

To put every "cart on the same track", that is, to standardize, we have to mention Hongmeng Sensing, an intelligent terminal system built around the Hongmeng Intelligent Link operating system.

Traditional IoT sensing terminal operating systems are fragmented, which leads to "data fragmentation" and "data islands". By contrast, Hongmeng is a flexibly composable operating system, so devices in every scenario share a unified core. In other words, whatever the size of the device, only one operating system is needed.

Hongmeng Sensing enables horizontal interoperability between terminal devices. Hongmeng uses distributed soft bus technology: the protocol differences between devices are shielded by the protocol shelf and the software-hardware collaboration layer, while the bus hub module parses commands to complete discovery and connection between devices. It is like a team in which the barriers between members are removed as far as possible and the leader coordinates everyone's work.
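The division of labour described above, with one layer hiding per-device protocol differences and a hub handling discovery and connection, resembles the adapter pattern. The sketch below is purely illustrative: the class names `ProtocolAdapter` and `BusHub`, the device IDs, and the two toy wire formats are hypothetical and are not part of any Hongmeng API.

```python
from abc import ABC, abstractmethod


class ProtocolAdapter(ABC):
    """Hypothetical adapter: turns one device protocol into a common reading."""

    @abstractmethod
    def read(self, raw: bytes) -> dict:
        ...


class CsvTemperatureAdapter(ProtocolAdapter):
    # Toy wire format (assumed): b"temp,23.5"
    def read(self, raw: bytes) -> dict:
        kind, value = raw.decode("ascii").split(",")
        return {"kind": kind, "value": float(value)}


class BinaryPressureAdapter(ProtocolAdapter):
    # Toy wire format (assumed): 2-byte big-endian integer
    def read(self, raw: bytes) -> dict:
        return {"kind": "pressure", "value": float(int.from_bytes(raw, "big"))}


class BusHub:
    """Hypothetical hub: registers devices and returns uniform readings."""

    def __init__(self):
        self._devices = {}

    def connect(self, device_id: str, adapter: ProtocolAdapter) -> None:
        self._devices[device_id] = adapter

    def discover(self) -> list:
        return sorted(self._devices)

    def receive(self, device_id: str, raw: bytes) -> dict:
        # Callers never see the device-specific protocol.
        return self._devices[device_id].read(raw)


hub = BusHub()
hub.connect("thermo-1", CsvTemperatureAdapter())
hub.connect("baro-1", BinaryPressureAdapter())
readings = [
    hub.receive("thermo-1", b"temp,23.5"),
    hub.receive("baro-1", (1013).to_bytes(2, "big")),
]
print(hub.discover(), readings)
```

The point of the design is that adding a new kind of sensor means writing one more adapter, while code that consumes the readings stays unchanged.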

This technology supports seamless collaborative connections and data transmission between devices. In Huawei's own words, "multiple sensing devices collaborate automatically and can act like a single physical device."

In addition to Hongmeng Sensing, technologies and components such as multi-dimensional perception and integrated sensing and communication will also be used. The ultimate goal is to open up the terminal ecosystem, bring terminals with complex protocols and isolated systems into organic collaboration, and obtain complete and comprehensive information to support subsequent intelligent business processing.

Huawei, which started out as a communications equipment company, has accumulated even deeper capabilities in communications. Its technology layout spans satellite communications, the Internet of Things, cloud computing, and more, and its 5G technology is considered world-leading. Recently, Huawei completed all functional tests of 5G-A. Its short-range wireless technology Star Flash (NearLink) covers twice the distance of Bluetooth, supports hundreds of directly connected devices, and suits fields such as new energy vehicles and industrial manufacturing, providing connectivity for the "Internet of Everything".

Based on these capabilities, Huawei can comprehensively improve the efficiency of intelligent network communications in the industry and smooth the road to intelligent systems.

If the industry intelligent system is compared to a complex logistics system, then data upload, data distribution, model training, and so on all involve moving "packages" through that system. If the "packages" are lost or damaged, the whole system suffers.

For example, in a data center, the packet loss rate of the AI training cluster network greatly affects computing power efficiency. A packet loss rate of one in ten thousand reduces computing power by 10%, while a packet loss rate of one in a thousand reduces it by 30%.
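To make these figures concrete, the sketch below applies the two quoted data points to a nominal cluster size. It is only a lookup on the article's two numbers: no interpolation between or beyond them is attempted, and the function name `effective_compute` and the 18,000-card example are chosen for illustration.

```python
# Reduction in computing power (percent) quoted in the text,
# keyed by "one packet lost in N".
QUOTED_REDUCTION_PERCENT = {
    10_000: 10,  # 1 in 10,000 lost -> 10% lower
    1_000: 30,   # 1 in 1,000 lost  -> 30% lower
}


def effective_compute(nominal, one_in):
    """Return the compute left at a quoted loss rate; raise for any other rate."""
    if one_in not in QUOTED_REDUCTION_PERCENT:
        raise ValueError("no figure quoted in the text for this loss rate")
    return nominal * (100 - QUOTED_REDUCTION_PERCENT[one_in]) / 100


print(effective_compute(18_000, 10_000))  # 16200.0
print(effective_compute(18_000, 1_000))   # 12600.0
```

Read this way, an 18,000-card cluster losing one packet in a thousand would deliver roughly the compute of 12,600 cards, if the quoted percentage applied uniformly.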

Therefore, in a huge industry intelligent system, absolutely efficient connections must be ensured, mainly involving access networks, wide area networks, and data center networks.

The access network is responsible for the access and aggregation of sensing devices to the data center network or wide area network. Huawei uses 5G-A, F5G Advanced, Wi-Fi 7, Hyper-Converged Ethernet (HCE), IPv6+ and other technologies to achieve stable, reliable, and low-latency sensing device access.

Large enterprises with multiple branches have many cross-branch data transfer scenarios, such as training data upload, algorithm model delivery, business application delivery, and business data transmission. They therefore need a stable, reliable, high-bandwidth wide area network between branches.

Enterprises can choose to rent an operator's network or build their own wide area network based on their actual conditions to obtain stable, reliable, high-bandwidth network connection capabilities between multiple branches.

With the rise of large AI models, large model training has become an important responsibility of the data center. Its ultra-large-scale data analysis has also brought new challenges to the data center network: traditional computer bus-based data center network technology can no longer meet the requirements of large model training.

Therefore, it is not only the hardware of computing clusters such as the Atlas 900 SuperCluster that needs to move to an integrated architecture. The data center network also needs a new architecture that breaks down the barriers between protocols, allowing "memory access" to reach storage and devices directly, and unifying high-speed interfaces on the chip side to break the "bandwidth wall" and allow ports to be reused.

Therefore, based on its technological layout in aspects such as sensing and connection, Huawei is able to cover more "edge" and "difficult" areas compared to pure cloud vendors in the deep waters of digitalization and intelligence in many industries.

03 Industry Legion, focusing on “blasting”

After the computing power and connection infrastructure are established, more precise application breakthroughs are needed to face thousands of industries.

"In the industry's intelligent 'rocket', data is the fuel, computing power is the engine, algorithms are the accelerator, and application deployment is the launcher," Wang Tao pointed out.

At the conference, Huawei released financial large model solutions, government large model solutions, smart factory solutions, new energy power prediction AI solutions, and more, covering nine major industries including finance, government affairs, manufacturing, electric power, and railways.

The industries Huawei currently covers reveal its thinking on intelligence: apart from finance and government, the other major industries are not highly digitalized. They are hard to break into with software alone, and the work has to start from the underlying infrastructure. This may also be the differentiated path Huawei wants to take.

The more digitalized an industry is, the faster it becomes intelligent. The financial industry is known as the "training ground" of digitalization, and people in the industry joke that "you will know whether a technology works once you take it onto the battlefield of the financial industry." This has led many companies to treat financial scenarios as their first stop. Huawei has been deeply involved in the government affairs industry for many years, and from the digital era to the intelligent era its accumulated experience is now paying off.

Excluding these two industries, Light Cone Intelligence found that the remaining industries are industries with extremely high entry barriers and are in the deep water zone of digitalization.

Take underground coal mining, exploration and development, highways, and railways: never mind digitalization, even basic informatization in these industries is very low. A person in the coal industry told Light Cone Intelligence: "Many mines are still at the most primitive stage. Once you go 700 meters down the shaft, the signal is lost and no one knows whether the workers are alive or dead, let alone collecting data."

There are many industries like coal mining, which shows that in many industries intelligence cannot be achieved with large models alone and has to start from the infrastructure layer.

Starting from scratch also means relying more on building the underlying infrastructure. Take the coal mining industry as an example: from 2G, 3G, and industrial Ethernet in the automation era to independent servers, Wi-Fi, 4G, and 5G in the information age, ICT facilities have been fundamental to its development at every stage. In the intelligent era, the demands on hardware and software have become more complex still, with technologies such as Wi-Fi 7, 5G, cloud, AI, full connection, full perception, and full computing all needed at once.

The coal mining industry also offers a glimpse of what characterizes the intelligent era: in the digital era, delivery took the form of software such as PaaS and SaaS, but in the intelligent era there is no way to stay "lightweight", and delivery must come with upgrades to hardware infrastructure. This is like the transition from traditional mobile phones to smartphones, and from fuel vehicles to smart vehicles.

Based on this idea, it seems possible to understand why Huawei pays more attention to the "software and hardware integration" delivery method. According to Huawei insiders, "Huawei hardly sells software alone. Enterprise BG uses hardware as the core to drive software sales, and Huawei Cloud Computing uses software as the core to drive hardware sales."

From digitalization to intelligence, Huawei's characteristic path has gradually become clear: starting with hardware, opening up the perception layer, and building intelligent solutions for various industries on this basis.

From the technology base to upper-level industry applications, Huawei has clearly taken a heavier approach to intelligence than other major cloud computing companies. When software and hardware are integrated, software can be copied in batches and delivered in a lightweight way, but hardware is often the opposite and depends on channels and services.

To make this integration of software and hardware work in its business, Huawei has built up a large force of industry legions.

On October 29, 2021, Huawei founder Ren Zhengfei personally conferred the flag, and the first batch of legions was established, which marked Huawei's "first shot" in entering the industry.

According to media reports, within a few months between 2021 and 2022, Huawei set up twenty legions in three batches.

Huawei's legions are divided into industry legions and product portfolio legions. The industry legions face industry customers directly and most of them belong to Enterprise BG, while the product portfolio legions sit under the R&D system. Whichever type they are, the goal is clear: they must face the market and find customers.

They are mainly responsible for two tasks. One is to embed themselves in the companies that purchase solutions, handle after-sales work, and solve problems with hardware installation, testing, upgrades, and maintenance. The other is to keep developing sales channels and use channel advantages to help Huawei deliver more solutions.

Compared with ordinary sales teams, Huawei's industry legions are more professional and go deeper. In an ordinary sales team, salespeople make up a large share, but Huawei's industry legions include scientists, R&D experts, service and delivery personnel, industry experts, and sales personnel, who can provide support both technically and commercially.

These industry legions also work closer to customers, often spending several months embedded in a city or a company. How many links are there in container terminal operations? What are the pain points in each link? Which AI technologies apply? Specifically, how are visual inspection and logistics scheduling technologies applied?

Questions like these have been answered again and again in the legions' field work. Huawei has relied on this to penetrate industries that others find hard to "gnaw" through.

According to Wang Tao, "Currently, through industry legion operations and extensive cooperation with industry partners, Huawei has created more than 200 intelligent solutions for more than 20 industries such as cities, finance, transportation, and manufacturing, and these have been applied in a series of intelligentization projects."

"Comprehensive intelligent strategy + technical base capabilities + industry legion 'blasting' capabilities", Huawei has launched a "combination punch" to face the new test of the intelligent era.

Welcome to follow Light Cone Intelligence and get more cutting-edge knowledge of science and technology!


Origin blog.csdn.net/GZZN2019/article/details/133190450