Converged Architecture 3.0: The Key to Reinventing Computing Architecture

In 2019, Turing Award winners John Hennessy and David Patterson wrote a signed article, "A New Golden Age for Computer Architecture", published by ACM. It argued that computing architecture will usher in another golden decade of innovation, and that new architectural innovations will bring lower cost along with better performance, security and energy efficiency.

Indeed, as Moore's Law and Dennard scaling gradually slow down or even break down, the weaknesses of existing computer architectures are becoming increasingly prominent. Since 2016 in particular, deep learning, machine learning, and large models have made rapid progress. Demand for XPUs such as GPUs and NPUs is strong, and demand for computing power is becoming heterogeneous and diversified.

So, what are the key points and directions for reinventing computing architecture? Component-level architectural innovations such as XPUs are impressive, but innovations that take an overall view of computing power can ultimately bring more influential breakthroughs to the industry. Inspur Information, for example, has been exploring converged architecture since 2014, and its first two generations of converged architecture have played a key role in the development of the computing power industry.

Now, at OCP China Day 2023, Inspur Information has released the Converged Architecture 3.0 prototype system, bringing innovations at multiple levels, such as system-level fusion of diverse heterogeneous computing, cabinet-level decoupling and pooling, and asynchronous resource upgrades. It provides a new foundation for computing systems and adds a rich and colorful stroke to the new golden age of architecture.

Inspur Information Launches Converged Architecture 3.0 Prototype System

The data-centric era is coming

If data is the most important factor of production in this era, then AI is the best production tool to realize the value of data.

Today, the arrival of a data-centric era is accelerating. IDC's "Global Computing Power Index White Paper 203" holds that future infrastructure will be a data-centric computing architecture. This trend can also be seen in the development of artificial intelligence over the last ten years. The rise of deep learning, machine learning, and large language models (LLMs) has brought enormous new demands for computing power, which has had a huge impact on the traditional CPU-centric computing architecture.

The arrival of the AIGC wave has pushed data scale, parameter counts, and the scale of parallel processing to a higher level, and has made the "memory wall", "I/O wall" and "power wall" in the data center more prominent. For example, OpenAI's GPT-4 reportedly has an astonishing 1.8 trillion parameters and was trained on 13 trillion tokens, and a future GPT-5 may well reach 10 trillion parameters. This substantial increase in demand will also generate massive communication requirements, such as gradient aggregation and distribution, as well as extremely high energy consumption.
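
To give a rough sense of the "memory wall", the sketch below estimates how much memory the weights alone would occupy for the parameter count quoted above. The 2 bytes per parameter (16-bit weights) and the 80 GB per-accelerator capacity are illustrative assumptions, not figures from this article, and optimizer state and activations are ignored.

```python
def weight_memory_bytes(num_params: int, bytes_per_param: int = 2) -> int:
    """Memory needed to hold the model weights only (no optimizer state or activations)."""
    return num_params * bytes_per_param


def min_devices(total_bytes: int, device_capacity_bytes: int) -> int:
    """Smallest number of devices whose combined memory can hold total_bytes."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_bytes // device_capacity_bytes)


PARAMS = 1_800_000_000_000      # 1.8 trillion parameters, as quoted in the text
DEVICE_CAPACITY = 80 * 10**9    # assumed 80 GB per accelerator (hypothetical)

total = weight_memory_bytes(PARAMS)
print(f"weights: {total / 10**12:.1f} TB")                              # weights: 3.6 TB
print(f"devices for weights alone: {min_devices(total, DEVICE_CAPACITY)}")  # 45
```

Under these assumptions the weights alone exceed the memory of any single device many times over, which is why parameters have to be spread across devices and why inter-device communication grows with model size.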

In addition to the significant computing power challenges brought by AI, digital transformation across thousands of industries has entered deep water, and business scenarios spanning cloud, edge, and end devices are extremely rich and diverse. These scenarios share a distinctive feature: data is the core driver, but the demand for computing power varies widely. This places more complex and fine-grained requirements on the underlying computing infrastructure, while traditional computing architectures struggle to meet the deeper needs of digital transformation in terms of processing capacity, operation and maintenance management, and resource sharing.

Therefore, advancing computing architecture through system-level innovation is the most important direction at present. The overall resources of the infrastructure need to be decoupled and pooled, so that rich and diverse application requirements can be supported with finer-grained functional services. As Zhao Shuai, General Manager of Inspur Information's Server Product Line, put it: "The shortcomings of the current computing system architecture have gradually been magnified. Converged Architecture 3.0 is Inspur Information's exploration of overall innovation in computing system architecture."

Converged Architecture 3.0 Innovation Difficulties

As the famous saying from "The Mythical Man-Month" goes, "There is no silver bullet." Likewise, innovation in computing system architecture is a long-term process of continuous exploration and iteration. Only by steadily accumulating innovations can quantitative change turn into qualitative change; there are no shortcuts.

The nine-year history of Inspur Information's converged architecture illustrates this point well. In the Converged Architecture 1.0 period, Inspur Information mainly modularized non-IT resources, such as centralized power supply and heat dissipation. Converged Architecture 2.0 went further, pooling storage, network and other resources and using technologies such as virtualization and cloud computing to meet user needs. Converged Architecture 3.0 has made breakthrough progress: it fully decouples and pools core IT resources, including computing, storage, memory, and heterogeneous acceleration resources, and schedules those resources collaboratively and dynamically in a software-defined manner.

"The efficiency of the Converged Architecture 3.0 prototype system can be improved by one to two orders of magnitude compared with the previous generation of software virtualization systems, scalability can be increased by 2 to 4 times, and system latency can be reduced by 90%," Zhao Shuai said.

Decoupling and pooling all resources means that the former isolation between resources is broken. Through overall coordination and scheduling, applications can use resources on demand, which will undoubtedly allow performance, cost, and energy consumption to be fully optimized.

Zhao Shuai said frankly that Inspur Information encountered two major challenges when exploring Converged Architecture 3.0: one is memory pooling, and the other is the interconnection of the pooled system.

As is well known, memory decoupling and pooling have always been a difficult part of computing architecture innovation. Driven by large AI models and similar workloads, it has become the norm for multiple devices such as CPUs, GPUs, and FPGAs to use large-capacity memory, which in turn raises challenges such as cache coherence once memory resources are pooled. The approach of Converged Architecture 3.0 is to develop new memory modules and memory pooling systems that apply a serial cache-coherent bus and its switching technology, and to use CXL interconnect technology for high-speed interconnection between multiple devices. This provides low-latency access paths and cache coherence guarantees for large-scale memory expansion and memory resource pooling, meeting the requirements of resource sharing and efficient compute scheduling after memory is pooled.
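
The capacity-sharing side of memory pooling can be sketched as a toy model in which several hosts borrow from one shared pool. This is only an illustration: the class, the host names, and the sizes are hypothetical, and it does not model cache coherence, latency, or any real CXL interface.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MemoryPool:
    """Toy model of a shared memory pool that several hosts borrow capacity from."""
    capacity_gb: int
    allocations: Dict[str, int] = field(default_factory=dict)  # host -> GB borrowed

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - sum(self.allocations.values())

    def allocate(self, host: str, size_gb: int) -> bool:
        """Grant size_gb to host if the pool has room; return whether it succeeded."""
        if size_gb <= 0 or size_gb > self.free_gb:
            return False
        self.allocations[host] = self.allocations.get(host, 0) + size_gb
        return True

    def release(self, host: str) -> int:
        """Return everything host borrowed to the pool; report how much was freed."""
        return self.allocations.pop(host, 0)


pool = MemoryPool(capacity_gb=1024)
pool.allocate("cpu-node-1", 256)
pool.allocate("gpu-node-1", 512)
print(pool.free_gb)                        # 256
print(pool.allocate("fpga-node-1", 512))   # False: only 256 GB left
pool.release("gpu-node-1")
print(pool.allocate("fpga-node-1", 512))   # True: capacity was returned to the pool
```

The point of the sketch is that capacity released by one device becomes immediately available to another, instead of sitting idle inside a single server.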

Zhao Shuai said: "With CXL high-speed interconnect technology, remote memory can achieve latency similar to that of local memory. CXL has released version 3.0, and the data transmission rate has increased to 64 GT/s. As more AI-related processors are connected to the CXL switching network, the memory of the entire system can be shared globally at the hardware level, which will significantly alleviate the 'memory wall' problem in training large AI models."
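
As a back-of-the-envelope illustration of what 64 GT/s means for a link, the sketch below converts the per-lane transfer rate into raw one-direction bandwidth. The lane widths are assumptions chosen for illustration, and the figure deliberately ignores encoding, FLIT, and protocol overhead, so usable bandwidth would be lower.

```python
def raw_link_bandwidth_gbytes(transfer_rate_gt_s: float, lanes: int) -> float:
    """Raw one-direction bandwidth in GB/s.

    Assumes 1 bit per transfer per lane and ignores encoding,
    FLIT and protocol overhead.
    """
    return transfer_rate_gt_s * lanes / 8


for lanes in (4, 8, 16):  # assumed link widths
    bw = raw_link_bandwidth_gbytes(64, lanes)
    print(f"x{lanes}: {bw:.0f} GB/s per direction")
# x4: 32 GB/s per direction
# x8: 64 GB/s per direction
# x16: 128 GB/s per direction
```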

 

Cabinet-level decoupling and pooling means that data rates keep rising and system links become more complex. After pooling, interconnect design becomes extremely important: it is the key to coordinated resource scheduling and flexible on-demand use. Inspur Information's solution is to conduct high-precision fitting and simulation studies of the high-speed interconnection of complex links in Converged Architecture 3.0, and to accurately analyze the limits of the various topologies and transmission rates of the system's interconnect links. By exploring optical interconnect technology for the server's internal bus, it also extends link transmission distance and enables large-scale resource decoupling and pooling across the data center.

Wu An, Deputy General Manager of Inspur Information's Technology R&D Department, believes: "From a design point of view, Converged Architecture 3.0 follows three steps: first decoupling, then pooling the different resources, and finally reconstruction after pooling. Interconnection is at the core of this process. For example, after decoupling and pooling, timing, clock management, power supply management, and heat dissipation management all have to be coordinated and controlled; and when resources are reconfigured, logic units, pool management, and policy automation all need interconnect technology to coordinate them."
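
The three steps described here (decouple, pool, reconstruct) can be sketched in a few lines. This is a simplified illustration under assumed names and numbers, not Inspur Information's implementation; the server specs, resource types, and function names are all hypothetical.

```python
from collections import defaultdict

# Each server is described by the resources it physically contains (hypothetical units).
servers = {
    "server-a": {"cpu_cores": 64, "memory_gb": 512, "gpus": 0},
    "server-b": {"cpu_cores": 32, "memory_gb": 256, "gpus": 8},
}


def decouple(servers):
    """Step 1: break each server into independent (type, amount) resource records."""
    return [(kind, amount) for spec in servers.values() for kind, amount in spec.items()]


def pool(resources):
    """Step 2: merge records of the same type into one pool per resource type."""
    pools = defaultdict(int)
    for kind, amount in resources:
        pools[kind] += amount
    return dict(pools)


def recompose(pools, request):
    """Step 3: carve a logical machine out of the pools, or raise if any pool is short."""
    short = [kind for kind, amount in request.items() if pools.get(kind, 0) < amount]
    if short:
        raise ValueError(f"insufficient resources: {short}")
    for kind, amount in request.items():
        pools[kind] -= amount
    return dict(request)


pools = pool(decouple(servers))
print(pools)    # {'cpu_cores': 96, 'memory_gb': 768, 'gpus': 8}

# A logical machine with more memory than either physical server contains.
machine = recompose(pools, {"cpu_cores": 16, "memory_gb": 600, "gpus": 8})
print(pools)    # {'cpu_cores': 80, 'memory_gb': 168, 'gpus': 0}
```

The request for 600 GB of memory together with 8 GPUs could not be met by either physical server alone, which is the kind of flexibility that reconstruction after pooling is meant to provide.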

In fact, the emergence of Inspur Information's Converged Architecture 3.0 prototype system will gradually change how future computing power products are iterated. In the past, the update and iteration of computing power products such as servers was driven by processor updates. Now, the Converged Architecture 3.0 prototype system is expected to make data processing the true center of demand, with updates and iterations driven by users' business needs.

Wu An said: "Converged Architecture 3.0 offers a more imaginative, asynchronous way of iterating, because it no longer takes the CPU as the core but data processing as the core. For example, many users do not need to upgrade to DDR5 soon. Their business does not care about the bandwidth increase of DDR5, but hopes to take advantage of DDR4's latency, price and other advantages."

In "A New Golden Age for Computer Architecture", John Hennessy and David Patterson also argued that vertical integration will become extremely important for future computing architecture. The Converged Architecture 3.0 prototype system is clearly an important exploration of architectural innovation: building on breakthroughs at multiple technical points, it forms a system-level solution from an overall perspective.

"Converged Architecture 3.0 is currently a prototype system, and there will be more technical breakthroughs in the future, so that it can be deployed more effectively in practice," Zhao Shuai said.

The key to ushering in architectural reinvention

In recent years, the industry has repeatedly called for innovation in computing architecture. Among the many manufacturers, Inspur Information is one of the few with a clear roadmap and steady progress. With the release of the Converged Architecture 3.0 prototype system, Inspur Information, a leader in the computing power industry, is expected to use Converged Architecture 3.0 as a starting point to lead the entire industry into the golden age of computing architecture more quickly.

First, as a breakthrough in the exploration of computing system architecture, Converged Architecture 3.0 acts like an open ecosystem. It will greatly lower the threshold for integrating and adopting new technologies, and is expected to stimulate innovation across the entire computing power industry. For example, a key reason GPUs are popular for large AI models and other fields is the maturity of their ecosystem, such as tools. As large AI models are gradually integrated into all walks of life and generate large volumes of inference and training demand, more excellent new technologies can be introduced through Converged Architecture 3.0.

Second, the release of the Converged Architecture 3.0 prototype system is expected to accelerate the adoption of innovative technologies in the digital transformation of thousands of industries. At present, converged architecture is mainly used by Internet companies, but as digital transformation across industries enters deep water, the infrastructure challenges Internet companies face today may be the challenges other industries face tomorrow. Converged Architecture 3.0 can therefore help users in traditional industries close the technology gap quickly and accelerate their business transformation.

Undoubtedly, the next decade will be the "Cambrian" period of computing system architecture innovation, with innovations of all kinds emerging one after another. An exciting era has already begun. Inspur Information's Converged Architecture 3.0 prototype system is undoubtedly a bold innovation in computing architecture, with far-reaching significance for the industry.

"Looking to the future, enterprises' businesses will rely more and more on data and its value, and computing power technology also needs to keep evolving to help enterprises improve data processing efficiency and release as much of the value of data as possible," Zhao Shuai concluded.

Origin blog.csdn.net/dobigdata/article/details/132521312