Liu Jian of Inspur Yunhai: "One cloud, multiple cores + cloud native" is the optimal solution for heterogeneous computing power

Guided by the technical principles of openness, compatibility, and hierarchical decoupling, Inspur Yunhai provides leading private cloud products and solutions to users across industries, helping enterprises build a solid cloud base and carry out digital transformation. Inspur Yunhai has accumulated rich practical experience in cloud native and cloud computing construction in the financial field. Its innovative, pragmatic approach and customer-centered service have been widely recognized by the industry and by customers.

 

Picture: Liu Jian, Director of Inspur Data Cloud Computing Program

This article is a transcript of a speech by Liu Jian, Director of Inspur Data Cloud Computing Program. It shares experience and perspectives in three parts: the development trend of cloud computing, the trends and challenges of financial cloud, and Inspur Yunhai's practical results in building cloud native infrastructure.

1. Cloud computing development trend: next generation cloud data center

Terms such as AI, 5G, and big data have become inseparable from cloud computing. In this environment, Inspur Yunhai believes that open hardware, open software, and hierarchical decoupling of software and hardware have become important trends in data center development. At the same time, to serve more diverse computing scenarios, the evolution of cloud computing architecture, model-as-a-service, and the ability to manage multiple kinds of computing power have become the directions in which next-generation cloud data centers are evolving.

Software and hardware synchronization optimization

To meet users' needs for product performance and user experience, Inspur Yunhai advocates decoupling software from hardware while also optimizing the two together. In cloud computing, deep hardware optimization can greatly improve product performance and offset the performance loss that containers and virtualization introduce at the software layer. DPUs can also be used to change the underlying high-availability architecture and implement the high-availability switchover logic for virtual machines.

Extensive software and hardware collaboration

Decoupling the computing system architecture and making computing devices interoperable are key to the sustainable development of future computing. At the network level, hardware SDN tends to be bound to specific network equipment. Once decoupled, the cloud's software SDN can work with network equipment from any manufacturer and, combined with GPU and smart network card acceleration, deliver both unbinding and better performance. At the storage level, the same method can accelerate the storage system. At the security level, DPUs can be used to enhance the computing system's load handling. In each case, hardware optimization is used to speed up the software.

Model as infrastructure

As ChatGPT continues to gain popularity, models have joined computing power and algorithms as standard infrastructure for the next generation of cloud. In terms of ownership, however, a model trained on a public cloud still belongs to that public cloud. Only by training on your own data in a private cloud can you obtain a large model that is exclusively yours. This is also the direction large domestic financial institutions are currently taking.

Diverse computing power

Data centers need to support emerging businesses across diverse computing power scenarios. At the physical resource level, today's data centers have more diversified infrastructure, and mixed deployment of X86 and ARM devices has become the norm. Users' demand for computing engines spans virtualization, bare metal, and containers, and this will remain the case for a long time. The types of computing power have also expanded from pure CPU to GPU and FPGA.

2. Financial cloud trends and challenges

Financial cloud development trends

The development of financial cloud can be roughly divided into three stages: the IOE era, the business cloud era, and the cloud native era. Cloud native capabilities can be summarized in three points: containerized applications, meshed services, and serverless. Financial IT essentially serves organizations and businesses, so changes in upper-level technology often stem from changes in organizational structure. The business architecture has evolved from monolithic to service-oriented to microservices; the data architecture has evolved from statistical analysis to data services on an integrated lake and warehouse; and the technical architecture has evolved accordingly toward a service mesh, ultimately changing the organizational structure and the development framework. At each stage the organizational structure and the technical structure need to match, which makes this a gradual evolutionary process.

Financial cloud construction challenges

Experience in building the IaaS layer is relatively widespread, but building the PaaS layer is more complicated. Our analysis is as follows:

  1. Both IaaS and PaaS have obvious industry attributes. If a cloud vendor sells the same PaaS to every customer, that PaaS carries no industry attributes. Each industry's business is different, so the PaaS indicators and components it needs are different too, and standardized products may not be applicable.
  2. Containers as a service. In many scenarios, users find PaaS difficult to use and inflexible because vendors bundle the PaaS infrastructure with the PaaS services. In extreme cases, each product comes with its own container platform. Inspur Yunhai's construction experience here is to decouple the functions of PaaS and offer the infrastructure as container-as-a-service.
  3. When building the business layer, we recommend that users build a unified cloud infrastructure, choose PaaS vendors whose services can be decoupled, and build those decoupled capabilities on a unified container-as-a-service platform.

Building a microservice system also involves the coexistence and gradual evolution of multiple architectures. Among microservice frameworks, Dubbo and Spring Cloud are currently popular for developing distributed applications and microservices and are widely used in the financial industry. Istio, meanwhile, is trending toward a platform-level service governance framework that enables non-intrusive microservice transformation of legacy systems. The choice of microservice architecture is inseparable from platform planning and company planning, and it too needs to evolve gradually.

For the operation and maintenance of microservices, we recommend connecting the IaaS layer and the PaaS layer to form a vertical operation and maintenance system. This raises two challenges. First, there are many PaaS component versions and development frameworks, and different PaaS components have different hardware resource requirements, which lowers deployment efficiency. Second, when business data networks are isolated from one another, PaaS must somehow be usable across domains. Where the network is partitioned, we recommend deploying and orchestrating on demand according to the requirements of each environment, and building a unified distribution network that connects the operation and maintenance side and distributes PaaS to the different domains over the operation and maintenance network. Nearby deployment, nearby access, and unified operation and maintenance then bring the PaaS layer together.
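The on-demand distribution described above can be sketched roughly as follows. This is only an illustration under assumptions: the component catalog, the domain names, the version strings, and the `*.paas.internal` naming scheme are all made up for the example and do not describe any Inspur Yunhai product interface.

```python
# Hypothetical component catalog held on the unified operation and maintenance side.
CATALOG = {"redis": "7.2", "kafka": "3.6", "mysql": "8.0"}

# Hypothetical isolated network domains and the components each one requests.
DOMAIN_NEEDS = {
    "production": ["redis", "mysql"],
    "testing": ["redis", "kafka"],
}


def build_distribution_plan(catalog, domain_needs):
    """Return {domain: {component: version}} with only what each domain asked for."""
    plan = {}
    for domain, components in domain_needs.items():
        unknown = [c for c in components if c not in catalog]
        if unknown:
            raise KeyError(f"{domain} requests unknown components: {unknown}")
        plan[domain] = {c: catalog[c] for c in components}
    return plan


def resolve_endpoint(plan, client_domain, component):
    """Nearby access: resolve only to an instance deployed in the caller's own domain."""
    if component not in plan.get(client_domain, {}):
        raise LookupError(f"{component} is not deployed in {client_domain}")
    return f"{component}.{client_domain}.paas.internal"  # made-up naming scheme


plan = build_distribution_plan(CATALOG, DOMAIN_NEEDS)
print(plan["production"])
print(resolve_endpoint(plan, "testing", "kafka"))
```

The catalog is managed once, on the operation and maintenance side, while each domain receives and resolves only its own local copies, so business traffic never has to cross a domain boundary.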

3. Inspur Yunhai Cloud Native Infrastructure Innovation and Practice

Inspur Yunhai is currently focused on building the bottom layer of cloud native infrastructure. With reference to the "Cloud Native Capability Maturity Standard" issued by the Academy of Information and Communications Technology, its work follows two directions: first, technical architecture, mainly resource management, operation and maintenance assurance, and R&D testing; second, business applications, mainly elasticity, high availability, automation, and observability.

“One cloud, multiple cores + cloud native”

In financial cloud construction practice, "one cloud, multiple cores" is an important baseline requirement for cloud in the financial industry. It meets users' diversified needs for computing power and effectively avoids computing power islands; it is the key link in breaking up small ecosystems and building a large one; and it effectively reduces supply chain risk. Whether from a business perspective, a technical perspective, or the perspective of the industry chain, implementing "one cloud, multiple cores" has become key to the current and future development of the cloud computing industry and is an inevitable choice for vendors along the industry chain. The energy and power industries have now also put forward clear requirements for one cloud, multiple cores.

Inspur Yunhai believes that "one cloud, multiple cores + cloud native" is the optimal solution to the problem of heterogeneous computing power. We have summarized the following practical experience for two kinds of workload, stateless and stateful:

  1. Stateless applications are mostly developed in Java, so compilation is not difficult. Once recompiled, they can run in multi-core clusters without many restrictions on the underlying CPU or server;
  2. For stateful data, the most important thing is to ensure that database data is not lost, so one cloud, multiple cores does not have to be pursued. Related attempts can still be made, such as deploying a distributed database in a one-cloud, multi-core environment: X86 computing power supports the primary cluster or write operations, and non-X86 computing power carries the standby cluster or read operations. This is one way to implement a one-cloud, multi-core database, and the same model can be applied to disaster recovery at the database level.
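The read/write split in point 2 can be illustrated with a minimal routing sketch. It assumes a hypothetical `DbNode` record and an `ArchAwareRouter` class invented for this example; the node names and architecture labels are placeholders, and a real distributed database would handle replication and failover itself.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class DbNode:
    name: str
    arch: str  # CPU architecture label, e.g. "x86_64" or "aarch64"
    role: str  # "primary" or "standby"


class ArchAwareRouter:
    """Send writes to the x86 primary and spread reads over non-x86 standbys."""

    def __init__(self, nodes):
        self.primaries = [n for n in nodes if n.role == "primary" and n.arch == "x86_64"]
        self.standbys = [n for n in nodes if n.role == "standby" and n.arch != "x86_64"]
        if not self.primaries:
            raise ValueError("at least one x86 primary node is required")
        # Fall back to the primary for reads when no standby exists.
        self._read_cycle = cycle(self.standbys or self.primaries)

    def route(self, operation):
        if operation == "write":
            return self.primaries[0]
        if operation == "read":
            return next(self._read_cycle)  # simple round-robin over standbys
        raise ValueError(f"unknown operation: {operation}")


# Hypothetical cluster: one x86 primary, two ARM standbys.
NODES = [
    DbNode("db-x86-1", "x86_64", "primary"),
    DbNode("db-arm-1", "aarch64", "standby"),
    DbNode("db-arm-2", "aarch64", "standby"),
]
```

Calling `ArchAwareRouter(NODES).route("write")` returns the x86 primary, while repeated `route("read")` calls alternate between the two ARM standbys.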

In addition to considering the business form, we also summarized the following construction points during practice:

  1. Automatic equal-cost scheduling of computing power within the cluster: computing power has to be converted between servers of different architectures. To address this, we have worked with the Academy of Information and Communications Technology and multiple manufacturers to test automatic equal-cost scheduling of computing power;
  2. Traffic switching: in practice, this can be achieved through gateway switching;
  3. Seamless switching: our products already have this capability. The architecture of the underlying infrastructure does not affect the operation of upper-layer business, and users can dynamically adjust and call on the underlying resources of different architectures.
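The computing power conversion behind point 1 can be sketched as below. The conversion factors are purely hypothetical placeholders (real values would come from benchmarking such as the joint testing mentioned above), and the placement rule is one simple choice for illustration, not the scheduler used in the product.

```python
from dataclasses import dataclass

# Hypothetical conversion factors, in percent of a "standard" core.
# 80 means one core of that architecture delivers 0.8 standard cores.
CONVERSION_PERCENT = {"x86_64": 100, "aarch64": 80}


@dataclass
class Node:
    name: str
    arch: str
    free_cores: int


def cores_needed(standard_units, arch):
    """Translate a request in standard cores into whole cores on `arch` (rounded up)."""
    pct = CONVERSION_PERCENT[arch]
    return (standard_units * 100 + pct - 1) // pct  # integer ceiling division


def place(standard_units, nodes):
    """Pick a node that can deliver the requested compute; return (name, cores) or None."""
    candidates = []
    for node in nodes:
        cores = cores_needed(standard_units, node.arch)
        if cores <= node.free_cores:
            candidates.append((node.free_cores - cores, node.name, cores))
    if not candidates:
        return None
    # Prefer the node left with the most free cores after placement.
    _, name, cores = max(candidates)
    return name, cores
```

With these assumed factors, a request for 4 standard cores becomes 4 cores on an X86 node or 5 cores on an ARM node, so the workload receives equivalent computing power whichever architecture it lands on.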

Construction of microservice architecture system

As mentioned above, the first issue in building a microservice architecture system is the coexistence of development frameworks. We recommend managing the various architectures in a unified way through the configuration center, extracting what they have in common first and then integrating further. The second issue is multiple data centers. Users now generally have several data centers, which can be managed through cascading in a headquarters-and-branch model.

High-availability design

High-availability design for the next generation of cloud is not difficult to implement at the cloud-native level, but not all businesses are cloud-native, which makes it hard to solve high availability from the upper layer alone. As an infrastructure vendor, we advocate building high availability into the underlying layer so that it can be achieved even without relying on cloud native. For the next generation of high-availability architecture, we therefore hope to be able to switch back and forth between these two dimensions, the cloud-native layer and the underlying infrastructure. We also hope to connect the control planes of bare metal and virtual machines to achieve high availability across multiple engines. Many challenges remain, and this still needs to evolve gradually.

To date, Inspur Yunhai has served more than 15,000 customers across industries, covering key fields such as finance, energy, transportation, medical care, enterprise, and education, and it is an important cloud base for customers' digital and intelligent transformation. Its projects include the largest financial production cloud in China, which carries customers' Double 11 business; the largest provincial government cloud with the most types of chips, which carries 4 sets of public application services and 104 business systems; and many other large-scale projects worth tens of millions of dollars in automobile, rail transit, and scientific computing laboratories.

(Reprinted from the Inspur Yunhai official account)

Origin blog.csdn.net/annawanglhong/article/details/132530183