Alibaba Cloud's Xiong Ying: Evolution and Practice of Edge Cloud Native Architecture Based on a Converged and Collaborative System

Cloud native and edge computing have been two of the hottest technical topics of the past two years. At the 10th Cloud Computing Standards and Application Conference, Alibaba Cloud senior technical expert Xiong Ying delivered a talk titled "Evolution and Practice of Edge Cloud Native Architecture Based on a Converged and Collaborative System". By walking through how Alibaba Cloud's system architecture has evolved in edge computing and edge cloud native, he hoped to share some thinking on where business stands in scenarios that combine cloud native with edge computing.

Overview

Edge computing has developed very rapidly in recent years, and its definition appears in various standards and materials. Here is a summary of several basic concepts.

【Origin】

The concept of edge computing can be traced back a long way, but its real rise is mainly driven by 5G. Just as 4G triggered the explosion of the mobile Internet, edge computing carries great expectations in the 5G era and is hoped to become a new industry track. On the other hand, the three major scenarios defined in the 3GPP 5G standard, large bandwidth, low latency, and massive connectivity, strengthen edge computing's application scenarios from every angle, and the resulting transformation of telecom infrastructure allows computing to sink further: from the Internet to the core network and on to the access network, computing is moving ever closer to users.

【Definition】

Operators, cloud service vendors, and hardware vendors define edge computing differently depending on their fields and perspectives. In Alibaba Cloud's edge cloud standard, edge cloud is defined as providing distributed, definable, schedulable, standard, open, and secure computing platforms and services at network nodes close to terminals (people and things). The goal is to expand the boundary of the cloud, bring computing and connectivity closer to things, and make it the cornerstone of the Internet of Everything.

【Features】

Compared with the central cloud, edge nodes are scattered and multi-level: there are many of them, each small in size, distributed not only at the regional level but also at the prefecture and campus level. In the 5G scenario they will sink even further, down to the access network, and the networks between cloud and edge, and between edge and edge, may all run over the public Internet.

【Challenges】

The massive, distributed, and heterogeneous nature of edge node resources brings huge challenges to the business: multiple network entry points make unified traffic monitoring and elastic scaling strategies unavailable; many small nodes mean the elasticity of a single cluster is weak even though the overall elasticity is strong; and managing massive numbers of nodes over public Internet links has a big impact on high availability, disaster recovery, migration, and so on.

Generally speaking, the "new infrastructure" initiative has brought extensive attention to 5G, the Internet of Things, and the Industrial Internet. The accelerating commercialization and industrialization of 5G has made the underlying infrastructure more mature, and this year has spawned a large number of new industries such as cloud applications, cloud gaming, interactive entertainment, and Industrial Internet 2.0, driving rapid change and evolution of the overall technical architecture.

Infrastructure evolution

First, let’s introduce the evolution of edge infrastructure. Alibaba Cloud defines three stages according to business forms:

The first stage is edge cloudification. At this stage, users simply migrate applications running on physical machines into a virtualized environment, driven mainly by cost reduction. Users no longer build nodes themselves; operation and maintenance of the underlying physical facilities is handed over to the edge cloud, while application development and O&M methods remain largely unchanged.

The second stage is edge cloud native. Users hope to further reduce total cost of ownership, improve system capabilities and R&D efficiency, and use standardized, automated methods to manage resources, deliver applications, and operate the system. They carry out in-depth development and customization on top of K8S, integrating edge resources, adapting to the characteristics of the edge, and building their own PaaS platforms on top for internal business use.

Everyone should be familiar with the first two stages, which mirror the evolution of the central cloud.

The third stage is edge-converged cloud native, a relatively new concept. This is a stage that Alibaba Cloud has explored and defined in practice, combined with thinking about user business.

To expand on this: the edge is characterized by distributed resources that are individually small but collectively large, and by complex network conditions, so users have to watch the stability of the infrastructure at all times and switch or migrate services and data themselves. Because single-node elasticity is weak, it is also hard for the business to consume resources on demand. In addition, integrating the various edge capabilities into the technical architecture requires users to go deep into K8S and have custom development capabilities. In short, users must perceive the underlying resources, the infrastructure, and even inventory, utilization levels, and capacity planning, so the technical challenges of sinking a business to the edge are considerable. What edge-converged cloud native brings users is that they no longer need to care about the underlying edge infrastructure, yet still enjoy elasticity, high availability, and on-demand use. Edge-converged cloud native should shield the edge characteristics of heterogeneous resources, multiple clusters, and inventory and utilization levels; consolidate and open up capabilities for resource scheduling, elastic scaling, and multi-level collaboration; use the good extensibility of cloud native to abstract and converge resources and capabilities; provide unified, standard interface encapsulation for both common and emerging business scenarios; and release these capabilities to users.
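As a thought experiment (not an actual Alibaba Cloud interface), the short Python sketch below contrasts the two programming models: in the edge cloud native stage the business still names a concrete cluster and node pool, whereas an edge-converged interface accepts only business-level intent and returns an opaque handle, leaving inventory, utilization, and placement entirely to the platform. All class and field names are hypothetical.

```python
from dataclasses import dataclass
import uuid

# --- Stage 2 style: the business still names concrete infrastructure ---------
def deploy_to_cluster(cluster: str, node_pool: str, image: str, replicas: int) -> str:
    """Caller must know which edge cluster and node pool to use, and keep watching them."""
    return f"{cluster}/{node_pool}/{image}x{replicas}"

# --- Edge-converged style: the business states intent only -------------------
@dataclass
class Intent:
    image: str
    replicas: int
    max_latency_ms: int        # end-user latency budget
    region_hint: str = ""      # optional coarse preference, never a node or cluster

class ConvergedEdge:
    """Facade: placement, inventory, and utilization stay behind this line."""
    def run(self, intent: Intent) -> str:
        # A real platform would schedule across many small clusters here;
        # the caller only ever sees an opaque workload handle.
        return f"workload-{uuid.uuid4().hex[:8]}"

handle = ConvergedEdge().run(Intent(image="demo:latest", replicas=60, max_latency_ms=50))
print(handle)
```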

System architecture evolution

In practice, the evolution of the technical architecture follows a layered design along the lines just described:

Infrastructure layer: heterogeneous resource management, a multi-level network architecture, and converged storage forms, solving the problems of integrating and managing underlying resources, integrated production, and abstraction and shielding (a small sketch of the unified-inventory idea follows this list);
Cloud-edge collaboration layer: computing, storage, and network flow capabilities, plus cloud-edge, edge-edge, and multi-cloud collaboration, solving the problem of coordinating the various capabilities and systems;
Platform engine layer: edge cloud native abstraction and convergence capabilities, solving the problem of integrating resources, components, applications, scheduling, and orchestration;
Business scenario layer: unified interfaces, accumulated business capability, and scenario deepening, closing the loop of the developer ecosystem.
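As a rough illustration of the infrastructure layer's "unified inventory and abstract shielding" responsibility, the sketch below aggregates node CPU capacity across several edge clusters using the Python `kubernetes` client. The kubeconfig context names are invented, and a real platform would maintain its own resource catalog rather than poll ad hoc; the point is simply presenting many small clusters as one converged pool.

```python
from collections import defaultdict
from kubernetes import client, config

# Hypothetical kubeconfig context names for a few small edge clusters.
EDGE_CONTEXTS = ["edge-hangzhou-01", "edge-shanghai-02", "edge-beijing-03"]

def cpu_cores(quantity: str) -> float:
    """Kubernetes reports CPU as whole cores ("8") or millicores ("7500m")."""
    return int(quantity[:-1]) / 1000 if quantity.endswith("m") else float(quantity)

def aggregate_inventory(contexts):
    """Collect per-cluster CPU capacity and expose the clusters as one pool."""
    pool = defaultdict(float)
    for ctx in contexts:
        api = config.new_client_from_config(context=ctx)  # one client per edge cluster
        for node in client.CoreV1Api(api).list_node().items:
            pool[ctx] += cpu_cores(node.status.capacity["cpu"])
    return dict(pool), sum(pool.values())

if __name__ == "__main__":
    per_cluster, total = aggregate_inventory(EDGE_CONTEXTS)
    print(f"converged pool: {total:.0f} CPUs across {len(per_cluster)} edge clusters")
```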

It can be expected that with the continuous evolution and improvement of 5G technology and infrastructure, as well as the development of innovative services, the system architecture will also evolve and change accordingly.

Knowledge gained on paper is always shallow; real understanding comes from practice. Next, Xiong Ying explained the capabilities and design of each layer by walking through Alibaba Cloud's actual business practices.

Application case: stateless applications

This scenario mainly targets task-based services (such as stress testing, dial testing, and offline transcoding) and peer-to-peer systems (P2P transmission networks). These services have high requirements for elastic scalability and are highly cost-sensitive, but their requirements for location and high availability are modest. Because a single edge node offers weak elasticity while the global pool of resources offers strong elasticity, this is a typical scenario that tests the capabilities of the edge computing infrastructure. Architecturally, it requires a unified inventory of global resources, integrated scheduling, and collaborative orchestration: in computing form, it must support virtual machines, containers, and secure containers to cover different scenarios; in resource inventory, it needs a converged resource pool; in scheduling and orchestration, it needs coordinated, unified scheduling. This provides elastic scaling and on-demand use in event-triggered and traffic-burst scenarios, which also greatly reduces users' costs.
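The sketch below illustrates, in a few lines of Python, the kind of burst handling this scenario calls for: the desired number of workers is derived from the pending task backlog, and the extra capacity is drawn from whichever edge clusters currently have spare resources, cheapest first. The cluster names, prices, and sizing rule are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class EdgeCluster:
    name: str
    spare_workers: int       # how many more task workers this cluster can host
    price_per_worker: float  # illustrative relative unit cost

def desired_workers(pending_tasks: int, tasks_per_worker: int = 20) -> int:
    """Event-triggered sizing: enough workers to drain the current backlog."""
    return -(-pending_tasks // tasks_per_worker)   # ceiling division

def place_burst(clusters: List[EdgeCluster], needed: int) -> Dict[str, int]:
    """Fill the burst from the cheapest clusters that still have spare capacity."""
    plan: Dict[str, int] = {}
    for c in sorted(clusters, key=lambda c: c.price_per_worker):
        if needed == 0:
            break
        take = min(c.spare_workers, needed)
        if take:
            plan[c.name] = take
            needed -= take
    if needed:
        raise RuntimeError("global pool exhausted; queue the remaining tasks")
    return plan

# Example: a transcoding backlog of 900 tasks arrives at once.
clusters = [
    EdgeCluster("edge-a", spare_workers=15, price_per_worker=1.0),
    EdgeCluster("edge-b", spare_workers=30, price_per_worker=0.8),
    EdgeCluster("edge-c", spare_workers=25, price_per_worker=1.2),
]
print(place_burst(clusters, desired_workers(900)))
```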

Application case: stateful applications

In this scenario, in addition to computing and elastic hosting, the business also hosts domain names and traffic scheduling. Moreover, as the business grows more complex, so does the architecture. First, within a single cluster the system has to be split into multiple independently running microservices; second, these microservices have orchestration dependencies on one another; third, collaborative communication is needed between cloud and edge (control plane and business) and between edge and edge (cluster and cluster); finally, there are integration requirements for general capabilities and components such as domain name and traffic scheduling, SLB, databases, and middleware. Seen this way, edge application scenarios are no less complex than applications in the central cloud, while adding the edge's distributed, multi-cluster, wide-area scheduling characteristics; "distributed cloud computing" is a more fitting description of this scenario.

How does the architecture meet these business needs? At the infrastructure layer, product capabilities include distributed SLB, distributed databases, and so on; on the network side, programmable, configurable cloud-edge and edge-edge overlay network capabilities are added; at the collaboration layer, cloud-edge collaboration, edge-edge collaboration, and dynamic balancing of traffic and resources are core capabilities; at the engine layer, cloud native capabilities must be deeply adapted to the edge, such as K8S multi-cluster federation for managing massive numbers of nodes and decomposing the business, Virtual Cluster capabilities for multi-tenant isolation, Service Mesh components that solve service discovery and collaborative communication in a microservice architecture, and CNI and CSI components adapted to edge virtual networks and virtual storage.
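As one small, hedged example of the cloud-edge collaboration described above, the Python sketch below lets a central control plane check whether the same microservice Deployment is healthy in every edge cluster, using one kubeconfig context per cluster. The context names, namespace, and Deployment name are placeholders; a production federation layer or Service Mesh control plane does far more, but the pattern of one cloud-side view over many edge clusters is the point.

```python
from kubernetes import client, config

# Placeholder identifiers -- adjust to your own kubeconfig and workload names.
EDGE_CONTEXTS = ["edge-cluster-a", "edge-cluster-b"]
NAMESPACE, DEPLOYMENT = "video-app", "stream-gateway"

def edge_rollout_status(contexts, namespace, name):
    """Return {cluster: (ready, desired)} for one Deployment across all edge clusters."""
    status = {}
    for ctx in contexts:
        apps = client.AppsV1Api(config.new_client_from_config(context=ctx))
        dep = apps.read_namespaced_deployment(name=name, namespace=namespace)
        ready = dep.status.ready_replicas or 0
        desired = dep.spec.replicas or 0
        status[ctx] = (ready, desired)
    return status

if __name__ == "__main__":
    for cluster, (ready, desired) in edge_rollout_status(
            EDGE_CONTEXTS, NAMESPACE, DEPLOYMENT).items():
        flag = "OK" if ready == desired and desired > 0 else "DEGRADED"
        print(f"{cluster}: {ready}/{desired} replicas ready [{flag}]")
```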

Xiong Ying: "There are not yet many standards or specifications around the concept of distributed cloud computing. Distributing a complex application from the center to the edge requires a great deal of architectural transformation and adaptation. This is also the direction of Alibaba Cloud's efforts: to accumulate more platform capabilities and form a closed development ecosystem loop, so that distributed cloud computing can land on the edge with ease."

Application case: terminal cloud

This business scenario has been very hot this year, typified by cloud gaming and cloud applications. Systems or applications that would otherwise run on the terminal are hosted in the cloud, reducing terminal cost and lowering the barrier to high-quality services. Within edge-converged cloud native, this brings a fundamental conceptual shift: from resource hosting and application hosting to device hosting and location-free hosting. At the infrastructure and engine layers, the various heterogeneous resources are first encapsulated and abstracted into a layer of unified, standard virtualized resources that provide security and isolation; at the business layer, another layer of encapsulation shields the resource attributes entirely, replacing the concept of resources with the concept of devices; meanwhile, collaborative computing, storage, and networking capabilities are added at the collaboration layer so that virtual devices can flow. The business no longer sees applications and resources in the traditional sense; it only sees the management and control of a virtual device, such as device data, device applications, and device scheduling.
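To illustrate the shift from "resources" to "devices", here is a purely hypothetical Python sketch of what a device-hosting interface could look like: the caller allocates a virtual device, pushes an application to it, and obtains a stream endpoint, without ever naming a node, cluster, or VM. None of these names correspond to a real SDK.

```python
from dataclasses import dataclass, field
from typing import Dict
import uuid

@dataclass
class VirtualDevice:
    """A cloud-hosted 'shadow terminal': the unit the business actually manages."""
    device_id: str
    profile: str                                          # e.g. "android-phone" or "game-console"
    apps: Dict[str, str] = field(default_factory=dict)    # app name -> version

class DeviceHosting:
    """Hypothetical device-hosting facade; placement and resources stay hidden."""
    def __init__(self):
        self._devices: Dict[str, VirtualDevice] = {}

    def allocate(self, profile: str) -> VirtualDevice:
        dev = VirtualDevice(device_id=f"vd-{uuid.uuid4().hex[:8]}", profile=profile)
        self._devices[dev.device_id] = dev
        return dev

    def install(self, device_id: str, app: str, version: str) -> None:
        self._devices[device_id].apps[app] = version

    def stream_url(self, device_id: str) -> str:
        # The platform picks an edge site near the user and returns only an endpoint.
        return f"wss://edge.example.com/stream/{device_id}"

# A cloud-gaming backend thinks in devices, never in clusters or VMs.
hosting = DeviceHosting()
pad = hosting.allocate(profile="game-console")
hosting.install(pad.device_id, app="racing-game", version="1.4.2")
print(hosting.stream_url(pad.device_id))
```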

Xiong Ying emphasized the concept of the digital twin here: in the era of the Internet of Everything, behind every physical terminal there will eventually be a shadow terminal on the edge cloud, either as a carrier of data or as an extension of the system.

Application case: ultra-HD video

This scenario is still at the stage of technical exploration; it is a scenario that truly sinks to 5G MEC nodes, and the hope is to create a replicable, general technical architecture model for the 5G field. The most important requirement is for the collaboration layer to interconnect with the operator's MEC system for resource coordination, traffic scheduling, and network offloading. In the 5G/MEC era, computing power keeps sinking toward the access network and MEC nodes, and general-purpose mechanisms such as DNS-based scheduling can no longer meet the need for precise scheduling. On the one hand, scheduling decisions need to be based on the terminal's precise geographic information; on the other hand, they must also follow the demands of the business scenario. For example, extremely latency-sensitive services such as positioning and AR/VR are placed in the access equipment room to meet real-time requirements; high-bandwidth services such as video analysis (which saves backhaul bandwidth) and latency-sensitive services such as cloud gaming are placed in the aggregation equipment room, balancing functionality and real-time needs; while heavy-compute or large-storage workloads are placed in the regional aggregation or core equipment room. This design of multi-level computing and multi-level networking makes the capabilities of the whole system more powerful and richer.
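The placement rules described above can be captured in a small, illustrative policy function. The tier names and thresholds below are invented for the sketch; the point is simply that scheduling now weighs both the latency budget and the compute/storage footprint of a workload rather than relying on DNS alone.

```python
from dataclasses import dataclass

# Multi-level 5G/MEC sites, nearest to the user first (illustrative names only).
TIERS = ["access-room", "aggregation-room", "regional-or-core-room"]

@dataclass
class Workload:
    name: str
    latency_budget_ms: int   # end-to-end latency the service can tolerate
    heavy_compute: bool      # large compute/storage footprint?

def choose_tier(w: Workload) -> str:
    """Map a workload to a MEC tier from its latency budget and resource weight."""
    if w.heavy_compute:
        return "regional-or-core-room"        # plenty of capacity, furthest away
    if w.latency_budget_ms <= 10:
        return "access-room"                  # e.g. positioning, AR/VR
    if w.latency_budget_ms <= 50:
        return "aggregation-room"             # e.g. video analysis, cloud gaming
    return "regional-or-core-room"

for w in [Workload("ar-vr", 8, False),
          Workload("cloud-game", 30, False),
          Workload("batch-render", 500, True)]:
    print(w.name, "->", choose_tier(w))
```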

Edge-converged cloud native aims to flexibly choose service deployment locations by scenario, balancing latency and computing requirements to meet the needs of different services. Of course, these capabilities should be encapsulated, abstracted, and offered to the upper layers, so that users and businesses are not exposed to the complexity of the underlying infrastructure.

Summary

In the 5G era, application scenarios such as terminal cloud, VR/AR, edge AI, the Industrial Internet, and smart agriculture will gradually explode. In some proprietary fields, heavyweight applications have already landed, but in the general Internet technology space, the true 5G killer application has not yet appeared, or rather, the technical architecture that truly combines 5G technology and infrastructure has yet to evolve. Xiong Ying looks forward to an edge computing platform that, through joint construction and cooperation, can integrate and schedule resources across the multi-level network and achieve genuine cloud-edge connection and collaboration, providing the industry, on the basis of cloud native technology, with open and standard cloud-edge collaboration and cloud-network convergence capabilities, so that more applications can easily sink to the edge and the era of the Internet of Everything can be realized.

 


This article is the original content of Alibaba Cloud and may not be reproduced without permission.
