How OpenYurt's "Zero-Intrusion" Approach Breaks Through the Difficulties of Cloud-Edge Fusion


Author | He Linbo
Source | Alibaba Cloud Native Official Account

With the development of industries and services such as 5G, IoT, live streaming, and CDN, more and more computing power and services are sinking closer to data sources or end users in order to achieve better response times and lower costs. This is a computing model clearly different from the traditional centralized model: edge computing.

However, as the scale and complexity of edge computing grow day by day, the shortage of operation and maintenance (O&M) methods and capabilities is becoming overwhelming. In this context, "integrated cloud-edge-device O&M coordination" has become an architectural consensus. Empowered by cloud native technology, the process of cloud-edge integration is also accelerating sharply. Under this trend, introducing cloud-native concepts and comprehensively transforming the O&M management model of edge applications have become urgent problems to solve.

This article is compiled from a live broadcast shared by the author, He Linbo (Xinsheng), Alibaba Cloud container service technical expert and one of the initiators of OpenYurt, on the "Tuesday Open Source Day" of the Alibaba Cloud Developer Community on January 26. Starting from real-world scenarios, it explores the challenges of integrating cloud native technology with edge computing, and introduces in detail the cloud native edge computing platform architecture and industry practices based on OpenYurt.

Click to see the full video: https://developer.aliyun.com/live/246066

What is Edge Computing

Edge computing is a computing model that deploys workloads at the edge, in contrast to traditional centralized general-purpose computing. In recent years edge computing has become very popular, mainly because services and scenarios such as 5G and IoT are developing faster and faster, and more and more intelligent terminal devices are appearing, which creates a growing demand for edge computing services to sink closer to the devices. If all processing is placed in the center, it is difficult to keep up with the growth of large-scale edge smart devices. Edge computing is already being used at scale in industries such as automobiles, transportation, and energy. In summary, edge computing brings computing closer to the user or the data source.


1. Edge computing top-level architecture

The layered architecture of edge computing as defined by the industry mainly follows Gartner and IDC.

The hierarchical structure defined by Gartner is shown in the following figure: Endpoint > Near Edge > Far Edge > Cloud > Enterprise.

(Figure: Gartner's layered edge computing model)

  • Near Edge: non-standard servers or devices, located closest to the end side.

  • Far Edge: standard IDCs, which can be divided into several types such as regional IDC, MEC, and CDN; their computing power is relatively strong, for example the machine rooms of telecom operators or the edge rooms of cloud service providers.

  • Cloud: public cloud or proprietary (dedicated) cloud services, characterized by centralized resources and centralized management.

The hierarchical structure defined by IDC is shown in the figure below:

(Figure: IDC's layered edge computing model)

  • Heavy Edge: data-center dimension; centralized computing platforms such as CDN and self-built IDC.

  • Light Edge: low-power computing platforms, suitable for IoT scenarios such as industrial control, data processing, and transmission.

As can be seen from the figures above, the Gartner and IDC definitions are actually interdependent and interrelated. In addition, edge computing and cloud computing are not substitutes for each other; they are complementary and interrelated.

2. Edge computing industry trends

The trends of the edge computing industry can be viewed from three dimensions: the business of the industry, the structure of the industry, and the scale and changes of the industry.

Trend 1: The integration of AI, IoT and edge computing

In recent years, edge computing has increasingly been combined with AI and IoT. As the number of edge smart devices grows, sending all data and video back to the cloud for processing is unacceptable in both cost and efficiency, so there is an increasing demand for AI and IoT processing close to the devices. For example, AI models are trained in the central cloud and then perform inference at the edge; there are many such forms. Industry surveys suggest:

  • By 2024, 50% of computer vision and speech recognition models will be running at the edge.

  • By 2023, nearly 20% of the servers used to handle AI workloads will be deployed at the edge.

  • By 2023, 70% of China's IoT projects will include AI functions, pursuing real-time performance, reduced bandwidth, and data compliance.

  • By 2023, 75% of Chinese enterprises will process IoT data at the edge of the network.


Trend 2: Cloud extension, IT decentralization, facility autonomy, edge hosting

Edge computing and cloud computing are complementary and interdependent. Going a step further, edge computing is actually an extension of cloud computing to the edge, extending some capabilities of the cloud outward. On one hand, IT services need to be decentralized to the edge side; on the other hand, edge services and facilities need autonomy, so that when the network between cloud and edge is disconnected there is still a certain degree of local control; on top of that comes edge hosting capability. The future architecture will evolve along the route of cloud extension, IT decentralization, facility autonomy, and edge hosting:

  • Hybrid cloud: By 2023, 10% of enterprise workloads will run in local data centers and on edge resources.

  • Decentralization: By 2023, more than 30% of new infrastructure will be deployed at the edge.

  • Facility autonomy: By 2024, 50% of core enterprise data centers and 75% of major edge IT sites will change the way operations and maintenance are carried out.

  • Edge hosting: By 2022, 50% of companies will rely on hosting services to improve the performance and return on investment of edge AI.


Trend 3: 5G and edge computing ignite new growth

In recent years, the rapid development of 5G has become a new growth tipping point for edge computing. It is estimated that by 2024 the number of edge applications will increase by 800%; one can imagine what kind of growth this industry will see. Typical application scenarios include the Internet of Vehicles (autonomous driving / vehicle-road collaboration), smart grid (equipment inspection / precise load control), industrial production control, and smart healthcare (remote ultrasound / remote consultation).


3. Current status of edge computing

The edge cloud drives a rapid increase in management complexity

As the form, scale, and complexity of edge computing keep increasing, O&M methods and capabilities in the edge computing field find it more and more difficult to keep up with the speed of edge business innovation; and the fact that enterprises will fully pursue "hyper-scale, hyper-speed, and hyper-connectivity" in the future further aggravates the complexity of O&M management. The edge cloud drives a rapid increase in management complexity, mainly in the following four aspects:

  • First, the number of Internet-connected smart terminal devices has increased dramatically, and the demand for data and business sinking keeps growing.

  • Second, the scale and complexity of edge computing keep increasing, and new services such as edge intelligence, edge real-time computing, and edge analysis continue to emerge. The centralized storage and computing model of traditional cloud computing centers can no longer meet edge devices' needs for timeliness, capacity, and computing power.

  • Third, cloud-edge-device collaboration is difficult: there is a lack of unified standards for delivery, O&M, and management and control, and controlling the security risks of edge services and edge data is hard.

  • Fourth, heterogeneous resources are difficult to support: different hardware architectures, hardware specifications, and communication protocols must be supported, and providing standardized, unified service capabilities on top of differences in heterogeneous resources, networks, and scale is challenging.

Cloud-Edge Integration: Edge Cloud Native

1. What is cloud native?

Definition of cloud native: cloud native is an open and standard technology system. Based on the cloud-native technology system, a set of highly elastic, fault-tolerant, and easy-to-manage business systems can be constructed and run agilely for users. The system covers many popular technologies, such as Serverless, Kubernetes, containers, and Docker, which are widely used in the industry.


Major cloud vendors and cloud service providers are now investing heavily in cloud native, and cloud native has increasingly become the entry point for users to consume cloud computing capabilities. The cloud native technology system can help enterprises make maximum use of cloud capabilities and maximize the value of the cloud.

2. Rich cloud native product family

Taking Alibaba Cloud as an example, the cloud native product family is mainly divided into three parts, as shown in the following figure:

(Figure: Alibaba Cloud's cloud native product family)

  • The first part is the new application workloads, including data & intelligence, distributed applications, and DevOps, all of which are now carried natively on the cloud.

  • The second part is the new business system, including Serverless and container orchestration.

  • The third part is the new resource carrying system, including public cloud, proprietary cloud, and edge cloud.

3. Cloud-Edge Integrated Cloud Native Infrastructure

The cloud-edge integrated cloud-native infrastructure is a cloud-native system with central (cloud) management and edge autonomy, as shown below:

(Figure: cloud-edge integrated cloud-native infrastructure)

On the center side, native cloud management and productization capabilities can be provided, for example Kubernetes plus storage, AI, or big data. These central capabilities sink to edge computing through the management and control channel, for example to standardized CDN infrastructure and ENS, or to device gateways for smart factories, smart parks, buildings, airports, and so on, as on the right side of the figure above. At the edge, various devices such as sensors, cameras, and controllers can be connected, and various communication protocols can be supported for device access. In this way, an integrated cloud-native infrastructure is formed.

Cloud computing is good at non-real-time, long-cycle data processing and analysis that requires massive, scalable storage; edge computing, which grew out of cloud computing, is good at real-time processing and analysis of local, short-cycle data. The relationship between cloud computing and edge computing is not substitution but collaboration; only when the two are closely integrated can various demand scenarios be better matched.

4. The Value of Cloud-Edge Integration

The concept of cloud native was first proposed in 2013. After several years of development, especially since Google took the lead in establishing the CNCF in 2015, cloud native technology has entered the public eye and gradually evolved to include a series of technologies, practices, and methodologies such as DevOps, continuous delivery, microservices, containers, infrastructure, Serverless, and FaaS. Cloud native accelerates the integration of multi-cloud and cloud-edge. The value of cloud-edge integration is:

  • First, it can provide users with the same functions and experience on any infrastructure as on the cloud, realizing cloud-edge integrated applications.

  • Second, the isolation of containers, together with system capabilities such as flow control and network policy, can be used to ensure the security of services running at the edge.

  • Third, through containerization and the decoupling of containers from resources, heterogeneous resources can be well supported.


5. Difficulties in the integration of cloud native and edge computing

As noted earlier, with the growing form, scale, and complexity of edge computing, O&M methods and capabilities in the edge computing field are finding it more and more difficult to keep up with the speed of edge business innovation, and enterprises' future pursuit of "hyper-scale, hyper-speed, and hyper-connectivity" further aggravates the complexity of O&M management.

What problems must the integration of cloud native and edge computing solve? In practice, the following four key points have been summarized:

The first point: the scale and business of edge computing are complex. Edge resources are scattered across different regions, and the lifecycle management, upgrade, and scaling of edge applications in each region, as well as keeping traffic closed within each region, all face challenges.

For example, in a CDN scenario there may be hundreds of machine rooms across the country; the resources and machines in each room may differ, and the traffic carried by the services running on them may differ as well. Managing this with native Kubernetes workloads is insufficient: it poses a very big challenge, is error-prone, and makes overall O&M efficiency very low.

The second point: the cloud-edge network connection is unreliable. Usually the cloud and the edge are connected over the public network, and under the influence of various factors the network between them may be disconnected, which poses a great challenge to the continuous operation of edge services. When the network is disconnected, the node falls out of cloud control, and under native K8s its Pods will be evicted. In practice, however, whether the business restarts or the machine restarts, edge services must be able to keep running. Therefore the edge needs a certain degree of autonomy.

The third point: cloud-edge-device O&M collaboration is difficult. Because edge machines are deployed behind the user's firewall, they cannot be reached actively from the public network, so native K8s O&M capabilities that need to pull data from the center cannot be used. This results in a lack of unified delivery, O&M, and management standards, and makes security risk control of edge services and edge data difficult.

The fourth point: heterogeneous resources are difficult to support. Different hardware architectures, hardware specifications, and communication protocols must be supported, and providing standardized, unified capabilities on top of differences in heterogeneous resources, networks, and scale is challenging.

OpenYurt Edge Computing Cloud Native Platform

The CNCF edge cloud project OpenYurt is an intelligent open platform that extends native Kubernetes to edge computing.

1. Edge autonomy, central (cloud) management and control

(Figure: OpenYurt cloud-edge integrated architecture)

The OpenYurt architecture is a very concise cloud-edge integrated architecture. As shown in the figure above, the cloud side contains blue and orange parts: the blue parts are native K8s components and the orange parts are OpenYurt components. Likewise, each edge node has blue and orange parts: the blue parts are native K8s or other cloud-native components, and the orange parts are OpenYurt components.

As you can see, OpenYurt makes zero modifications to native K8s and the cloud-native architecture; it is non-intrusive. The OpenYurt project is the industry's first non-intrusive cloud native edge computing platform that enhances K8s. Other cloud-native edge computing projects more or less modify or tailor K8s; this is the biggest difference of OpenYurt and is what guarantees its standardization.

  • OpenYurt can keep up with the pace of Kubernetes version upgrades.

  • Thanks to the non-intrusive design, OpenYurt can evolve in step with mainstream cloud native technologies such as Service Mesh and Serverless.

OpenYurt entered the CNCF Sandbox in September 2020. It is a very neutral project, both in its technology and architecture and in how the project is operated.

The quality and stability of OpenYurt are also guaranteed: it is widely used within Alibaba Group and manages millions of CPU cores.


2. How OpenYurt solves the difficulties of integrating cloud native and edge computing


  • First, edge unitization. Because edge units are relatively scattered in large-scale businesses, edge unitization enables per-unit management of the business and closed-loop traffic management within each unit.

  • Second, edge autonomy. To cope with the unreliability of the cloud-edge network, an autonomy capability is added at the edge so that the business keeps running even when the cloud-edge network is disconnected.

  • Third, seamless conversion. The main purpose is to lower the threshold of using OpenYurt. The seamless conversion capability enables one-click switching between K8s and OpenYurt clusters: a single command converts a standard K8s cluster into an OpenYurt cluster, and the reverse conversion is also possible. This is also an industry first.

  • Fourth, cloud-edge collaboration. It solves the problem of the cloud actively accessing the edge and provides cloud-edge collaboration capabilities for O&M.

Each capability is introduced in detail below.

1) Unitization capability

OpenYurt provides application-model capabilities for edge scenarios, mainly including the following three points:

  • NodePool enables unitized batch management of nodes.

  • Traffic management supports topology-aware traffic management for native Services.

  • UnitedDeployment provides a unitized deployment model for native applications.

(Figure: unitized management with node pools and UnitedDeployment)

Unitization mainly provides application-model capabilities for edge scenarios. With node pools, the nodes in each region can be managed as one pool. As shown in the figure above, if edge unit one is a machine room in Beijing, its nodes can all be placed in the "Beijing" pool, and labels and other attributes can be managed uniformly for this batch of nodes, which makes managing and operating the common characteristics of the same batch of machines very convenient. The UnitedDeployment resource builds on node pools and deploys and manages business workloads with the node pool as the unit.

Following the example above, unit one deploys two instances and unit two deploys three. After the configuration is submitted to the OpenYurt cluster, the deployment information is automatically delivered to the edge and the corresponding number of instances is started. This solves the problem of each unit being operated independently during unit management: UnitedDeployment can manage all units from a unified perspective.
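As a concrete sketch of this model, the manifests below create a "beijing" node pool and a UnitedDeployment placing two instances in it and three in a hypothetical "hangzhou" pool. The API group and field names follow the apps.openyurt.io/v1alpha1 CRDs as we understand them and may differ between OpenYurt releases; the pool names and image are made up for illustration.

```yaml
# Illustrative only; verify the CRD schema against your OpenYurt version.
apiVersion: apps.openyurt.io/v1alpha1
kind: NodePool
metadata:
  name: beijing
spec:
  type: Edge
  selector:
    matchLabels:
      apps.openyurt.io/nodepool: beijing
---
apiVersion: apps.openyurt.io/v1alpha1
kind: UnitedDeployment
metadata:
  name: demo
spec:
  selector:
    matchLabels:
      app: demo
  workloadTemplate:
    deploymentTemplate:
      metadata:
        labels:
          app: demo
      spec:
        selector:
          matchLabels:
            app: demo
        template:
          metadata:
            labels:
              app: demo
          spec:
            containers:
            - name: demo
              image: nginx:1.19        # placeholder workload
  topology:
    pools:
    - name: beijing                    # unit one: two instances
      nodeSelectorTerm:
        matchExpressions:
        - key: apps.openyurt.io/nodepool
          operator: In
          values: ["beijing"]
      replicas: 2
    - name: hangzhou                   # unit two: three instances
      nodeSelectorTerm:
        matchExpressions:
        - key: apps.openyurt.io/nodepool
          operator: In
          values: ["hangzhou"]
      replicas: 3
```

For the traffic closed-loop, in the OpenYurt releases we have seen this is configured on a native Service via the annotation `openyurt.io/topologyKeys: openyurt.io/nodepool`, which restricts endpoints to the requester's node pool; check the documentation of your version.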

2) Edge autonomy

Edge autonomy escorts the continuous operation of edge businesses and involves the following two points:

  • YurtHub caches cloud data locally. When the cloud is disconnected, all system components obtain data from YurtHub.

  • Yurt-Controller-Manager prevents edge services from being evicted because of unstable cloud-edge network connections.

(Figure: YurtHub request flow on an edge node)

The edge autonomy capability guarantees the continuous operation of edge businesses even when the cloud-edge network is disconnected. Two components are mainly involved: YurtHub and Yurt-Controller-Manager.

YurtHub is deployed on every edge node in containerized form. The figure above shows the processing flow: native components such as Kubelet, KubeProxy, and Flannel, which previously connected directly to the cloud APIServer, now connect to YurtHub first, and YurtHub forwards their requests to the APIServer.

The advantage is as follows. When a request arrives, a health-check module detects the connectivity of the cloud-edge network. If the network is normal, the request goes to the load-balancing module, which selects a cloud server, forwards the request, and returns the result; the result is returned to the requester, and a copy of the result data is cached and persisted on the local disk.

If the cloud-edge network is disconnected and the node or business needs to restart, YurtHub can extract the cached data from the local store through the local proxy and return it to the requester, thereby restoring the edge business and ensuring it runs continuously.
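For reference, YurtHub typically runs as a static pod on each edge node, and the node's components are pointed at YurtHub's local endpoint instead of the remote APIServer. The sketch below is illustrative: the image tag, flags, and paths are assumptions based on early OpenYurt setups and may differ between releases.

```yaml
# /etc/kubernetes/manifests/yurt-hub.yaml (illustrative sketch)
apiVersion: v1
kind: Pod
metadata:
  name: yurt-hub
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: yurt-hub
    image: openyurt/yurthub:latest               # image tag assumed
    command:
    - yurthub
    - --v=2
    - --server-addr=https://<apiserver-address>:6443   # the cloud APIServer
    - --node-name=$(NODE_NAME)
    env:
    - name: NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
    volumeMounts:
    - name: hub-dir
      mountPath: /var/lib/yurthub                # local cache persisted here
  volumes:
  - name: hub-dir
    hostPath:
      path: /var/lib/yurthub
      type: DirectoryOrCreate
```

Kubelet and the other node components are then given a kubeconfig whose server is YurtHub's local address (commonly `http://127.0.0.1:10261` in the versions we have seen), so that when the cloud is unreachable their reads are served from the local cache.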

3) Seamless conversion capability

The seamless conversion capability is implemented by the yurtctl convert component. It is mainly used for one-click conversion between standard K8s and OpenYurt clusters, and currently supports clusters deployed by tools such as minikube, kubeadm, and ACK.

During conversion, because a cluster has many nodes and each needs to become an edge node, operations such as installing the YurtHub static pod and modifying Kubelet parameters are pushed down to each edge node. As shown below:

(Figure: seamless cluster conversion with yurtctl)

Through the BroadcastJob of OpenKruise, another cloud native open source project from Alibaba, a Job pod is run on every node to complete the per-node conversion. At present, the Yurtctl tool has been verified fairly completely on clusters deployed by minikube, kubeadm, and ACK; more cluster types will be supported in the future, and interested students are welcome to contribute to the community.
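The run-once-per-node mechanism can be sketched with an OpenKruise BroadcastJob like the one below. The container name, image, and command are placeholders; the actual convert job run by yurtctl differs (it mounts host paths to install the YurtHub static pod and adjust the Kubelet), so treat this only as an illustration of the BroadcastJob mechanism.

```yaml
# Illustrative BroadcastJob: one pod runs to completion on every node.
apiVersion: apps.kruise.io/v1alpha1
kind: BroadcastJob
metadata:
  name: convert-edge-nodes
spec:
  template:
    spec:
      containers:
      - name: node-servant                      # name and image are made up
        image: openyurt/node-servant:latest
        command: ["sh", "-c", "echo converting $(NODE_NAME)"]
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
      restartPolicy: Never                      # each pod runs once per node
  completionPolicy:
    type: Always                                # done when every node has run it
```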

4) Cloud-side collaboration capabilities

As shown below:

(Figure: Yurt-Tunnel cloud-edge O&M channel)

The Yurt-Tunnel-Server component is deployed in the cloud, and a Yurt-Tunnel-Agent is deployed on each edge node. When the Yurt-Tunnel-Agent starts, its internal ANP proxy-agent module establishes a mutually authenticated, encrypted channel over the public network with the ANP proxy-server module; this channel uses the gRPC protocol.

After the channel is established, when the cloud accesses a node, the iptables manager in Yurt-Tunnel-Server redirects the traffic of cloud-to-node requests to the Yurt-Tunnel-Server. The request interceptor module intercepts each request, converts it into the gRPC format, and forwards it to the Tunnel-Agent at the edge, which in turn forwards the request to the Kubelet or pod. In this way, the cloud-edge O&M collaboration channel is opened up, and native Kubernetes O&M capabilities can run on OpenYurt clusters in cloud-edge scenarios without any changes. In addition, because the cloud-edge O&M channel is based on gRPC, tunnel bandwidth can be compressed, greatly reducing cost: traffic can be reduced by up to 40% compared with native TCP communication.

OpenYurt case introduction

Case 1: Edge AI

The first case is an edge AI scenario: the new offline retail business of Hema Fresh.

Based on Alibaba Cloud's container service ACK@Edge, Hema Fresh carried out a cloud-native transformation and upgrade, building a "people-goods-place" digitalized, full-link, cloud-edge-device integrated and collaborative "Sky Eye" AI system. There is a management and control plane in the cloud; then, at the edge close to the stores, ENS node services are purchased so that stores do not have to build their own machine rooms. Using the cloud-edge integrated system, the access-control and modeling systems are deployed to the ENS services at the edge; the in-store surveillance video streams are pushed to these services and analyzed there, and the computed results are returned to the cloud business system for further analysis.

In this cloud + edge form, the business architecture consisting of the cloud Sky Eye system, Alibaba Cloud edge convergence node ENS, and the physical stores of Hema is realized at low cost, with strong elasticity, hybrid resource management capabilities, and cloud-native O&M capabilities. The efficiency of opening new stores increased by 70%, resource costs were cut by more than 50%, and computing power is shared. As shown below:

(Figure: Hema Fresh cloud-edge-device integrated architecture)

Case 2: Video on the cloud

Cloud-based video services are now used widely across the country, as shown in the figure:

(Figure: cloud-based video service architecture)

From bottom to top: on highways, for example, lightweight or standard gateways capture video streams. After capture, the videos are transmitted to ENS or CDN servers closer to the edge for video surveillance, for example processing in provincial-, city-, and county-level machine rooms. After video management, aggregation, and forwarding, the final results are uploaded to the cloud management and control platform in the central cloud, where much further processing can be done, such as publishing events to AutoNavi Maps or sending information notifications, forming a cloud-edge integrated service management platform.

The cloud-edge integrated service management platform covers application deployment/configuration, device/application status, and uploading structured data to the cloud, greatly improving overall O&M and management efficiency.

That concludes this sharing about OpenYurt. If you are interested in OpenYurt, you are welcome to join our community exchange group and to visit the OpenYurt official website and GitHub project.


Origin: blog.51cto.com/13778063/2637501