A decisive battle between Serverless and containers around the corner? Autoscaling changes the picture

Author | Mo Yuan, container technology expert at Alibaba Cloud

This article is based on Mo Yuan's talk at the K8s & cloudnative meetup in Shenzhen on August 31. Follow the "Alibaba Cloud native" official account and reply with the keyword "data" to get the PPT collection from the 2019 meetup series and a complete K8s knowledge map.

Overview: Serverless and autoscaling have drawn a lot of attention from developers in recent years. Some say that Serverless is Container 2.0, and that one day containers and Serverless will fight a decisive battle from which only one winner emerges. In fact, containers and Serverless can coexist and complement each other, especially in autoscaling scenarios: Serverless fits containers very well and makes up for their shortcomings in ease of use, speed, and cost. This article introduces the principles, solutions, and challenges of autoscaling in container scenarios, and how Serverless containers help solve these problems.

What we talk about when we talk about "autoscaling"

What exactly are we talking about when we talk about "autoscaling"? The term means something different to each role on a team, and that is part of its appeal.

Let's start with a resource diagram

This diagram is often cited when explaining autoscaling. It shows the relationship between the actual resource capacity of a cluster and the capacity required by its applications.

  • The red curve shows the capacity the applications actually require. Because the resources requested by an individual application are much smaller than those of a node, the curve is relatively smooth;
  • The green polyline shows the actual resource capacity of the cluster. Each inflection point corresponds to a manual capacity adjustment, such as adding or removing a node. Because a single node provides a fixed and comparatively large amount of resources, the capacity changes in coarse, step-like jumps.


First, look at the yellow grid area on the left. In this region the cluster's capacity cannot meet the capacity the business requires. In practice this usually shows up as Pods that cannot be scheduled because of insufficient resources.

In the middle grid area, the cluster's capacity is far higher than what the workloads actually need, so resources are wasted. This typically shows up as an uneven load distribution across nodes: some nodes have no load scheduled onto them at all, while others carry a relatively high load.

The grid area on the right represents a sudden capacity spike; notice how steeply the curve climbs toward the peak. This is usually caused by a traffic surge or by a high-volume task outside the regular capacity plan. Such spikes leave operations engineers very little time to react, and mishandling them can lead to an incident.

Autoscaling means different things to people in different roles:

  • developers want autoscaling to protect the high availability of their applications;
  • operations engineers want autoscaling to reduce the cost of managing infrastructure;
  • architects want autoscaling to give the architecture the resilience to absorb unexpected traffic spikes.

There are many different autoscaling components and solutions. Choosing the one that fits your business needs is the first step before putting anything into practice.

Understanding autoscaling in Kubernetes

Kubernetes autoscaling components


Kubernetes autoscaling components can be understood along two dimensions: the scaling direction and the scaling object.

By direction, scaling is either horizontal or vertical. By object, it targets either nodes or Pods. Expanding this two-by-two matrix yields the following three classes of components:

  1. cluster-autoscaler: horizontal scaling of nodes;
  2. HPA & cluster-proportional-autoscaler: horizontal scaling of Pods;
  3. vertical-pod-autoscaler & addon-resizer: vertical scaling of Pods.

Among these, HPA and cluster-autoscaler are the combination developers use most often: HPA handles the horizontal scaling of Pods, while cluster-autoscaler handles the horizontal scaling of nodes. Many developers ask the same question: why does autoscaling have to be broken down into so many separate components? Why not simply set one threshold and let the cluster manage its own resource water level automatically?
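To make that combination concrete, here is a minimal sketch of the HPA half created with the official Kubernetes Python client; the Deployment name, namespace, and CPU threshold are placeholders rather than values from the talk.

```python
# Minimal sketch: a CPU-based HorizontalPodAutoscaler created with the
# official Kubernetes Python client. Names and thresholds are placeholders.
from kubernetes import client, config

config.load_kube_config()  # use config.load_incluster_config() inside a Pod

hpa = client.V1HorizontalPodAutoscaler(
    api_version="autoscaling/v1",
    kind="HorizontalPodAutoscaler",
    metadata=client.V1ObjectMeta(name="my-app-hpa", namespace="default"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="my-app"),
        min_replicas=2,
        max_replicas=10,
        target_cpu_utilization_percentage=70,  # scale out above 70% average CPU
    ),
)

client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa)
```

cluster-autoscaler sits below this: it only reacts when the Pods the HPA creates can no longer be scheduled.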

The challenges of autoscaling in Kubernetes


Understanding how Kubernetes schedules workloads helps developers better understand the design philosophy behind its autoscaling. In Kubernetes, the smallest unit of scheduling is the Pod. A Pod is scheduled onto a node that satisfies its scheduling constraints, which include resource matching, affinity and anti-affinity rules, and so on. Among these, resource matching is the core calculation in scheduling.

Four resource-related concepts usually come up (a small example follows the list):

  • Capacity: the total amount of resources a node can allocate;
  • Limit: the maximum amount of resources a Pod is allowed to use;
  • Request: the amount of resources a Pod occupies for scheduling purposes;
  • Used: the amount of resources a Pod actually uses.
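A small, self-contained sketch (all numbers made up) shows how the four concepts relate: the scheduler fits a Pod by its Request against the node's Capacity, regardless of how much is actually Used.

```python
# Toy illustration of Capacity / Request / Limit / Used (numbers are made up).
node_capacity = {"cpu_m": 4000, "mem_mi": 8192}       # Capacity of a 4C8G node
scheduled_requests = {"cpu_m": 3000, "mem_mi": 4096}  # sum of Requests already placed

pod = {
    "request": {"cpu_m": 500,  "mem_mi": 512},   # Request: what scheduling reserves
    "limit":   {"cpu_m": 1000, "mem_mi": 1024},  # Limit: hard ceiling at runtime
    "used":    {"cpu_m": 120,  "mem_mi": 300},   # Used: actual consumption right now
}

def fits(capacity, placed, request):
    """A Pod fits if its Request <= Capacity minus the Requests already scheduled."""
    return all(request[k] <= capacity[k] - placed[k] for k in capacity)

print(fits(node_capacity, scheduled_requests, pod["request"]))  # True
```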

With these four concepts and their usage in mind, let's look at the three major problems of autoscaling in Kubernetes.

  1. The capacity planning bomb

Think back to how capacity planning was done before containers. Machines were generally allocated per application: application A needs two 4C8G machines, application B needs four 8C16G machines, and the machines of application A and application B are independent and do not interfere with each other. In a container world, most developers no longer care about the underlying resources, so where did capacity planning go?

In Kubernetes, capacity planning is expressed through Request and Limit: Request is the amount of resources an application asks for, and Limit is the upper bound it may use. Since Request and Limit are now the units of capacity planning, the cluster's actual resource accounting should be computed from Request and Limit as well. If instead you reserve resources on each node according to a fixed threshold, you are likely to end up with small nodes whose reservation is too small to satisfy scheduling and large nodes whose reservation is never fully used.
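In other words, capacity planning now lives in each container's resource declaration. Below is a minimal sketch of such a declaration using the Kubernetes Python client object model; the image and the numbers are placeholders chosen for illustration.

```python
# Minimal sketch: capacity planning expressed as resources.requests / limits
# on a container (image and values are placeholders).
from kubernetes import client

container = client.V1Container(
    name="web",
    image="nginx:alpine",
    resources=client.V1ResourceRequirements(
        requests={"cpu": "500m", "memory": "512Mi"},  # what the scheduler reserves
        limits={"cpu": "1", "memory": "1Gi"},         # hard ceiling enforced at runtime
    ),
)

pod = client.V1Pod(
    api_version="v1",
    kind="Pod",
    metadata=client.V1ObjectMeta(name="web", labels={"app": "web"}),
    spec=client.V1PodSpec(containers=[container]),
)
# To create it: client.CoreV1Api().create_namespaced_pod("default", pod)
```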

  2. The percentage fragmentation trap

A Kubernetes cluster usually contains more than one machine specification. Because different scenarios have different needs, machine configurations and capacities can vary enormously, which makes percentage-based scaling of the cluster very confusing.

Suppose our cluster contains machines of two different specifications, 4C8G and 16C32G. A 10% resource reservation means something completely different on each of them. Scale-in is where this matters most: to keep the cluster from oscillating, nodes are usually drained and removed one at a time, so judging from a percentage whether the current node should be scaled in becomes particularly important. If a large machine with low utilization is judged eligible for scale-in, removing it may starve the cluster and force a large number of containers to be rescheduled. If, on the other hand, you add a rule that small nodes are scaled in first, you may be left with a lot of redundant resources after scale-in, and the cluster may eventually consist of nothing but "monolith" nodes.
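A quick back-of-the-envelope calculation (with assumed numbers) shows why the same percentage behaves so differently on the two specifications:

```python
# The same percentage means very different absolute resources on different
# node sizes (numbers are illustrative).
nodes = {
    "4C8G":   {"cpu_m": 4000,  "mem_mi": 8192},
    "16C32G": {"cpu_m": 16000, "mem_mi": 32768},
}

reserve_ratio = 0.10  # a "10% reservation" policy

for name, cap in nodes.items():
    print(f"{name}: reserves {cap['cpu_m'] * reserve_ratio:.0f}m CPU / "
          f"{cap['mem_mi'] * reserve_ratio:.0f}Mi memory")

# 4C8G:   reserves 400m CPU / 819Mi memory   -> may be too small to matter
# 16C32G: reserves 1600m CPU / 3277Mi memory -> may never be filled by scheduling
```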

  3. The resource utilization dilemma

Can a cluster's resource utilization really represent its current state? When a Pod's utilization is very low, that does not mean the resources it has requested can be taken away from it. In most production clusters, resource utilization is never kept at a very high level; from a scheduling perspective, however, the scheduling water level should be kept relatively high. Only then can the cluster remain stable and available without wasting too many resources.

What does it mean if Request and Limit are not set and the overall cluster utilization is high? It means that all Pods are effectively scheduled by their real-time load and compete fiercely with one another, and simply adding nodes does not help: once a Pod has been scheduled, there is no way, short of manual rescheduling or eviction, to move it off an overloaded node. And what does it mean when Request and Limit are set and node utilization is very high? Unfortunately, in most scenarios this never happens, because different applications have different workloads whose utilization varies over time. The far more likely situation is that the cluster can no longer schedule any Pods long before the utilization threshold you set is ever triggered.
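The gap between low utilization and a high scheduling water level is easy to see when both ratios are computed for the same node (numbers are made up):

```python
# Utilization (Used/Capacity) and scheduling water level (Requested/Capacity)
# can diverge sharply on the same node (illustrative numbers).
capacity_cpu_m = 8000     # an 8-core node
requested_cpu_m = 7200    # sum of Pod Requests placed on it
used_cpu_m = 1600         # what those Pods actually consume right now

print(f"utilization = {used_cpu_m / capacity_cpu_m:.0%}")       # 20%
print(f"water level = {requested_cpu_m / capacity_cpu_m:.0%}")  # 90%

# A utilization-based threshold would never fire, yet the scheduler can no
# longer place a Pod requesting more than 800m CPU onto this node.
```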

Now that we understand the three major problems of autoscaling in Kubernetes, let's look at how Kubernetes addresses them.

The design philosophy of Kubernetes autoscaling


The design philosophy of Kubernetes autoscaling is to split scaling into a scheduling layer and a resource layer. The scheduling layer scales the scheduling units according to metrics and thresholds, while the resource layer is responsible for satisfying the resource requirements of those scheduling units.

At the scheduling layer, scaling is usually done by HPA scaling Pods horizontally. Using HPA feels very close to autoscaling in the traditional sense: you choose metrics, set thresholds, and Pods are scaled horizontally when the thresholds are crossed.
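Under the hood, the upstream HPA controller derives the target replica count from a simple ratio (this is the documented HPA algorithm; the numbers below are just an example):

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric):
    """HPA's core rule: desired = ceil(current * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * (current_metric / target_metric))

# Example: 2 replicas with a per-Pod target of 100 QPS, currently averaging 180 QPS.
print(desired_replicas(2, 180, 100))  # -> 4 replicas
```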

At the resource layer, the current mainstream solution is to scale nodes horizontally with cluster-autoscaler. When Pods cannot be scheduled because of insufficient resources, cluster-autoscaler tries the configured scaling groups, selects one that can satisfy the pending Pods' scheduling requirements, and automatically adds instances to it. Once the instances have started and registered with Kubernetes, kube-scheduler re-triggers scheduling and places the previously pending Pods onto the newly created nodes, which completes the scale-out link.
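The trigger for that whole flow is the presence of unschedulable Pods. As a simplified illustration (this is not cluster-autoscaler's actual logic, which additionally simulates scheduling against node-group templates), the pending Pods it reacts to can be listed with the Python client like this:

```python
# Simplified view of the scale-out trigger: Pods stuck in Pending.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

pending = v1.list_pod_for_all_namespaces(field_selector="status.phase=Pending")
for pod in pending.items:
    print(pod.metadata.namespace, pod.metadata.name)
```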

Scale-in works the other way around. The scheduling layer compares current utilization against the configured thresholds and scales in at the Pod level. When Pod scale-in brings a node's scheduling percentage below the resource layer's scale-down threshold, cluster-autoscaler drains the nodes with a low scheduling percentage and then removes them, completing the scale-in link.
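Draining a node begins by marking it unschedulable, which is what kubectl cordon does; a minimal sketch with the Python client follows (the node name is a placeholder), after which the remaining Pods are evicted before the node is released.

```python
# Minimal sketch of the first step of a drain: cordon the node so that no new
# Pods are scheduled onto it (the node name is a placeholder).
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

v1.patch_node("cn-shenzhen.192.168.0.10", {"spec": {"unschedulable": True}})
# cluster-autoscaler then evicts the remaining Pods (respecting
# PodDisruptionBudgets) before removing the underlying instance.
```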

The Achilles' heel of the Kubernetes autoscaling solution

A classic Kubernetes autoscaling case


This diagram shows a very classic autoscaling case that represents most online business scenarios. The application's initial architecture is a Deployment with two Pods under it, exposed externally through an Ingress Controller. The scaling policy is: when a single Pod reaches 100 QPS, scale out, with a minimum of 2 Pods and a maximum of 10 Pods.


The HPA controller keeps polling alibaba-cloud-metrics-adapter to obtain the current QPS of the Ingress gateway route. When the traffic through the Ingress gateway reaches the QPS threshold, the HPA controller changes the number of Pods in the Deployment; once the total capacity requested by the application's Pods exceeds the cluster's capacity, cluster-autoscaler selects a suitable scaling group and brings up the corresponding nodes to carry the Pods that could not be scheduled.
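Expressed as a manifest, the policy in this case looks roughly like the sketch below, written as a Python dict that mirrors the YAML you would kubectl apply. The external metric name and its selector are assumptions about how alibaba-cloud-metrics-adapter exposes Ingress QPS, so verify the exact names against the adapter's documentation.

```python
# Rough sketch of the HPA from the case above, as a dict mirroring the YAML.
# The external metric name and selector labels are assumptions about how
# alibaba-cloud-metrics-adapter exposes Ingress QPS -- check its docs.
import json

hpa = {
    "apiVersion": "autoscaling/v2",  # use autoscaling/v2beta2 on older clusters
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "web-qps-hpa", "namespace": "default"},
    "spec": {
        "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "web"},
        "minReplicas": 2,
        "maxReplicas": 10,
        "metrics": [{
            "type": "External",
            "external": {
                "metric": {
                    "name": "sls_ingress_qps",                      # assumed metric name
                    "selector": {"matchLabels": {"route": "web"}},  # assumed label
                },
                # scale out when the average QPS per Pod exceeds 100
                "target": {"type": "AverageValue", "averageValue": "100"},
            },
        }],
    },
}

print(json.dumps(hpa, indent=2))
# To create it: kubernetes.utils.create_from_dict(api_client, hpa)
```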

That is the analysis of this classic autoscaling case. What problems might you run into during actual development?

The shortcomings of classic Kubernetes autoscaling, and how to address them


The first problem is scale-out latency. The community's standard mode scales by creating and releasing ECS instances, which puts scale-out latency at around 2 to 2.5 minutes. Alibaba Cloud's own fast mode instead works by creating, stopping, and starting instances: while an instance is stopped, only storage is billed and no compute fees accrue, so for a very low price you can gain more than a 50% improvement in scaling efficiency.
The next problem is the complexity surrounding cluster-autoscaler. To use it well you need a fairly deep understanding of its internal mechanisms; otherwise you can easily end up in situations where scale-out never kicks in or scale-in cannot release nodes.
For most developers, cluster-autoscaler works like a black box, and so far the best way to troubleshoot it is still to read its logs. Once cluster-autoscaler misbehaves, or a configuration mistake causes unexpected scaling, more than 80% of developers find it hard to correct the problem on their own.

The Alibaba Cloud container service team has developed a kubectl plugin that provides deeper observability into cluster-autoscaler: it shows which scaling phase cluster-autoscaler is currently in and helps automatically correct scaling anomalies.
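Even without that plugin, the open source cluster-autoscaler exposes part of its state through a status ConfigMap (written by default to kube-system/cluster-autoscaler-status), which you can read directly:

```python
# cluster-autoscaler writes its health and per-node-group state into a
# ConfigMap (kube-system/cluster-autoscaler-status by default).
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

status = v1.read_namespaced_config_map("cluster-autoscaler-status", "kube-system")
print(status.data)  # human-readable ScaleUp / ScaleDown status per node group
```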

Although none of the core problems encountered so far is the last straw that breaks the camel's back, we kept asking ourselves: is there another way to make autoscaling easier to use and more efficient?

Martin boots for the Achilles' heel: Serverless Autoscaling

The core problems of resource-layer scaling are its steep learning curve, difficult troubleshooting, and poor timeliness. Looking back at Serverless, these problems happen to be exactly where Serverless has its strengths and advantages. Is there a way to make Serverless the resource-layer scaling solution for Kubernetes?

The Serverless Autoscaling component: virtual-kubelet-autoscaler


The Alibaba Cloud container service team has developed virtual-kubelet-autoscaler, a component that implements serverless autoscaling in Kubernetes.

When Pods cannot be scheduled, virtual-kubelet takes on the real load. It can be thought of as a virtual node with unlimited capacity. When a Pod is scheduled onto virtual-kubelet, it is started as a lightweight instance through ECI. ECI currently starts in about 30 seconds, and a program generally goes from being scheduled to running within one minute.
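Steering a Pod onto such a virtual node is normally just a nodeSelector plus a toleration for the virtual node's taint. The label and taint key below follow common virtual-kubelet conventions but are assumptions; check how the virtual node in your cluster is actually labeled and tainted.

```python
# Sketch of a Pod aimed at a virtual-kubelet node. The label and taint values
# follow common virtual-kubelet conventions and are assumptions -- verify them
# against the virtual node in your own cluster.
from kubernetes import client

pod = client.V1Pod(
    api_version="v1",
    kind="Pod",
    metadata=client.V1ObjectMeta(name="burst-job"),
    spec=client.V1PodSpec(
        node_selector={"type": "virtual-kubelet"},   # assumed virtual node label
        tolerations=[client.V1Toleration(
            key="virtual-kubelet.io/provider",       # assumed virtual node taint
            operator="Exists",
            effect="NoSchedule",
        )],
        containers=[client.V1Container(
            name="job", image="busybox",
            command=["sh", "-c", "echo hello && sleep 30"])],
        restart_policy="Never",
    ),
)
# To create it: client.CoreV1Api().create_namespaced_pod("default", pod)
```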

Like cluster-autoscaler, virtual-kubelet-autoscaler also needs a simulated-scheduling mechanism to determine whether a Pod can really be handled and carried. Compared with cluster-autoscaler, however, there are differences:

  1. When virtual-kubelet-autoscaler simulates scheduling, the object it adds to the scheduling policy is a Pod Template, not a Node Template.
  2. The core of virtual-kubelet-autoscaler is choosing virtual-kubelet to carry the load. Once a Pod passes simulated scheduling and is bound to the virtual-kubelet, its lifecycle management and troubleshooting are no different from a conventional Pod's, so there is no black box to debug.

virtual-kubelet-autoscaler is not a "silver bullet"

virtual-kubelet-autoscaler is not intended to replace cluster-autoscaler. Its advantages are simplicity, high elasticity, high concurrency, and pay-as-you-go billing. The trade-off is some loss of compatibility: support for mechanisms such as cluster-pi and CoreDNS is not yet perfect, but with only a small amount of configuration virtual-kubelet-autoscaler can work alongside cluster-autoscaler. It is particularly well suited to scenarios such as offline big-data tasks, CI/CD jobs, and bursty online workloads.

Finally

Serverless autoscaling has gradually become an important part of Kubernetes autoscaling. As its basic compatibility gaps are filled in, the ease of use, freedom from operations, and cost savings of serverless will form a perfect complement to Kubernetes and take Kubernetes autoscaling to a new level.

Source: www.cnblogs.com/alisystemsoftware/p/11463094.html