Kubernetes Pod Auto-Scaling and the HPA Principle

In real business scenarios, we often need to change the number of instances of a service: scaling up for stress tests, e-commerce flash sales, or big promotions, or scaling down when resources are constrained or the workload shrinks. In Kubernetes, the Scale mechanism of a Deployment/RC makes such scaling operations easy to automate.

Kubernetes scaling

Kubernetes supports two ways of scaling Pods: manual and automatic.

1. Manual mode

In manual mode, you set the number of Pod replicas for a Deployment/RC with the kubectl scale command; the operation completes in a single step.

Example:

cat scale.yaml

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ops-nginx
  namespace: ops
spec:
  selector:
    matchLabels:
      app: ops-nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: ops-nginx
    spec:
      imagePullSecrets:
      - name: cd-registry
      containers:
      - image: harbor.ttsingops.com/nginx/nginx:1.16.0
        name: ops-nginx

kubectl apply -f scale.yaml
kubectl get po -n ops

1.png

#Scale up
kubectl scale deployment ops-nginx -n ops --replicas 5

2.png

#Scale down
kubectl scale deployment ops-nginx -n ops --replicas 2
kubectl get po -n ops

3.png

2. Automatic mode

In automatic mode, the user defines a target range based on a performance metric (such as CPU usage) or a custom metric (for example, one supplied via Prometheus), and the system automatically adjusts the number of replicas to keep the metric within that range.

Introduction to HPA

HPA (Horizontal Pod Autoscaler) was introduced in Kubernetes v1.1 to automatically scale the number of Pods based on CPU usage. The HPA controller runs inside the Master's kube-controller-manager and periodically checks the CPU usage of the target Pods, at an interval defined by the --horizontal-pod-autoscaler-sync-period startup parameter. When the scaling conditions are met, it adjusts the replica count of the RC or Deployment so that the average Pod CPU usage matches the user-defined target.

Version evolution of HPA:

HPA currently has three major API versions: autoscaling/v1, autoscaling/v2beta1, and autoscaling/v2beta2.

What is the difference between these three versions?

The autoscaling/v1 version only supports horizontal Pod scaling based on a single CPU metric.

autoscaling/v2beta1 adds support for custom metrics. Besides the metrics exposed by cAdvisor, it can also scale on user-defined metrics, such as QPS supplied by a third-party component.

autoscaling/v2beta2 additionally adds support for external metrics, i.e. metrics that originate outside the cluster.
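To illustrate what the newer API looks like, here is a sketch of an autoscaling/v2beta2 HPA that combines a CPU resource metric with an external metric. The Deployment name (demo) and the external metric name (queue_messages_ready) are hypothetical, and an external metric requires a metrics adapter (such as a Prometheus adapter) to be installed:

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: demo-hpa
  namespace: ops
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo            # hypothetical target Deployment
  minReplicas: 1
  maxReplicas: 8
  metrics:
  - type: Resource        # built-in CPU metric, as in autoscaling/v1
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: External        # metric served by an external metrics adapter
    external:
      metric:
        name: queue_messages_ready   # hypothetical queue-depth metric
      target:
        type: AverageValue
        averageValue: "30"
```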

The principle of HPA automatic expansion and contraction?

The Metrics Server in Kubernetes continuously collects metric data from all Pod replicas. The HPA controller obtains these data through the aggregation API (Heapster's API has been gradually deprecated in favor of Metrics Server) and, based on the defined scaling rules, calculates the target number of Pod replicas. When the target replica count differs from the current one, the HPA controller initiates a scale operation against the Pod's replica controller (RC/Deployment), adjusting the number of replicas and completing the scaling operation.

4.png

Thinking:

If a Pod's CPU usage spikes for a short period and then immediately drops, wouldn't the Pod be scaled up and down frequently, with the replica count constantly changing? To prevent this, HPA applies a cooldown period after each scaling operation.

By default, the scale-up cooldown is 3 minutes and the scale-down cooldown is 5 minutes.

The cooling time can be set by adjusting the startup parameters of the kube-controller-manager component:

--horizontal-pod-autoscaler-upscale-delay: scale-up cooldown

--horizontal-pod-autoscaler-downscale-delay: scale-down cooldown
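As a sketch of where these flags would go, assuming kube-controller-manager runs as a static Pod whose manifest lives at /etc/kubernetes/manifests/kube-controller-manager.yaml (the kubeadm default; other installations differ), the cooldowns could be set like this:

```yaml
spec:
  containers:
  - command:
    - kube-controller-manager
    - --horizontal-pod-autoscaler-sync-period=30s
    - --horizontal-pod-autoscaler-upscale-delay=3m
    - --horizontal-pod-autoscaler-downscale-delay=5m
```

Note that in newer Kubernetes releases these delay flags were deprecated, and the scale-down cooldown is instead controlled by --horizontal-pod-autoscaler-downscale-stabilization.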

Example:

Let's walk through an example, then put the Pod under load to trigger automatic scale-up.

cat auto_scale.yaml

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: auto-nginx
  namespace: ops
spec:
  selector:
    matchLabels:
      app: auto-nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: auto-nginx
    spec:
      imagePullSecrets:
      - name: cd-registry
      containers:
      - image: harbor.ttsingops.com/nginx/nginx:1.16.0
        name: auto-nginx
        resources:
          requests:
            cpu: 200m
        ports:
        - containerPort: 80
---
#Service
apiVersion: v1
kind: Service
metadata:
  name: auto-nginx
  namespace: ops
spec:
  type: NodePort
  ports:
  - port: 8088
    protocol: TCP
    targetPort: 80
    nodePort: 38088
  selector:
    app: auto-nginx

kubectl apply -f auto_scale.yaml
kubectl get svc,pod -n ops -o wide

5.png

cat auto_nginx_hpa.yaml

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: auto-nginx-hpa
  namespace: ops
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: auto-nginx
  minReplicas: 1
  maxReplicas: 8
  targetCPUUtilizationPercentage: 50

kubectl apply -f auto_nginx_hpa.yaml
kubectl get hpa -n ops

6.png

Use Apache's ab stress-testing tool to generate load. ab can be installed on any node:

yum install -y httpd-tools

ab -n 10000000 -c 10000 http://192.168.1.211:38088/index.html
#The Service is exposed via NodePort

#Check the HPA status
kubectl get hpa -n ops

7.png

#Check the Pods
kubectl get pods -n ops

8.png

#Check the events
kubectl get events -n ops

9.png

#After waiting a few minutes, check that the Pods have been scaled back down
kubectl get po -n ops

10.png

Thinking:

How does Kubernetes decide how many replicas to scale to?

The official documentation explains it in detail: the desired replica count is the current replica count multiplied by the ratio of the current metric value to the target metric value, rounded up.

For example: if the current CPU metric is 200m and the target value is 100m, the number of replicas will be doubled, because 200.0/100.0 = 2.0.

If the current value is 50m, the number of replicas will be halved, because 50.0/100.0 = 0.5.

11.png
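The rule above can be sketched in a few lines of Python. This is just an illustration of the formula from the official docs (desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue)), not the actual controller code:

```python
import math

def desired_replicas(current_replicas: int,
                     current_value: float,
                     target_value: float) -> int:
    """HPA scaling rule: ceil(currentReplicas * currentValue / targetValue)."""
    return math.ceil(current_replicas * current_value / target_value)

# 1 replica at 200m CPU against a 100m target doubles to 2
print(desired_replicas(1, 200, 100))  # 2

# 2 replicas at 50m against a 100m target halve to 1
print(desired_replicas(2, 50, 100))   # 1
```

Because of the ceiling, the controller rounds up: 3 replicas at 120m against a 100m target gives ceil(3.6) = 4 replicas, never fewer than the load requires.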

For details, please refer to the official website explanation:

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/

Problems:

1. HPA does not perform automatic scaling

The metrics-server component needs to be installed.

For the configuration steps, please refer to section 5, "Install metrics-server", in << Ansible Deploying Kubernetes 1.16.10 Cluster >>.
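A quick way to confirm that metrics-server is actually serving data (commands assume a working kubectl context against the cluster):

kubectl get apiservice v1beta1.metrics.k8s.io
#The metrics API service should report Available=True

kubectl top pods -n ops
#If metrics-server works, this prints CPU/memory usage per Pod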

【Reference Materials】

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/



Origin blog.51cto.com/3388803/2534750