Preface
We mentioned earlier that Pods can be scaled up and down manually, but because the load is not fixed and changes frequently, a purely manual approach is not realistic. It would be much better if the Kubernetes system could scale Pods automatically according to their current load.
Introduction to HPA
Kubernetes provides just such a resource object: Horizontal Pod Autoscaling, or HPA for short. An HPA monitors the load of all the Pods controlled by an RC or a Deployment and decides whether the number of Pod replicas needs to be adjusted. This is the basic principle behind HPA.
HPA is implemented as a controller in the Kubernetes cluster, and we can create an HPA resource object simply with the kubectl autoscale command. By default the HPA controller polls every 30s (configurable through the kube-controller-manager flag --horizontal-pod-autoscaler-sync-period), queries the resource utilization of the Pods in the specified resource (RC or Deployment), and compares it with the values and metrics set when the HPA was created, thereby implementing automatic scaling.
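For example, the poll interval can be shortened by passing the flag to kube-controller-manager at startup (the 15s value below is purely an illustrative assumption, not a recommendation):

```shell
# In the kube-controller-manager startup arguments (e.g. in its static Pod manifest):
kube-controller-manager --horizontal-pod-autoscaler-sync-period=15s ...
```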
When an HPA is created, the HPA controller fetches the average utilization or raw value of each Pod from Heapster or from a user-defined RESTClient endpoint, compares it with the metrics defined in the HPA, calculates the specific replica count required, and performs the corresponding scaling action. Currently, HPA can obtain data from two places:
- Heapster: only supports CPU usage
- Custom monitoring: We will explain how to use this part in the following monitoring course
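The calculation the controller performs follows the documented HPA scaling rule: the desired replica count is the current count multiplied by the ratio of the observed metric to the target, rounded up. Below is a minimal sketch in Python; the function name is ours, and the 0.1 tolerance band mirrors the controller's default tolerance, which should be treated as an assumption here:

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Sketch of the HPA scaling rule:
    desired = ceil(currentReplicas * currentMetric / targetMetric).
    If the ratio is within the tolerance band around 1.0, no scaling happens."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to the target: keep the current count
    return math.ceil(current_replicas * ratio)

# With 1 replica at 29% CPU against a 10% target, the HPA scales out to 3 replicas:
print(desired_replicas(1, 29, 10))  # → 3
```

This matches what we will observe in the load test below: 1 replica at 29% utilization grows to 3, and 3 replicas sitting at 9% stay put because the ratio is inside the tolerance band.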
We will now look at how to use monitoring data from Heapster for automatic scaling, so first we have to install Heapster. In the earlier article on building a cluster with kubeadm, the Heapster-related images were in fact already pulled to the nodes by default, so next we only need to deploy it.
We are using Heapster version 1.4.2 here. Go to Heapster's GitHub page, save the YAML files under that directory to our cluster, and create them with the kubectl command-line tool. After the creation is complete, if you want to see the monitoring charts in the Dashboard, you also need to configure the heapster-host in the Dashboard.
HPA autoscaling
Let's create an Nginx Pod managed by a Deployment, and then use HPA to scale it automatically. The YAML file defining the Deployment is as follows (hpa-deploy-demo.yaml):
```yaml
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: hpa-nginx-deploy
  labels:
    app: nginx-demo
spec:
  revisionHistoryLimit: 15
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
```
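One caveat worth noting: for CPU-percentage autoscaling to work, the container should declare a CPU resource request, because the HPA computes utilization as a percentage of the requested CPU. A sketch of the container section with a request added (the 100m value is an arbitrary assumption):

```yaml
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m   # HPA computes CPU utilization relative to this request
```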
Then create the Deployment:
```shell
$ kubectl create -f hpa-deploy-demo.yaml
```
Now let's create an HPA, which we can do with the kubectl autoscale command:
```shell
$ kubectl autoscale deployment hpa-nginx-deploy --cpu-percent=10 --min=1 --max=10
deployment "hpa-nginx-deploy" autoscaled

$ kubectl get hpa
NAME               REFERENCE                     TARGET   CURRENT   MINPODS   MAXPODS   AGE
hpa-nginx-deploy   Deployment/hpa-nginx-deploy   10%      0%        1         10        3s
```
This command creates an HPA associated with the hpa-nginx-deploy Deployment, with a minimum Pod replica count of 1 and a maximum of 10. The HPA will dynamically increase or decrease the number of Pods according to the configured CPU utilization target (10%).
Of course, besides using the kubectl autoscale command, we can also create an HPA resource object from a YAML file. If we don't know how to write one, we can look at the YAML of the HPA created by the command above:
```shell
$ kubectl get hpa hpa-nginx-deploy -o yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  creationTimestamp: 2017-06-29T08:04:08Z
  name: nginxtest
  namespace: default
  resourceVersion: "951016361"
  selfLink: /apis/autoscaling/v1/namespaces/default/horizontalpodautoscalers/nginxtest
  uid: 86febb63-5ca1-11e7-aaef-5254004e79a3
spec:
  maxReplicas: 5                        # maximum number of replicas
  minReplicas: 1                        # minimum number of replicas
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment                    # kind of the resource to scale
    name: nginxtest                     # name of the resource to scale
  targetCPUUtilizationPercentage: 50    # CPU utilization that triggers scaling
status:
  currentCPUUtilizationPercentage: 48   # current CPU utilization of the Pods
  currentReplicas: 1                    # current number of replicas
  desiredReplicas: 2                    # desired number of replicas
  lastScaleTime: 2017-07-03T06:32:19Z
```
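Based on this output, a minimal manifest equivalent to the kubectl autoscale command we ran earlier might look like the following (the file name hpa-demo.yaml is a hypothetical choice):

```yaml
# hpa-demo.yaml: declarative equivalent of
# kubectl autoscale deployment hpa-nginx-deploy --cpu-percent=10 --min=1 --max=10
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-nginx-deploy
spec:
  minReplicas: 1
  maxReplicas: 10
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: hpa-nginx-deploy
  targetCPUUtilizationPercentage: 10
```

It could then be created with kubectl create -f hpa-demo.yaml.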
Load testing
Now we can write our own YAML-based HPA description file following the output above. Next, let's increase the load to test it: we create a busybox Pod and request the service created above in a loop.
```shell
$ kubectl run -i --tty load-generator --image=busybox /bin/sh
If you don't see a command prompt, try pressing enter.
/ # while true; do wget -q -O- http://172.16.255.60:4000; done
```
As you can see from the output below, the HPA has started to work.
```shell
$ kubectl get hpa
NAME               REFERENCE                     TARGET   CURRENT   MINPODS   MAXPODS   AGE
hpa-nginx-deploy   Deployment/hpa-nginx-deploy   10%      29%       1         10        27m
```
Meanwhile, if we check the replica count of hpa-nginx-deploy, we can see it has grown from 1 to 3.
```shell
$ kubectl get deployment hpa-nginx-deploy
NAME               DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
hpa-nginx-deploy   3         3         3            3           4d
```
Looking at the HPA again, thanks to the increased replica count, the CPU utilization has settled back to around the 10% target.
```shell
$ kubectl get hpa
NAME               REFERENCE                     TARGET   CURRENT   MINPODS   MAXPODS   AGE
hpa-nginx-deploy   Deployment/hpa-nginx-deploy   10%      9%        1         10        35m
```
Now let's terminate the busybox Pod to reduce the load, wait a while, and then observe the HPA and Deployment objects again.
```shell
$ kubectl get hpa
NAME               REFERENCE                     TARGET   CURRENT   MINPODS   MAXPODS   AGE
hpa-nginx-deploy   Deployment/hpa-nginx-deploy   10%      0%        1         10        48m
```
```shell
$ kubectl get deployment hpa-nginx-deploy
NAME               DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
hpa-nginx-deploy   1         1         1            1           4d
```
You can see that the replica count has dropped from 3 back to 1.
However, the current HPA only supports the CPU utilization metric, which is not very flexible. In later lessons we will automatically scale Pods based on our own custom monitoring metrics.