1. Knowledge about HPA
HPA (Horizontal Pod Autoscaler) scales Pods automatically in the horizontal direction. Kubernetes provides HPA as a resource object that can automatically adjust the number of Pods managed by a ReplicationController, Deployment, or ReplicaSet according to CPU utilization.
(1) The HPA controller periodically checks the CPU usage of the Pods at the interval defined by the kube-controller-manager startup parameter --horizontal-pod-autoscaler-sync-period on the Master (the default is 30 seconds).
(2) Like the RC and Deployment discussed earlier, HPA is itself a kind of Kubernetes resource object. Its working principle is to track and analyze the load of all target Pods controlled by the referenced controller, and decide whether the number of target Pod replicas needs to be adjusted.
(3) The metrics-server must also be deployed in the cluster; it exposes the measurement data through the resource metrics API (metrics.k8s.io). A quick way to check this API is shown below.
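Once metrics-server is running (Section 2.1 below), one way to confirm that the resource metrics API is actually being served is to query it directly; both endpoints below are standard paths of the metrics.k8s.io/v1beta1 API:
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods"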
2. Deployment and application of HPA
2.1 Perform HPA deployment settings
//Upload the metrics-server.tar image package to the /opt directory on all Node nodes
cd /opt/
docker load -i metrics-server.tar

#On the master node, edit components.yaml and then execute kubectl apply -f components.yaml
vim components.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  - namespaces
  - configmaps
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-use-node-status-port
        - --kubelet-insecure-tls
        image: registry.cn-beijing.aliyuncs.com/dotbalo/metrics-server:v0.4.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 4443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          periodSeconds: 10
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100
#After the deployment is complete, you can use the following commands to monitor the resource usage of Pods and Nodes
kubectl top pods
kubectl top nodes
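If kubectl top reports that metrics are not yet available, two quick checks (not part of the original steps) are to confirm that the metrics-server Pod is running and that the aggregated API has been registered:
kubectl get pods -n kube-system -l k8s-app=metrics-server
kubectl get apiservice v1beta1.metrics.k8s.io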
2.2 Test demonstration of HPA scaling
(1) Create a pod resource for testing
kubectl create deployment hpa-deploy --image=nginx:1.14 --replicas=3 --dry-run=client -o yaml > hpa-test.yaml
vim hpa-test.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: hpa-deploy
  name: hpa-deploy
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hpa-deploy
  template:
    metadata:
      labels:
        app: hpa-deploy
    spec:
      containers:
      - image: nginx:latest
        name: nginx-hpa
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: hpa-deploy
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: hpa-deploy
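Apply the edited manifest and confirm that the Deployment, Pods, and Service exist (the file name hpa-test.yaml comes from the previous step):
kubectl apply -f hpa-test.yaml
kubectl get deployment,pods,svc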
(2) Create an HPA controller for resource limitation and scaling management
Use the kubectl autoscale command to create an HPA controller. Set the CPU load threshold to 50% of the requested resources, the minimum number of replicas to 1, and the maximum number of replicas to 10:
kubectl autoscale deployment hpa-deploy --cpu-percent=50 --min=1 --max=10
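The same controller can also be written declaratively. Below is a minimal sketch using the stable autoscaling/v1 API; it is equivalent to the kubectl autoscale command above and is included only for illustration:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-deploy
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-deploy
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50   #scale when average CPU usage exceeds 50% of the requested CPU
Either way, kubectl get hpa should show the new controller together with its target utilization and replica range.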
(3) Enter one of the Pod containers and run an infinite loop to generate CPU load
kubectl exec -it hpa-deploy-5dfd5cf57b-7f4ls bash
while true
> do
> echo this is hpa test
> done
Open another terminal for hpa monitoring:
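For example, the scaling activity can be watched with the standard commands:
kubectl get hpa -w
kubectl top pods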
When HPA scales out, the number of replicas increases quickly; when it scales back in, the number of replicas decreases much more slowly.
This is deliberate: if the scale-in policy were too aggressive, a brief dip in traffic during peak business hours (caused by a network fluctuation, for example) could make the K8s cluster conclude that access traffic has dropped and rapidly remove replicas, leaving the remaining replicas unable to withstand the real load and causing them to collapse, which would affect the business.
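On clusters where the autoscaling/v2 API is available, this scale-in delay can also be tuned explicitly through the behavior field. A minimal sketch (the 300-second stabilization window shown here is the default value and is included only for illustration):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-deploy
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-deploy
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   #wait for 5 minutes of sustained low load before removing replicas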
3. Namespace resource constraints
kubectl explain ResourceQuota
3.1 Quota Limits on Computing Resources
apiVersion: v1
kind: ResourceQuota            #Use the ResourceQuota resource type
metadata:
  name: compute-resources
  namespace: spark-cluster     #Specify the namespace to be limited
spec:
  hard:
    pods: "20"                 #Set the maximum number of Pods
    requests.cpu: "2"
    requests.memory: 1Gi
    limits.cpu: "4"
    limits.memory: 2Gi
Taking the above as an example, it limits the computing resources of the namespace spark-cluster: the maximum number of Pods in the namespace is 20, the reserved (requested) CPU and the maximum (limit) CPU are 2 and 4 respectively, and the reserved memory and the maximum limit memory are 1Gi and 2Gi respectively.
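The quota only takes effect once it is applied; it can then be inspected with kubectl describe (the file name compute-resources.yaml is only an assumption for this example):
kubectl create namespace spark-cluster
kubectl apply -f compute-resources.yaml
kubectl describe resourcequota compute-resources -n spark-cluster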
3.2 Quota limit on the number of configuration objects
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-counts
  namespace: spark-cluster
spec:
  hard:
    configmaps: "10"
    persistentvolumeclaims: "4"     #Set the maximum number of PVCs
    replicationcontrollers: "20"    #Set the maximum number of RCs
    secrets: "10"
    services: "10"
    services.loadbalancers: "2"
Taking the above as an example, this configuration limits the number of resource objects that can exist in the namespace.
If a Pod does not set requests and limits, it can use up to the maximum resources of its namespace; if the namespace has no limit either, it can use up to the maximum resources of the cluster.
K8s restricts the resources a Pod may use according to its limits; when memory usage exceeds the limit, cgroups triggers an OOM (out of memory) kill.
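For reference, explicit requests and limits on a single container look like the following (a generic sketch that is not part of the original example; the Pod name and image are only illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
  namespace: test
spec:
  containers:
  - name: app
    image: nginx:1.14
    resources:
      requests:          #reserved resources, used by the scheduler
        memory: 256Mi
        cpu: 100m
      limits:            #hard cap; exceeding the memory limit triggers an OOM kill
        memory: 512Mi
        cpu: 500m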
To give the Pods, or the containers in them, default values and caps for resource usage, create a LimitRange resource:
apiVersion: v1
kind: LimitRange              #Use the LimitRange resource type
metadata:
  name: mem-limit-range
  namespace: test             #Add the resource limit to the specified namespace
spec:
  limits:
  - default:                  #default sets the default limit values
      memory: 512Mi
      cpu: 500m
    defaultRequest:           #defaultRequest sets the default request values
      memory: 256Mi
      cpu: 100m
    type: Container           #type supports Container, Pod, and PVC
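Once applied, any container created in the test namespace without explicit requests/limits inherits these defaults, which can be verified with (the file name mem-limit-range.yaml is only an assumption):
kubectl apply -f mem-limit-range.yaml
kubectl describe limitrange mem-limit-range -n test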