[Cloud native] Kubernetes HPA and namespace resource limits

 1. Knowledge about HPA

HPA (Horizontal Pod Autoscaler) scales Pods horizontally and automatically. Kubernetes provides HPA as a built-in resource: it can automatically adjust the number of Pods managed by a ReplicationController, Deployment, or ReplicaSet according to CPU utilization.

(1) HPA periodically checks Pod CPU usage at the interval defined by the kube-controller-manager startup parameter --horizontal-pod-autoscaler-sync-period on the Master (30 seconds by default).

(2) Like RC and Deployment, HPA is itself a kind of Kubernetes resource object. By tracking and analyzing the load of all target Pods controlled by the referenced controller, it decides whether the target's replica count needs to be adjusted. This is the working principle of HPA.

(3) metrics-server must also be deployed in the cluster; it exposes measurement data externally through the resource metrics API (metrics.k8s.io).
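
For reference, an HPA is itself just another Kubernetes object that can be written declaratively. The sketch below is only an illustration (not part of the original deployment steps); it targets the hpa-deploy Deployment used in section 2.2, and depending on the cluster version the API group is autoscaling/v2 or the older autoscaling/v2beta2.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-deploy
  namespace: default
spec:
  scaleTargetRef:                  #the controller whose replica count is adjusted
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-deploy
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50     #scale out when average CPU exceeds 50% of requests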
 

2. Deployment and application of HPA

2.1 HPA deployment setup

#Upload the metrics-server.tar image package to the /opt directory on all Node nodes
cd /opt/
docker load -i metrics-server.tar

#On the master node, edit components.yaml and then execute kubectl apply -f components.yaml

vim components.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  - namespaces
  - configmaps
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-use-node-status-port
        - --kubelet-insecure-tls
        image: registry.cn-beijing.aliyuncs.com/dotbalo/metrics-server:v0.4.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 4443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          periodSeconds: 10
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100

#After the deployment is complete, you can use the following commands to monitor the resource usage of Pods and nodes
kubectl top pods
 
kubectl top nodes
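
If kubectl top reports that metrics are not yet available, it helps to confirm that the aggregated metrics API registered by components.yaml is up; a quick check, using the APIService name from the manifest above:

#Verify the metrics API and the metrics-server Pod
kubectl get apiservices v1beta1.metrics.k8s.io
kubectl get pods -n kube-system -l k8s-app=metrics-server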

 

 2.2 Test demonstration of HPA scaling

 (1) Create a pod resource for testing

kubectl create deployment hpa-deploy --image=nginx:1.14 --replicas=3 --dry-run=client -o yaml > hpa-test.yaml
 
vim hpa-test.yaml
 
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: hpa-deploy
  name: hpa-deploy
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hpa-deploy
  template:
    metadata:
      labels:
        app: hpa-deploy
    spec:
      containers:
      - image: nginx:latest
        name: nginx-hpa
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 200m
 
---
apiVersion: v1
kind: Service
metadata:
  name: hpa-deploy
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: hpa-deploy
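
With the manifest ready, a small sketch of applying it and confirming the three replicas and the Service are running before the HPA is created:

kubectl apply -f hpa-test.yaml
kubectl get pods -l app=hpa-deploy -o wide
kubectl get svc hpa-deploy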

(2) Create an HPA controller for resource limitation and scaling management 

Use the kubectl autoscale command to create an HPA controller: set the CPU utilization threshold to 50% of the requested CPU, the minimum number of replicas to 1, and the maximum number of replicas to 10.

kubectl autoscale deployment hpa-deploy --cpu-percent=50 --min=1 --max=10
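
Once created, the HPA can be checked immediately; TARGETS may show <unknown> for the first minute or so until metrics-server reports samples for the Pods:

kubectl get hpa hpa-deploy
kubectl describe hpa hpa-deploy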

(3) Enter one of the Pod containers and run an infinite loop to generate CPU load

kubectl exec -it hpa-deploy-5dfd5cf57b-7f4ls -- bash
while true
> do
> echo this is hpa test
> done

Open another terminal to watch the HPA while the load is running:
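
A minimal sketch of the monitoring commands, assuming the Deployment and HPA names used above:

#Watch the HPA and the replica count react to the CPU load
kubectl get hpa hpa-deploy -w
kubectl top pods
kubectl get pods -l app=hpa-deploy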

 

When HPA scales out, the number of replicas increases relatively quickly; when it scales back in, the number of replicas decreases much more slowly.

This asymmetry is deliberate: during peak business hours, a brief dip caused by network fluctuation or similar could otherwise make the K8S cluster believe that traffic has dropped and rapidly scale in the replicas, leaving the remaining replicas unable to withstand the real load and causing them to collapse, which would affect the business.
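
On clusters that support the autoscaling/v2 behavior field, this scale-down caution can also be tuned explicitly. The following is a sketch (an assumption, not part of the original demo) of the same HPA written declaratively with a behavior section added: a 5-minute stabilization window and a scale-in rate of at most one Pod per minute.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-deploy
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-deploy
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   #wait 5 minutes of low load before scaling in
      policies:
      - type: Pods
        value: 1                        #remove at most 1 Pod per 60s when scaling in
        periodSeconds: 60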
 

3. Namespace resource constraints

kubectl explain ResourceQuota
 

3.1 Quota Limits on Computing Resources 

apiVersion: v1
kind: ResourceQuota        #Use the ResourceQuota resource type
metadata:
  name: compute-resources
  namespace: spark-cluster #Specify the namespace to be limited
spec:
  hard:
    pods: "20"             #Set the maximum number of Pods
    requests.cpu: "2"
    requests.memory: 1Gi
    limits.cpu: "4"
    limits.memory: 2Gi

Taking the above as an example, this quota limits the computing resources of the namespace spark-cluster: the maximum number of Pods in the namespace is 20, the total requested (reserved) CPU and maximum (limit) CPU are 2 and 4 cores respectively, and the total requested memory and maximum memory are 1Gi and 2Gi respectively.
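
A short sketch of applying and checking the quota, assuming the manifest above is saved as compute-resources.yaml:

#Create the namespace if it does not exist yet, apply the quota, then check usage against the hard limits
kubectl create namespace spark-cluster
kubectl apply -f compute-resources.yaml
kubectl describe resourcequota compute-resources -n spark-cluster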

3.2 Quota limit on the number of configuration objects

apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-counts
  namespace: spark-cluster
spec:
  hard:
    configmaps: "10"
    persistentvolumeclaims: "4" #Set the maximum number of pvc
    replicationcontrollers: "20" #Set the maximum number of rc
    secrets: "10"
    services: "10"
    services.loadbalancers: "2"

Taking the above as an example, this configuration limits the number of resource objects that can exist in the namespace.

If a Pod does not set requests and limits, it can use up to the maximum resources of its namespace; if the namespace itself has no limit set, it can use up to the maximum resources of the cluster.
K8S restricts the resources a Pod can use according to its limits: when memory usage exceeds the limit, cgroups trigger an OOM (out of memory) kill.
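
For illustration, a hypothetical Pod (not part of the original demo) with explicit requests and limits looks like the sketch below; if its container's memory usage exceeded 512Mi, it would be OOM-killed through cgroups:

apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
  namespace: test
spec:
  containers:
  - name: app
    image: nginx:1.14
    resources:
      requests:             #amount reserved for scheduling
        cpu: 100m
        memory: 256Mi
      limits:               #hard ceiling enforced through cgroups
        cpu: 500m
        memory: 512Mi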

Here you can create a LimitRange resource to set default request and limit values for the Pods (or the Containers in them) that do not declare their own:
 

apiVersion: v1
kind: LimitRange #Use LimitRange resource type
metadata:
  name: mem-limit-range
  namespace: test #You can add a resource limit to the specified namespace
spec:
  limits:
  - default: #default is the value of limit
      memory: 512Mi
      cpu: 500m
    defaultRequest: #defaultRequest is the value of request
      memory: 256Mi
      cpu: 100m
    type: Container #Type supports Container, Pod, PVC
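
A quick sketch of applying it and confirming the defaults, assuming the manifest above is saved as mem-limit-range.yaml:

#Create the namespace if needed, apply the LimitRange, then inspect the defaults it injects into new Pods
kubectl create namespace test
kubectl apply -f mem-limit-range.yaml
kubectl describe limitrange mem-limit-range -n test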


Origin blog.csdn.net/zhangchang3/article/details/131796423