[Cloud Native] The Pod controllers of Kubernetes

Foreword

A Pod is the smallest deployable unit that can be created and managed in a Kubernetes cluster. Tools are therefore needed to operate Pods and manage their life cycle, and this is where controllers come in.

Pod controllers are provided by the kube-controller-manager component on the master. Common controllers include ReplicationController, ReplicaSet, Deployment, DaemonSet, StatefulSet, Job, and CronJob; each manages Pod resource objects in a different way.
 

1. Basic knowledge of the Pod controller

 1.1 The role of the pod controller

A Pod controller, also known as a workload, is a middle layer that manages Pods and ensures that Pod resources match the desired state. When a Pod fails, the controller tries to restart it; if the restart policy fails, the Pod resource is recreated.

By the way they are created, Pods fall into two categories (see the quick demo after the next paragraph):

Autonomous Pod: a Pod created directly in Kubernetes; once deleted, it is gone and will not be rebuilt.
Controller-created Pod: a Pod created by a controller is automatically rebuilt after being deleted.

The relationship between controllers and Pods:

controllers: objects that manage and run containerized Pods on the cluster; controllers and Pods are associated through label selectors.
Through controllers, Pods gain operational capabilities for applications, such as scaling and upgrading.
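A quick way to see the autonomous vs. controller-managed difference (a hedged sketch, assuming a working cluster with kubectl access; the names standalone and managed are made up for this demo):

kubectl run standalone --image=nginx:1.15.4 --restart=Never   #autonomous pod
kubectl delete pod standalone                                 #gone for good, nothing rebuilds it

kubectl create deployment managed --image=nginx:1.15.4        #controller-managed pods
kubectl delete pod -l app=managed                             #the ReplicaSet recreates a replacement at once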


 

1.2 Types of Pod controllers

(1) ReplicaSet: creates a user-specified number of Pod replicas, ensures the replica count matches the desired state, and supports rolling updates and scaling out and in.
A ReplicaSet consists of three main components (shown side by side in the sketch below):
   1) The number of Pod replicas the user expects
   2) The label selector, which determines which Pods it manages
   3) The Pod resource template, used to create new Pods when the existing count is insufficient

ReplicaSet helps users manage stateless Pod resources and keeps them at the user-defined target count. However, ReplicaSet is not meant to be used directly; it is used through a Deployment.
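A minimal ReplicaSet manifest illustrating those three components (a sketch for illustration only; in practice you let a Deployment create the ReplicaSet for you):

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-rs
spec:
  replicas: 3                  #1) the replica count the user expects
  selector:
    matchLabels:
      app: nginx               #2) the label selector deciding which pods it manages
  template:                    #3) the pod template used when existing pods are insufficient
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.15.4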

(2) Deployment: works on top of ReplicaSet and is used to manage stateless applications; it is currently the most commonly used controller. It supports rolling updates and rollbacks, and provides declarative configuration.
Together, ReplicaSet and Deployment have gradually replaced the role of the earlier ReplicationController (RC).

(3) DaemonSet: ensures that each node in the cluster runs exactly one replica of a specific Pod; usually used for system-level background tasks, such as the log collectors of an ELK service.
Characteristics: the service is stateless, and the service must run as a daemon process.

(4) StatefulSet: manages stateful applications.

(5) Job: runs a task to completion and then exits; no restart or rebuild is needed.

(6) CronJob: periodic task control; does not need to run continuously in the background.

1.3 Comparison between stateful and stateless Pods

(1) Stateful instances
There are differences between instances; each instance has its own uniqueness and its own metadata, for example etcd and zookeeper.
Instances have unequal relationships, and such applications rely on external storage.

(2) Stateless instances
A Deployment considers all its Pods to be identical.
No ordering requirements need to be considered.
It does not matter which node a Pod runs on.
Pods can be scaled out and in at will.

 

2.1 Deployment controller

Deploys stateless applications
Manages Pods and ReplicaSets
Provides functions such as online deployment, replica control, rolling upgrades, and rollbacks
Provides declarative updates, e.g. updating only the image
Application scenario: web services

vim nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx    
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.15.4
        ports:
        - containerPort: 80

kubectl create -f nginx-deployment.yaml

kubectl get pods,deploy,rs
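The rolling update and rollback features mentioned above can be exercised with the standard rollout commands (a sketch against the nginx-deployment created here; the target image tag is only an example):

kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1   #trigger a rolling update
kubectl rollout status deployment/nginx-deployment                 #watch the rollout progress
kubectl rollout history deployment/nginx-deployment                #list recorded revisions
kubectl rollout undo deployment/nginx-deployment                   #roll back to the previous revision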
 

2.2 The StatefulSet controller

 StatefulSet is a workload API object used to manage stateful applications.

StatefulSet is used to manage the deployment and expansion of a Pod collection, and provide persistent storage and persistent identifiers for these Pods.

Similar to a Deployment, a StatefulSet manages a set of Pods based on the same container spec. But unlike a Deployment, a StatefulSet maintains a sticky identity for each of its Pods. These Pods are created from the same spec but are not interchangeable: each Pod keeps a permanent ID across any rescheduling.

If you want to use storage volumes to provide persistent storage for your workloads, you can use StatefulSets as part of your solution. Although individual Pods in a StatefulSet can still fail, persistent Pod identifiers make it easier to match existing volumes with new Pods that replace failed Pods.

 

StatefulSets are valuable for applications that need to meet one or more of the following requirements:

Stable, unique network identifier.
Stable, durable storage.
Orderly and graceful deployment and scaling.
Orderly, automatic rolling updates.


In the above description, "stable" means persistence across Pod scheduling and rescheduling. If an application does not require stable identifiers or ordered deployment, deletion, and scaling, it should be deployed with a stateless workload controller; a Deployment or ReplicaSet may be more suitable for such stateless deployment needs.
 

2.3 StatefulSet controller example

1) A Headless Service named nginx-svc is used to control the network domain.
2) A StatefulSet named nginx-sts, whose spec declares that the nginx container will be started in 3 independent Pod replicas.
3) volumeClaimTemplates provides stable storage through PersistentVolumes prepared by a PersistentVolume provisioner.

 

Creating the example

apiVersion: v1
kind: Service
metadata:
  name: nginx-svc              #service name
spec:
  ports:
  - port: 80
    targetPort: 80
  clusterIP: None              #the clusterIP of a headless service is None
  selector:
    app: nginx-sts             #Pods carrying this label are backed by this service


---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx-sts
spec:
  replicas: 3
  serviceName: "nginx-svc"     #declares which Headless Service this StatefulSet belongs to
  selector:
    matchLabels:
      app: nginx-sts           #must match the labels in the pod template below
  template:                    #defines the pod template
    metadata:
      labels:
        app: nginx-sts         #must match the selector above
    spec:
      containers:
      - image: nginx:1.14
        imagePullPolicy: IfNotPresent
        name: nginx-test
        ports:
        - containerPort: 80
          protocol: TCP
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:        #can be regarded as a PVC template -- when a pod is created, a PVC is created automatically and requests a PV
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "nfs-client-storageclass"   #storage class name; change to one that exists in the cluster
      resources:
        requests:
          storage: 2Gi
 

Scaling the example out and in

kubectl edit sts (sts is short for StatefulSet) lets you edit the replica count of the live object in place; every Pod created by the StatefulSet controller is then adjusted to match. The scale commands below achieve the same thing.
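A sketch against the nginx-sts example above (the ordered behavior follows the StatefulSet summary later in this section):

kubectl scale statefulset nginx-sts --replicas=5   #scale out: new pods are created in ascending order (3, then 4)
kubectl scale statefulset nginx-sts --replicas=2   #scale in: pods are deleted in descending order (4, 3, then 2)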

 

 

kubectl get svc #View the created headless service myapp-svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 50d
myapp-svc ClusterIP None <none> 80/TCP 38s

kubectl get sts    #view the StatefulSet
NAME      DESIRED   CURRENT   AGE
myapp     3         3         55s

kubectl get pvc    #view PVC bindings
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
myappdata-myapp-0 Bound pv002 2Gi RWO 1m
myappdata-myapp-1 Bound pv003 2Gi RWO,RWX 1m
myappdata-myapp-2 Bound pv004 2Gi RWO,RWX 1m

kubectl get pv    #view PV bindings
NAME      CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                       STORAGECLASS   REASON    AGE
pv001     1Gi        RWO,RWX        Retain           Available                                                        6m
pv002     2Gi        RWO            Retain           Bound       default/myappdata-myapp-0                            6m
pv003     2Gi        RWO,RWX        Retain           Bound       default/myappdata-myapp-1                            6m
pv004     2Gi        RWO,RWX        Retain           Bound       default/myappdata-myapp-2                            6m
pv005     2Gi        RWO,RWX        Retain           Available                                                        6m

// When a StatefulSet is deleted, it provides no guarantees about terminating its Pods. To achieve an ordered and graceful termination of the Pods in a StatefulSet, scale the StatefulSet down to 0 before deleting it.
kubectl scale statefulset myappdata-myapp --replicas=0
kubectl delete -f stateful-demo.yaml

//The PVCs still exist at this point; when the Pods are recreated, the original PVCs are bound again
kubectl apply -f stateful-demo.yaml

//Summary
Stateless:
1) A Deployment considers all its Pods identical
2) No ordering requirements need to be considered
3) No need to care which node a Pod runs on
4) Pods can be scaled out and in at will

Stateful:
1) There are differences between instances; each instance has its own uniqueness and metadata, e.g. etcd, zookeeper
2) Instances have unequal relationships, and the applications rely on external storage.

 

The difference between a regular Service and a headless Service:
Service: a set of Pod access policies; provides a cluster IP for communication inside the cluster, plus load balancing and service discovery.
Headless Service: a Service without a cluster IP; instead, DNS records resolve directly to the IP addresses of the proxied Pods.
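In YAML the difference is a single field (a sketch; nginx-regular is a made-up name for contrast with the nginx-svc headless service above):

apiVersion: v1
kind: Service
metadata:
  name: nginx-regular
spec:
  #clusterIP omitted: the API server assigns a virtual cluster IP (load-balanced access)
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: nginx-sts
#versus clusterIP: None in nginx-svc above, which makes DNS return the Pod IPs directly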

 

(1) Why have a headless Service?

In a Deployment, each Pod name contains a random string and Pods are unordered. A StatefulSet needs ordering, and the name of each Pod must be fixed: after a Pod fails and is rebuilt, its identifier stays the same. The Pod name is the unique identifier of a Pod, so that identifier must be stable and unique.

To achieve this stable identity, a headless Service is required that resolves directly to the Pods, and each Pod must be given a unique name.
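To see this stable, name-based resolution in action (a hedged check, assuming the nginx-sts/nginx-svc example above runs in the default namespace and CoreDNS serves cluster DNS):

kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup nginx-sts-0.nginx-svc.default.svc.cluster.local
#resolves to the IP of pod nginx-sts-0; the name stays the same even after the pod is rebuilt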

(2) Why have a volumeClaimTemplate?

Most stateful applications use persistent storage; for example, in a distributed system each node needs its own dedicated storage because the data differs between nodes. A storage volume created in a Deployment's Pod template is shared: multiple Pods use the same volume. In a StatefulSet, by contrast, the Pods must not share a storage volume, so defining volumes in the Pod template is not suitable. This is why volumeClaimTemplate is introduced: when a StatefulSet creates Pods, a PVC is generated automatically for each Pod, which requests and binds a PV, giving every Pod its own dedicated storage volume.

Service discovery: the process by which application services locate each other.
Application scenarios:
●Strong dynamism: Pods drift to other nodes
●Frequent updates and releases: ship a minimum product first, then iterate and optimize step by step
●Automatic scaling: during a big promotion, the service must scale out to many replicas

The service discovery method in K8S is DNS: the K8S cluster automatically associates a Service resource's "name" with its "CLUSTER-IP", so the cluster can discover services automatically.
 

(3) Summary of the StatefulSet controller


1. Deploys stateful applications.
2. Each Pod's name is unique and fixed, and each Pod should have its own dedicated persistent storage (bound to a PV via the volumeClaimTemplates PVC template).
3. Requires an associated Headless Service (ClusterIP is None); inside the K8S cluster the Pod IP can be resolved in the format <pod_name>.<svc_name>.<namespace_name>.svc.cluster.local (implemented by the headless service plus CoreDNS).
4. Pods are created, deleted, upgraded, and scaled in order (serially by default):
    creation, upgrade, and scale-out run in ascending order (Pod ordinal 0..n-1); deletion runs in descending order (n-1..0);
    scale-in and rollback run in descending order (n-1..0), deleting the old Pod before creating the new one.

 

3. DaemonSet controller

 3.1 Application of DaemonSet controller


DaemonSet ensures that all (or some) Nodes run a copy of a Pod. When a Node joins the cluster, a Pod is added on it; when a Node is removed from the cluster, that Pod is reclaimed. Deleting a DaemonSet deletes all Pods it created.

Some typical uses of a DaemonSet:
●Run a cluster storage daemon on each Node, such as glusterd or ceph.
●Run a log collection daemon on each Node, such as fluentd or logstash.
●Run a monitoring daemon on each Node, such as Prometheus Node Exporter, collectd, Datadog agent, New Relic agent, or Ganglia gmond.
Application scenario: agents
Official case (monitoring):

 

3.2 Case demonstration of DaemonSet controller

vim ds.yaml 
apiVersion: apps/v1
kind: DaemonSet 
metadata:
  name: nginx-daemonset   #resource names must be lowercase (RFC 1123)
  labels:
    app: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14
        ports:
        - containerPort: 80
 
 
kubectl apply -f ds.yaml

The DaemonSet controller will create an identical pod on each worker node.
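To verify on a multi-node cluster (a quick check, not part of the original demo):

kubectl get pods -o wide   #expect one nginx Pod scheduled on each schedulable node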

DaemonSet
1) Creates one Pod of the same type on every Node of the K8S cluster (including Nodes added to the cluster later).
2) It is affected by taints on Nodes and by the unschedulable setting of cordon. You can set tolerations in the Pod configuration to ignore taints (see the sketch after this list), and use uncordon to remove the unschedulable mark.
3) No replica count needs to be set.
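A sketch of point 2: adding a toleration to the DaemonSet pod template so its Pods also land on tainted control-plane nodes (the taint key below is the one kubeadm commonly applies; verify yours with kubectl describe node):

    spec:                    #pod template spec inside the DaemonSet above
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: nginx
        image: nginx:1.14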

 

 4. Job controller

4.1 Application of job controller

Jobs are divided into ordinary tasks (Job) and scheduled tasks (CronJob), and are often used to run tasks that only need to execute once.
Application scenarios: database migration, batch scripts, kube-bench scans, offline data processing, video decoding, etc.

4.2 Case demonstration of job controller 

vim job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4
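To run the Job and read its result (hedged commands; the job-name label is added automatically by the Job controller):

kubectl apply -f job.yaml
kubectl get pods --selector=job-name=pi   #wait for STATUS Completed
kubectl logs job/pi                       #prints pi to 2000 digits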

Parameter explanation:
.spec.template.spec.restartPolicy has three candidate values: OnFailure, Never, and Always, with Always as the default; it describes the restart policy of the containers in the Pod. In a Job this property may only be set to OnFailure or Never, since with Always the Pod would be restarted forever and the Job would never be considered complete.

.spec.backoffLimit sets the number of retries after a Job fails; the default is 6. When a Pod fails or a container exits abnormally, the Job keeps retrying the task according to .spec.backoffLimit; once .spec.backoffLimit is reached, the Job is marked as failed.
 

5. CronJob

Periodic tasks, like Linux's crontab.
Application scenarios: notifications, backups

Example:
//print hello every minute
vim cronjob.yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            imagePullPolicy: IfNotPresent
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure

//Other parameters available in a CronJob spec:

  concurrencyPolicy: Allow #how to treat overlapping executions created by this CronJob (concurrency rules apply only to tasks created by the same CronJob). The spec may declare only one of the following rules:
     ●Allow (default): the CronJob allows tasks to run concurrently.
     ●Forbid: the CronJob does not allow concurrent execution; if a new task is due but the old one has not finished, the new task is skipped.
     ●Replace: if a new task is due but the old one has not finished, the CronJob replaces the currently running task with the new one.
  startingDeadlineSeconds: 15 #the deadline, in seconds, for starting a task that has missed its scheduled time for some reason. Past the deadline the CronJob does not start the task and marks it failed. If unset, tasks have no deadline.
  successfulJobsHistoryLimit: 3 #how many successfully completed tasks to keep (default 3)
  failedJobsHistoryLimit: 1 #how many completed-but-failed tasks to keep (default 1)
  suspend: true #if true, subsequent executions are suspended; this has no effect on executions that have already started. Default false.
  schedule: '*/1 * * * *' #required field: the schedule. In this example the job runs once per minute
  jobTemplate: #required field: the job template, in the same format as a Job spec
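A sketch combining these optional fields into one spec (values are illustrative; hello-tuned is a made-up name, and the jobTemplate is abbreviated to the same hello container as above):

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello-tuned
spec:
  schedule: "*/1 * * * *"
  concurrencyPolicy: Forbid          #skip a run if the previous one is still going
  startingDeadlineSeconds: 15
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  suspend: false
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args: ["/bin/sh", "-c", "date; echo Hello from the Kubernetes cluster"]
          restartPolicy: OnFailure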
 
 
kubectl create -f cronjob.yaml 
 
kubectl get cronjob
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
hello */1 * * * * False 0 <none> 25s
 
kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-1621587180-mffj6 0/1 Completed 0 3m
hello-1621587240-g68w4 0/1 Completed 0 2m
hello-1621587300-vmkqg 0/1 Completed 0 60s
 
kubectl logs hello-1621587180-mffj6
Fri May 21 09:03:14 UTC 2021
Hello from the Kubernetes cluster
//If an error is reported: Error from server (Forbidden): Forbidden (user=system:anonymous, verb=get, resource=nodes, subresource=proxy) (pods/log hello-1621587780-c7v54)
//Solution: bind the cluster-admin role
kubectl create clusterrolebinding system:anonymous --clusterrole=cluster-admin --user=system:anonymous

Summary

How many types of Pod controllers are there?

Deployment
1. Deploys stateless applications; manages Pods through ReplicaSets
2. Supports rolling updates, rollbacks, and declarative configuration

StatefulSet
1. Deploys stateful applications; each Pod has a unique, fixed name and its own dedicated persistent storage (PVCs from volumeClaimTemplates binding PVs)
2. Requires an associated Headless Service for stable network identity

DaemonSet
1. Creates one Pod of the same type on every Node of the K8S cluster (including Nodes added later)
2. Is affected by Node taints and the unschedulable setting of cordon; you can set tolerations to ignore taints in the Pod configuration, and set uncordon to remove the unschedulable mark
3. Does not need a replica count

Job
1. Deploys resources for one-off tasks
2. After the task completes normally, the Pod container exits immediately and is not restarted (the restartPolicy of a Job-type Pod container is usually set to Never), and the Pod is not rebuilt
3. If the task ends abnormally and the Pod container exits abnormally, the Pod retries the task; the number of retries follows the backoffLimit configuration (6 times by default)

CronJob
1. Deploys periodic tasks on a cron schedule (minute hour day month week)
