This article is based on Kubernetes v1.18.0. Reference: https://cloud.tencent.com/developer/article/1552452.
1. Environment description

Hostname | IP | OS version | Docker version |
---|---|---|---|
master | 192.168.148.124 | CentOS 7.6.1810 | 19.03.9 |
node01 | 192.168.148.125 | CentOS 7.6.1810 | 19.03.9 |
node02 | 192.168.148.126 | CentOS 7.6.1810 | 19.03.9 |
2. Background
When a node needs to be taken down for maintenance, such as patching or an operating system upgrade, the pods running on it must first be evicted and migrated elsewhere. This article walks through the node maintenance process in detail.
3. PDB introduction
- PDB is short for PodDisruptionBudget, which protects pods against voluntary (active) evictions. Without a PDB, draining a node that hosts several pods of the same service can cause a service interruption or degradation. For example, suppose a service has 5 pods and needs at least 3 of them running to maintain quality of service; with fewer, responses slow down. If 4 of those pods happen to be on node01 and node01 is shut down for maintenance, only one pod remains serving traffic while the other 4 are being migrated, and the service's responsiveness suffers.
- A PDB guarantees that no fewer than a specified number of pods stay running during node maintenance, thereby preserving quality of service.
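Besides `minAvailable`, a PDB can instead cap the number of pods that may be down at once with `maxUnavailable`. A minimal sketch (the name `pdb-nginx-max` is illustrative, not from this article):

```yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: pdb-nginx-max        # illustrative name
spec:
  maxUnavailable: 1          # at most 1 matching pod may be disrupted at a time
  selector:
    matchLabels:
      app: nginx1.18
```

Use one of `minAvailable` or `maxUnavailable`, not both, in a single PDB.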
Preparation
1. Create a new pod

```shell
cd /root/pkg/eg/
kubectl create deployment nginx1.18 --image=nginx:1.18 --dry-run=client -o yaml > nginx.yml
cat nginx.yml
```
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx1.18
  name: nginx1.18
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx1.18
  strategy: {}
  template:
    metadata:
      labels:
        app: nginx1.18
    spec:
      containers:
      - image: nginx:1.18
        name: nginx
```

```shell
kubectl apply -f nginx.yml
```
The five pods are distributed across node01 and node02.
2. Create a new pdb

```shell
cd /root/pkg/eg
cat pdb-nginx.yaml
```
```yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: pdb-nginx
spec:
  minAvailable: 4
  selector:
    matchLabels:
      app: nginx1.18
```
Create pdb-nginx.yaml as shown above: its label selector matches the deployment's label `app: nginx1.18`, and `minAvailable: 4` means at least 4 nginx pods must stay alive. Apply it with `kubectl apply -f pdb-nginx.yaml`.
4. Node maintenance
This article uses the maintenance of node02 as an example.
1. Set the node to be unschedulable

```shell
kubectl cordon node02
```

After cordoning node02, check the node list: node02 shows SchedulingDisabled. From this point on, the scheduler will not place new pods on this node, but the pods already running on node02 continue to run normally.
2. Evict pods on the node

```shell
kubectl drain node02 --delete-local-data --ignore-daemonsets --force
```
Parameter description:
- --delete-local-data: delete the pod even if it uses emptyDir local storage;
- --ignore-daemonsets: ignore pods managed by a DaemonSet controller. If they were not ignored, a DaemonSet pod deleted from this node would immediately be recreated on the same node, resulting in an endless loop;
- --force: without this flag, drain only deletes pods created by a ReplicationController, ReplicaSet, DaemonSet, StatefulSet, or Job; with it, "naked" pods (not bound to any controller) are deleted as well.
Watching the eviction, only one pod is migrated at a time, so the nginx service always has 4 pods serving traffic; this confirms that the PDB is being honored.
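The interaction above can be sketched with a small simulation (this is not Kubernetes source code; the function names and structure are illustrative). Drain evicts pods one at a time, and the eviction API only permits a disruption while the number of healthy pods exceeds `minAvailable`; here we assume each evicted pod is rescheduled and becomes healthy again before the next eviction:

```python
# Toy model of how `kubectl drain` interacts with a PodDisruptionBudget.
# Names are illustrative; real Kubernetes tracks this in the PDB status.

def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """How many voluntary disruptions the budget permits right now."""
    return max(0, healthy_pods - min_available)

def drain(pods_on_node: int, healthy_total: int, min_available: int):
    """Evict pods from a node one at a time, respecting the budget.

    Assumes each evicted pod is rescheduled elsewhere and becomes
    healthy again before the next eviction is attempted.
    """
    healthy_during_migration = []
    while pods_on_node > 0:
        if allowed_disruptions(healthy_total, min_available) < 1:
            break  # eviction API would refuse (HTTP 429); drain retries later
        pods_on_node -= 1
        healthy_total -= 1                            # pod goes down...
        healthy_during_migration.append(healthy_total)
        healthy_total += 1                            # ...replacement becomes ready
    return healthy_during_migration

# 5 nginx pods total, 2 of them on node02, PDB minAvailable=4:
print(drain(pods_on_node=2, healthy_total=5, min_available=4))  # → [4, 4]
```

At every step at least 4 pods stayed healthy, matching what the article observes during the drain of node02.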
3. End of maintenance

```shell
kubectl uncordon node02
```

After maintenance is finished, mark node02 as schedulable again.
5. Pod relocation
There is no built-in command to move a pod back to a specific node; here we simply delete the pods and let the scheduler recreate them. (On this version, `kubectl rollout restart deployment nginx1.18` should also trigger recreation of all the deployment's pods.)
```shell
kubectl get po -o wide
kubectl delete pod nginx1.18-7646b89d65-7klpv nginx1.18-7646b89d65-ddn9p
```
Delete nginx1.18-7646b89d65-7klpv and nginx1.18-7646b89d65-ddn9p during an off-peak period. Because node02's pods were evicted earlier, node02 now has the lowest resource utilization, so when the pods are recreated the scheduler places them on node02, completing the move back.
6. Node deletion
1. Delete the node
A node may need to be removed during day-to-day operations. This article again uses node02 as an example.
```shell
kubectl cordon node02
kubectl drain node02 --delete-local-data --ignore-daemonsets --force
kubectl delete node node02
```

Then, on node02 itself, reset the kubeadm state:

```shell
kubeadm reset
```
2. Rejoin the node
Run on the master node:

```shell
kubeadm token create --print-join-command
```

Run on node02:

```shell
modprobe br_netfilter
kubeadm join 192.168.148.124:6443 --token 4r4c2u.29r7tm05i6h9nfyv --discovery-token-ca-cert-hash sha256:af6e4d737cbd7e294036d7391a5931fba589942e777811bb6f74b77ccbda3cfc
```
View the nodes:

```shell
kubectl get node
```