Completely Delete K8s Cluster

Today I spent some time investigating how to remove nodes from a k8s cluster built by kubeadm.

For example, I have a 3-node cluster called k8stest. The application is deployed in namespace test-1, and each worker node (k8stest2 and k8stest3) holds some pods:

kubectl get pods -n test-1 -o wide

NAME                                     READY   STATUS    RESTARTS   AGE     IP            NODE                    NOMINATED NODE   READINESS GATES
is-en-conductor-0                        1/1     Running   0          5h40m   192.168.1.2   k8stest3.fyre.ibm.com   <none>           <none>
is-engine-compute-0                      1/1     Running   0          5h39m   192.168.1.3   k8stest3.fyre.ibm.com   <none>           <none>
is-engine-compute-1                      1/1     Running   0          5h38m   192.168.2.4   k8stest2.fyre.ibm.com   <none>           <none>
is-servicesdocker-pod-7b4d9d5c48-vvfn6   1/1     Running   0          5h41m   192.168.2.3   k8stest2.fyre.ibm.com   <none>           <none>
is-xmetadocker-pod-5ff59fff46-tkmqn      1/1     Running   0          5h42m   192.168.2.2   k8stest2.fyre.ibm.com   <none>           <none>
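
To see which pods are bound to a particular node before draining it, you can also filter by node name; for example, this should list both the application pods and the kube-system DaemonSet pods running on k8stest2:

kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=k8stest2.fyre.ibm.com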

Drain and delete worker nodes

You can use kubectl drain to safely evict all of your pods from a node before you perform maintenance on the node (e.g. kernel upgrade, hardware maintenance, etc.). Safe evictions allow the pod’s containers to gracefully terminate and will respect the PodDisruptionBudgets you have specified.

The drain evicts or deletes all pods except mirror pods (which cannot be deleted through the API server). If there are DaemonSet-managed pods, drain will not proceed without --ignore-daemonsets, and regardless it will not delete any DaemonSet-managed pods, because those pods would be immediately replaced by the DaemonSet controller, which ignores unschedulable markings. If there are any pods that are neither mirror pods nor managed by ReplicationController, ReplicaSet, DaemonSet, StatefulSet or Job, then drain will not delete any pods unless you use --force. --force will also allow deletion to proceed if the managing resource of one or more pods is missing.
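
Because drain respects PodDisruptionBudgets, it can be worth checking whether any are defined before draining; if evicting a pod would violate a budget, drain keeps retrying the eviction:

kubectl get poddisruptionbudgets --all-namespaces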

Let's first drain k8stest2:

kubectl drain k8stest2.fyre.ibm.com --delete-local-data --force --ignore-daemonsets

node/k8stest2.fyre.ibm.com cordoned
WARNING: Ignoring DaemonSet-managed pods: calico-node-txjpn, kube-proxy-52njn
pod/is-engine-compute-1 evicted
pod/is-xmetadocker-pod-5ff59fff46-tkmqn evicted
pod/is-servicesdocker-pod-7b4d9d5c48-vvfn6 evicted
node/k8stest2.fyre.ibm.com evicted

When kubectl drain returns successfully, that indicates that all of the pods (except the ones excluded as described in the previous paragraph) have been safely evicted (respecting the desired graceful termination period, and without violating any application-level disruption SLOs). It is then safe to bring down the node by powering down its physical machine or, if running on a cloud platform, deleting its virtual machine.
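
If you drain a node and then change your mind (for example, the maintenance is cancelled), the node can be made schedulable again:

kubectl uncordon k8stest2.fyre.ibm.com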

Let's ssh to the k8stest2 node and see what happened there; the workload containers are gone, and only the DaemonSet-managed ones remain:

ssh k8stest2.fyre.ibm.com
docker ps

CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS              PORTS               NAMES
0fbbb64d93d0        fa6f35a1c14d           "/install-cni.sh"        6 hours ago         Up 6 hours                              k8s_install-cni_calico-node-txjpn_kube-system_4b916269-3d49-11e9-b6b3-00163e01eecc_0
b78013d4f454        427a0694c75c           "start_runit"            6 hours ago         Up 6 hours                              k8s_calico-node_calico-node-txjpn_kube-system_4b916269-3d49-11e9-b6b3-00163e01eecc_0
c6aaf7cbf713        01cfa56edcfc           "/usr/local/bin/kube..."   6 hours ago         Up 6 hours                              k8s_kube-proxy_kube-proxy-52njn_kube-system_4b944a11-3d49-11e9-b6b3-00163e01eecc_0
542bc4662ee4        k8s.gcr.io/pause:3.1   "/pause"                 6 hours ago         Up 6 hours                              k8s_POD_calico-node-txjpn_kube-system_4b916269-3d49-11e9-b6b3-00163e01eecc_0
86ee508f0aa1        k8s.gcr.io/pause:3.1   "/pause"                 6 hours ago         Up 6 hours                              k8s_POD_kube-proxy-52njn_kube-system_4b944a11-3d49-11e9-b6b3-00163e01eecc_0
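
The containers that remain belong to calico-node and kube-proxy, which are DaemonSet-managed pods; you can confirm this from the master:

kubectl get daemonsets -n kube-system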

The drained node is marked unschedulable, which prevents new pods from being scheduled onto it.

kubectl get nodes

NAME                    STATUS                     ROLES    AGE     VERSION
k8stest1.fyre.ibm.com   Ready                      master   6h11m   v1.13.2
k8stest2.fyre.ibm.com   Ready,SchedulingDisabled   <none>   5h57m   v1.13.2
k8stest3.fyre.ibm.com   Ready                      <none>   5h57m   v1.13.2
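
The SchedulingDisabled status reflects the unschedulable flag that cordoning sets on the node object; it can be checked directly, and should print true for the drained node:

kubectl get node k8stest2.fyre.ibm.com -o jsonpath='{.spec.unschedulable}'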

Because the dedicated node k8stest2 was drained, the is-servicesdocker and is-xmetadocker pods stay pending:

kubectl get pods -n test-1 -o wide

NAME                                     READY   STATUS    RESTARTS   AGE     IP            NODE                    NOMINATED NODE   READINESS GATES
is-en-conductor-0                        1/1     Running   0          6h3m    192.168.1.2   k8stest3.fyre.ibm.com   <none>           <none>
is-engine-compute-0                      1/1     Running   0          6h2m    192.168.1.3   k8stest3.fyre.ibm.com   <none>           <none>
is-engine-compute-1                      1/1     Running   0          9m26s   192.168.1.4   k8stest3.fyre.ibm.com   <none>           <none>
is-servicesdocker-pod-7b4d9d5c48-vz7x4   0/1     Pending   0          9m39s   <none>        <none>                  <none>           <none>
is-xmetadocker-pod-5ff59fff46-m4xj2      0/1     Pending   0          9m39s   <none>        <none>                  <none>           <none>
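
To confirm why they are stuck, describe one of the pending pods (name taken from the listing above) and look at the scheduler events, which should show that the only node these pods are allowed to run on is no longer schedulable:

kubectl describe pod is-servicesdocker-pod-7b4d9d5c48-vz7x4 -n test-1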

Now it's safe to delete the node:

kubectl delete node k8stest2.fyre.ibm.com

node "k8stest2.fyre.ibm.com" deleted
kubectl get nodes

NAME                    STATUS   ROLES    AGE     VERSION
k8stest1.fyre.ibm.com   Ready    master   6h22m   v1.13.2
k8stest3.fyre.ibm.com   Ready    <none>   6h8m    v1.13.2

Repeat the steps above for worker node k8stest3, and then only the master node survives:

kubectl get nodes

NAME                    STATUS   ROLES    AGE     VERSION
k8stest1.fyre.ibm.com   Ready    master   6h25m   v1.13.2
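
For clusters with more workers, the drain-and-delete steps can be scripted; here is a minimal sketch that removes every node except the master (review it before running, since the node list comes straight from kubectl get nodes):

for node in $(kubectl get nodes --no-headers | grep -v master | awk '{print $1}'); do
  kubectl drain "$node" --delete-local-data --force --ignore-daemonsets
  kubectl delete node "$node"
done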

Drain master node

It's time to deal with the master node:

kubectl drain k8stest1.fyre.ibm.com --delete-local-data --force --ignore-daemonsets

node/k8stest1.fyre.ibm.com cordoned
WARNING: Ignoring DaemonSet-managed pods: calico-node-vlqh5, kube-proxy-5tfgr
pod/docker-registry-85577757d5-952wq evicted
pod/coredns-86c58d9df4-kwjr8 evicted
pod/coredns-86c58d9df4-4p7g2 evicted
node/k8stest1.fyre.ibm.com evicted

Let's see what happened to the infrastructure pods; some of them are gone:

kubectl get pods -n kube-system

NAME                                            READY   STATUS    RESTARTS   AGE
calico-node-vlqh5                               2/2     Running   0          6h31m
coredns-86c58d9df4-5ctw2                        0/1     Pending   0          2m15s
coredns-86c58d9df4-mg8rf                        0/1     Pending   0          2m15s
etcd-k8stest1.fyre.ibm.com                      1/1     Running   0          6h31m
kube-apiserver-k8stest1.fyre.ibm.com            1/1     Running   0          6h31m
kube-controller-manager-k8stest1.fyre.ibm.com   1/1     Running   0          6h30m
kube-proxy-5tfgr                                1/1     Running   0          6h31m
kube-scheduler-k8stest1.fyre.ibm.com            1/1     Running   0          6h31m
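
etcd, kube-apiserver, kube-controller-manager and kube-scheduler are still running because they are static pods managed directly by the kubelet (the mirror pods mentioned earlier, which drain cannot evict through the API server). On a kubeadm master their manifests normally sit under /etc/kubernetes/manifests, which you can check on k8stest1:

ls /etc/kubernetes/manifests/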

Note that we do not run kubectl delete node on the master; the whole cluster will be reset instead.

Reset cluster

Run this on every node to revert any changes made by kubeadm init or kubeadm join:

kubeadm reset -f

All containers are gone; also check whether kubectl still works:

docker ps

CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
kubectl get nodes

The connection to the server 9.30.219.224:6443 was refused - did you specify the right host or port?
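
One thing kubeadm reset does not do is clean up iptables or IPVS rules left behind by kube-proxy and the CNI plugin; its output reminds you to do that manually if needed, roughly like this:

iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
## only if kube-proxy ran in IPVS mode (requires ipvsadm)
ipvsadm --clear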

Delete RPMs and files

Finally, we need to delete the RPMs and remove the residual files:

yum erase -y kubeadm.x86_64 kubectl.x86_64 kubelet.x86_64 kubernetes-cni.x86_64 cri-tools socat
## calico
/bin/rm -rf /opt/cni/bin/*
/bin/rm -rf /var/lib/calico
/bin/rm -rf /run/calico
## config
/bin/rm -rf /root/.kube
## etcd
/bin/rm -rf /var/lib/etcd/*
## kubernetes
/bin/rm -rf /etc/kubernetes/
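
Depending on the setup, a few more directories may still hold residue; these are typical default locations, so verify them on your own nodes (and make sure no pod volumes are still mounted under /var/lib/kubelet) before removing anything:

## kubelet
/bin/rm -rf /var/lib/kubelet
## cni
/bin/rm -rf /var/lib/cni
/bin/rm -rf /etc/cni/net.d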

Reposted from www.cnblogs.com/chengdol/p/10465884.html