KubeKey Practical Guide to Upgrading Kubernetes Minor Versions

Author: Operation and Maintenance Skills

Preface

Knowledge points

  • Rating: entry level
  • How KubeKey upgrades Kubernetes minor versions
  • Kubernetes upgrade preparation and verification
  • Frequently Asked Questions about KubeKey Upgrading Kubernetes

Actual server configuration (architecture 1:1 replica of a small-scale production environment, configuration is slightly different)

CPU name IP CPU Memory system disk data disk use
k8s-master-1 192.168.9.91 4 16 40 100 KubeSphere/k8s-master
k8s-master-2 192.168.9.92 4 16 40 100 KubeSphere/k8s-master
k8s-master-3 192.168.9.93 4 16 40 100 KubeSphere/k8s-master
k8s-worker-1 192.168.9.95 8 16 40 100 k8s-worker/CI
k8s-worker-2 192.168.9.96 8 16 40 100 k8s-worker
k8s-worker-3 192.168.9.97 8 16 40 100 k8s-worker
k8s-storage-1 192.168.9.81 4 16 40 100/100/100/100/100 ElasticSearch/GlusterFS/Ceph-Rook/Longhorn/NFS/
k8s-storage-2 192.168.9.82 4 16 40 100/100/100/100 ElasticSearch/GlusterFS/Ceph-Rook/Longhorn/
k8s-storage-3 192.168.9.83 4 16 40 100/100/100/100 ElasticSearch/GlusterFS/Ceph-Rook/Longhorn/
registry 192.168.9.80 4 8 40 100 Sonatype Nexus 3
total 10 52 152 400 2000

The actual combat environment involves software version information

  • Operating system: CentOS 7.9 x86_64
  • KubeSphere: v3.4.1
  • Kubernetes:v1.24.14 to v1.26.5
  • Containerd:1.6.4
  • KubeKey: v3.0.13

1 Introduction

In the last issue, we completed the actual patch version upgrade of KubeSphere and Kubernetes . In this issue, we will learn how to use KubeKey to implement Kubernetes minor version upgrade.

This issue does not cover minor version upgrades of KubeSphere. Readers in need can refer to the official upgrade documentation for the corresponding version .

Regarding upgrading across minor versions, my personal considerations are as follows:

  • Whether it is KubeSphere or Kubernetes, do not upgrade unless necessary (the risk is high and there are too many uncontrollable factors. Even if sufficient verification testing is done in advance, who can guarantee that there will be no accidents in production upgrades?)
  • Kubernetes should not be upgraded in place unless necessary. It is recommended to adopt the solution of "building a new version of the cluster + migrating business applications" or a blue-green upgrade.
  • In-situ upgrades must be limited to two minor versions as much as possible, and sufficient research and verification must be done (such as resource compatibility caused by different versions, different APIs, explosion radius of failed upgrades, etc.)
  • The cross-version upgrade of KubeSphere is more complicated. The more additional plug-ins are enabled, the more components and intermediate prices are involved, and the more verification points need to be considered for the upgrade.

KubeKey supports two upgrade scenarios : All-in-One cluster and multi-node cluster. This article only demonstrates the upgrade scenario of multi-node cluster. For All-in-One cluster, please refer to the official upgrade guide .

The KubeSphere and Kubernetes minor version upgrade process is the same as the patch version upgrade process, so I won’t describe it in detail here. Please see above for details.

2. Upgrade the prerequisites for actual combat

In the last issue, we completed the upgrade from Kubernetes v1.24.12 to v1.24.14. In this issue, we will upgrade Kubernetes v1.24.14 to v1.26.5 based on the same environment.

At the same time, in order to simulate real business scenarios, we continue to create some test resources. Before verification, we also need to record some key information about the current cluster.

2.1 Cluster environment

  • KubeSphere v3.4.1, with most plugins enabled
  • Install v1.24.14 Kubernetes cluster
  • Connect to NFS storage or other storage as persistent storage (this article uses NFS for the test environment)

2.2 View current cluster environment information

The current cluster environment information viewed below is not sufficient. It only looks at a few representative resources. There must be components and information that have been ignored.

  • View all node information
[root@k8s-master-1 ~]# kubectl get nodes -o wide
NAME           STATUS   ROLES                  AGE     VERSION    INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
k8s-master-1   Ready    control-plane          6d21h   v1.24.14   192.168.9.91   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
k8s-master-2   Ready    control-plane,worker   6d21h   v1.24.14   192.168.9.92   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
k8s-master-3   Ready    control-plane,worker   6d21h   v1.24.14   192.168.9.93   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
k8s-worker-1   Ready    worker                 6d21h   v1.24.14   192.168.9.95   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
k8s-worker-2   Ready    worker                 6d21h   v1.24.14   192.168.9.96   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
k8s-worker-3   Ready    worker                 6d19h   v1.24.14   192.168.9.97   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
  • View the images used by all Deployments to facilitate comparison after upgrade ( only for records, of little practical significance, the results do not include Kubernetes core components )
# 受限于篇幅,输出结果略,请自己保存结果
kubectl get deploy -A -o wide
  • View Kubernetes resources ( limited by space, pod results are not shown )
[root@k8s-master-1 ~]# kubectl get pods,deployment,sts,ds -o wide -n kube-system
NAME                                          READY   UP-TO-DATE   AVAILABLE   AGE     CONTAINERS                     IMAGES                                                                    SELECTOR
deployment.apps/calico-kube-controllers       1/1     1            1           6d21h   calico-kube-controllers        registry.cn-beijing.aliyuncs.com/kubesphereio/kube-controllers:v3.26.1    k8s-app=calico-kube-controllers
deployment.apps/coredns                       2/2     2            2           6d21h   coredns                        registry.cn-beijing.aliyuncs.com/kubesphereio/coredns:1.8.6               k8s-app=kube-dns
deployment.apps/metrics-server                1/1     1            1           6d21h   metrics-server                 registry.cn-beijing.aliyuncs.com/kubesphereio/metrics-server:v0.4.2       k8s-app=metrics-server
deployment.apps/openebs-localpv-provisioner   1/1     1            1           6d21h   openebs-provisioner-hostpath   registry.cn-beijing.aliyuncs.com/kubesphereio/provisioner-localpv:3.3.0   name=openebs-localpv-provisioner,openebs.io/component-name=openebs-localpv-provisioner

NAME                                   READY   AGE     CONTAINERS            IMAGES
statefulset.apps/snapshot-controller   1/1     6d21h   snapshot-controller   registry.cn-beijing.aliyuncs.com/kubesphereio/snapshot-controller:v4.0.0

NAME                          DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE     CONTAINERS    IMAGES                                                                     SELECTOR
daemonset.apps/calico-node    6         6         6       6            6           kubernetes.io/os=linux   6d21h   calico-node   registry.cn-beijing.aliyuncs.com/kubesphereio/node:v3.26.1                 k8s-app=calico-node
daemonset.apps/kube-proxy     6         6         6       6            6           kubernetes.io/os=linux   6d21h   kube-proxy    registry.cn-beijing.aliyuncs.com/kubesphereio/kube-proxy:v1.24.14          k8s-app=kube-proxy
daemonset.apps/nodelocaldns   6         6         6       6            6           <none>                   6d21h   node-cache    registry.cn-beijing.aliyuncs.com/kubesphereio/k8s-dns-node-cache:1.15.12   k8s-app=nodelocaldns
  • View the Image used by the current Master and Worker nodes
# Master
[root@k8s-master-1 ~]#  crictl images | grep v1.24.14
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-apiserver            v1.24.14            b651b48a617a5       34.3MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-controller-manager   v1.24.14            d40212fa9cf04       31.5MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-proxy                v1.24.14            e57c0d007d1ef       39.7MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-scheduler            v1.24.14            19bf7b80c50e5       15.8MB

# Worker
[root@k8s-worker-1 ~]# crictl images | grep v1.24.14
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-proxy                      v1.24.14                       e57c0d007d1ef       39.7MB
  • Check the Kubernetes core component binary files ( record and compare whether there are any upgrades )
[root@k8s-master-1 ~]# ll /usr/local/bin/
total 352448
-rwxr-xr-x 1 root root  65770992 Dec  4 15:09 calicoctl
-rwxr-xr-x 1 root root  23847904 Nov 29 13:50 etcd
-rwxr-xr-x 1 kube root  17620576 Nov 29 13:50 etcdctl
-rwxr-xr-x 1 root root  46182400 Dec  4 15:09 helm
-rwxr-xr-x 1 root root  44748800 Dec  4 15:09 kubeadm
-rwxr-xr-x 1 root root  46080000 Dec  4 15:09 kubectl
-rwxr-xr-x 1 root root 116646168 Dec  4 15:09 kubelet
drwxr-xr-x 2 kube root        71 Nov 29 13:51 kube-scripts

2.3 Create test verification resources

  • Create test namespaceupgrade-test
kubectl create ns upgrade-test
  • Create a test resource file ( use StatefulSet to quickly create the corresponding test volume ), use the image nginx:latest to create a 6-copy test business (including pvc), and each copy is distributed on a Worker node.vi nginx-test.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-nginx
  namespace: upgrade-test
spec:
  selector:
    matchLabels:
      app: nginx
  serviceName: "nginx"
  replicas: 6
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: nfs-volume
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: nfs-volume
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "nfs-sc"
      resources:
        requests:
          storage: 1Gi
  • Create test resources
kubectl apply -f nginx-test.yaml
  • Write index home page file
for id in $(seq 0 1 5);do kubectl exec -it test-nginx-$id -n upgrade-test  -- sh -c "echo I test-nginx-$id > /usr/share/nginx/html/index.html";done
  • View testing resources
# 查看 Pod(每个节点一个副本)
[root@k8s-master-1 ~]# kubectl get pods -o wide -n upgrade-test
NAME           READY   STATUS    RESTARTS   AGE     IP              NODE           NOMINATED NODE   READINESS GATES
test-nginx-0   1/1     Running   0          8m32s   10.233.80.77    k8s-master-1   <none>           <none>
test-nginx-1   1/1     Running   0          8m10s   10.233.96.33    k8s-master-3   <none>           <none>
test-nginx-2   1/1     Running   0          7m47s   10.233.85.99    k8s-master-2   <none>           <none>
test-nginx-3   1/1     Running   0          7m25s   10.233.87.104   k8s-worker-3   <none>           <none>
test-nginx-4   1/1     Running   0          7m3s    10.233.88.180   k8s-worker-1   <none>           <none>
test-nginx-5   1/1     Running   0          6m41s   10.233.74.129   k8s-worker-2   <none>           <none>

# 查看 PVC
[root@k8s-master-1 ~]# kubectl get pvc -o wide -n upgrade-test
NAME                      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE     VOLUMEMODE
nfs-volume-test-nginx-0   Bound    pvc-e7875a9c-2736-4b32-aa29-b338959bb133   1Gi        RWO            nfs-sc         8m52s   Filesystem
nfs-volume-test-nginx-1   Bound    pvc-8bd760a8-64f9-40e9-947f-917b6308c146   1Gi        RWO            nfs-sc         8m30s   Filesystem
nfs-volume-test-nginx-2   Bound    pvc-afb47509-c249-4892-91ad-da0f69e33495   1Gi        RWO            nfs-sc         8m7s    Filesystem
nfs-volume-test-nginx-3   Bound    pvc-e6cf5935-2852-4ef6-af3d-a06458bceb49   1Gi        RWO            nfs-sc         7m45s   Filesystem
nfs-volume-test-nginx-4   Bound    pvc-7d176238-8106-4cc9-b49d-fceb25242aee   1Gi        RWO            nfs-sc         7m23s   Filesystem
nfs-volume-test-nginx-5   Bound    pvc-dc35e933-0b3c-4fb9-9736-78528d26ce6f   1Gi        RWO            nfs-sc         7m1s    Filesystem

# 查看 index.html
[root@k8s-master-1 ~]# for id in $(seq 0 1 5);do kubectl exec -it test-nginx-$id -n upgrade-test  -- cat /usr/share/nginx/html/index.html;done
I test-nginx-0
I test-nginx-1
I test-nginx-2
I test-nginx-3
I test-nginx-4
I test-nginx-5

2.4 Observe cluster and business status during upgrade

The observations in this article are not necessarily comprehensive and sufficient, so you need to supplement them based on the real environment when performing actual upgrade verification tests. It is not convenient to take screenshots of the observation process, so readers are asked to observe by themselves.

  • Observe cluster node status
 watch kubectl get nodes
  • Observe the resource status of the test namespace (focus on whether RESTARTS changes)
watch kubectl get pods -o wide -n upgrade-test
  • Ping the simulated business IP (randomly find a Pod)
ping 10.233.80.77
  • Business IP simulated by curl (randomly find a Pod, different from the one tested by Ping)
watch curl 10.233.88.180
  • Observe the simulated business disk mounting situation (writing is not verified, focus on observing whether the output results have nfs storage)
watch kubectl exec -it test-nginx-3 -n upgrade-test  -- df -h

3. Download KubeKey

Before upgrading the cluster, execute the following command to download the latest version or the specified version of KubeKey.

export KKZONE=cn
curl -sfL https://get-kk.kubesphere.io | VERSION=v3.0.13 sh -

4. Generate cluster deployment configuration file

4.1 Use KubeKey to generate configuration files

Before upgrading, you need to prepare cluster deployment files. First , it is recommended to use KubeKey to deploy the configuration files used when deploying KubeSphere and Kubernetes clusters.

If the configuration used during deployment is lost, you can execute the following command to create a sample.yamlconfiguration file based on the existing cluster ( the focus of this article is demonstrated ).

./kk create config --from-cluster

Remark :

This article assumes that kubeconfig is located at ~/.kube/config. You can --kubeconfigmodify this via the flag.

The actual command execution results are as follows:

[root@k8s-master-1 kubekey]# ./kk create config --from-cluster
Notice: /root/kubekey/sample.yaml has been created. Some parameters need to be filled in by yourself, please complete it.

Generated configuration filesample.yaml

apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  ##You should complete the ssh information of the hosts
  - {name: k8s-master-1, address: 192.168.9.91, internalAddress: 192.168.9.91}
  - {name: k8s-master-2, address: 192.168.9.92, internalAddress: 192.168.9.92}
  - {name: k8s-master-3, address: 192.168.9.93, internalAddress: 192.168.9.93}
  - {name: k8s-worker-1, address: 192.168.9.95, internalAddress: 192.168.9.95}
  - {name: k8s-worker-2, address: 192.168.9.96, internalAddress: 192.168.9.96}
  - {name: k8s-worker-3, address: 192.168.9.97, internalAddress: 192.168.9.97}
  roleGroups:
    etcd:
    - SHOULD_BE_REPLACED
    master:
    worker:
    - k8s-master-1
    - k8s-master-2
    - k8s-master-3
    - k8s-worker-1
    - k8s-worker-2
    - k8s-worker-3
  controlPlaneEndpoint:
    ##Internal loadbalancer for apiservers
    #internalLoadbalancer: haproxy

    ##If the external loadbalancer was used, 'address' should be set to loadbalancer's ip.
    domain: lb.opsman.top
    address: ""
    port: 6443
  kubernetes:
    version: v1.24.14
    clusterName: opsman.top
    proxyMode: ipvs
    masqueradeAll: false
    maxPods: 110
    nodeCidrMaskSize: 24
  network:
    plugin: calico
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
  registry:
    privateRegistry: ""

4.2 Modify configuration file template

Modify the file according to the actual cluster configuration sample.yaml. Please make sure to modify the following fields correctly.

  • hosts: Basic information about your host (host name and IP address) and information about using SSH to connect to the host ( key modifications, you need to add the SSH username and password ).
  • roleGroups.etcd: etcd node ( key modifications ).
  • roleGroups.master: master node ( key modification, not generated by default, must be added manually otherwise an error will be reported, see FAQ 1 ), note: the name of this parameter field in the configuration file generated during deployment is roleGroups.control-plane.
  • roleGroups.worker:worker node ( check modification ).
  • controlPlaneEndpoint: Load balancer information ( optional ).
  • kubernetes.containerManager: Modify the container runtime ( required, not generated by default, must be added manually otherwise an error will be reported, see FAQ 2 )
  • registry: Mirror service information ( optional ).

Modified file content:

apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  ##You should complete the ssh information of the hosts
  - {name: k8s-master-1, address: 192.168.9.91, internalAddress: 192.168.9.91, user: root, password: "P@88w0rd"}
  - {name: k8s-master-2, address: 192.168.9.92, internalAddress: 192.168.9.92, user: root, password: "P@88w0rd"}
  - {name: k8s-master-3, address: 192.168.9.93, internalAddress: 192.168.9.93, user: root, password: "P@88w0rd"}
  - {name: k8s-worker-1, address: 192.168.9.95, internalAddress: 192.168.9.95, user: root, password: "P@88w0rd"}
  - {name: k8s-worker-2, address: 192.168.9.96, internalAddress: 192.168.9.96, user: root, password: "P@88w0rd"}
  - {name: k8s-worker-3, address: 192.168.9.97, internalAddress: 192.168.9.97, user: root, password: "P@88w0rd"}
  roleGroups:
    etcd:
    - k8s-master-1
    - k8s-master-2
    - k8s-master-3
    master:
    - k8s-master-1
    - k8s-master-2
    - k8s-master-3
    worker:
    - k8s-master-1
    - k8s-master-2
    - k8s-master-3
    - k8s-worker-1
    - k8s-worker-2
    - k8s-worker-3
  controlPlaneEndpoint:
    ##Internal loadbalancer for apiservers
    internalLoadbalancer: haproxy

    ##If the external loadbalancer was used, 'address' should be set to loadbalancer's ip.
    domain: lb.opsman.top
    address: ""
    port: 6443
  kubernetes:
    version: v1.24.12
    clusterName: opsman.top
    proxyMode: ipvs
    masqueradeAll: false
    maxPods: 110
    nodeCidrMaskSize: 24
    containerManager: containerd
  network:
    plugin: calico
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
  registry:
    privateRegistry: ""

5. Upgrade Kubernetes

5.1 Upgrade Kubernetes

Execute the following command to upgrade Kubernetes from v1.24.14 to v1.26.5 .

export KKZONE=cn
./kk upgrade --with-kubernetes v1.26.5 -f sample.yaml

The results after execution are as follows ( enter as prompted yesto continue ):

[root@k8s-master-1 kubekey]# ./kk upgrade --with-kubernetes v1.26.5 -f sample.yaml


 _   __      _          _   __
| | / /     | |        | | / /
| |/ / _   _| |__   ___| |/ /  ___ _   _
|    \| | | | '_ \ / _ \    \ / _ \ | | |
| |\  \ |_| | |_) |  __/ |\  \  __/ |_| |
\_| \_/\__,_|_.__/ \___\_| \_/\___|\__, |
                                    __/ |
                                   |___/

09:26:43 CST [GreetingsModule] Greetings
09:26:43 CST message: [k8s-worker-3]
Greetings, KubeKey!
09:26:44 CST message: [k8s-master-3]
Greetings, KubeKey!
09:26:44 CST message: [k8s-master-1]
Greetings, KubeKey!
09:26:44 CST message: [k8s-worker-1]
Greetings, KubeKey!
09:26:44 CST message: [k8s-master-2]
Greetings, KubeKey!
09:26:44 CST message: [k8s-worker-2]
Greetings, KubeKey!
09:26:44 CST success: [k8s-worker-3]
09:26:44 CST success: [k8s-master-3]
09:26:44 CST success: [k8s-master-1]
09:26:44 CST success: [k8s-worker-1]
09:26:44 CST success: [k8s-master-2]
09:26:44 CST success: [k8s-worker-2]
09:26:44 CST [NodePreCheckModule] A pre-check on nodes
09:26:45 CST success: [k8s-worker-2]
09:26:45 CST success: [k8s-worker-3]
09:26:45 CST success: [k8s-master-2]
09:26:45 CST success: [k8s-worker-1]
09:26:45 CST success: [k8s-master-3]
09:26:45 CST success: [k8s-master-1]
09:26:45 CST [ClusterPreCheckModule] Get KubeConfig file
09:26:45 CST skipped: [k8s-master-3]
09:26:45 CST skipped: [k8s-master-2]
09:26:45 CST success: [k8s-master-1]
09:26:45 CST [ClusterPreCheckModule] Get all nodes Kubernetes version
09:26:45 CST success: [k8s-worker-2]
09:26:45 CST success: [k8s-worker-3]
09:26:45 CST success: [k8s-worker-1]
09:26:45 CST success: [k8s-master-2]
09:26:45 CST success: [k8s-master-3]
09:26:45 CST success: [k8s-master-1]
09:26:45 CST [ClusterPreCheckModule] Calculate min Kubernetes version
09:26:45 CST skipped: [k8s-master-3]
09:26:45 CST success: [k8s-master-1]
09:26:45 CST skipped: [k8s-master-2]
09:26:45 CST [ClusterPreCheckModule] Check desired Kubernetes version
09:26:45 CST skipped: [k8s-master-3]
09:26:45 CST success: [k8s-master-1]
09:26:45 CST skipped: [k8s-master-2]
09:26:45 CST [ClusterPreCheckModule] Check KubeSphere version
09:26:45 CST skipped: [k8s-master-3]
09:26:45 CST skipped: [k8s-master-2]
09:26:45 CST success: [k8s-master-1]
09:26:45 CST [ClusterPreCheckModule] Check dependency matrix for KubeSphere and Kubernetes
09:26:45 CST skipped: [k8s-master-3]
09:26:45 CST skipped: [k8s-master-2]
09:26:45 CST success: [k8s-master-1]
09:26:45 CST [ClusterPreCheckModule] Get kubernetes nodes status
09:26:45 CST skipped: [k8s-master-3]
09:26:45 CST skipped: [k8s-master-2]
09:26:45 CST success: [k8s-master-1]
09:26:45 CST [UpgradeConfirmModule] Display confirmation form
+--------------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+
| name         | sudo | curl | openssl | ebtables | socat | ipset | ipvsadm | conntrack | chrony | docker | containerd | nfs client | ceph client | glusterfs client | time         |
+--------------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+
| k8s-master-1 | y    | y    | y       | y        | y     | y     | y       | y         | y      |        | v1.6.4     | y          |             |                  | CST 09:26:45 |
| k8s-master-2 | y    | y    | y       | y        | y     | y     | y       | y         | y      |        | v1.6.4     | y          |             |                  | CST 09:26:45 |
| k8s-master-3 | y    | y    | y       | y        | y     | y     | y       | y         | y      |        | v1.6.4     | y          |             |                  | CST 09:26:45 |
| k8s-worker-1 | y    | y    | y       | y        | y     | y     | y       | y         | y      |        | v1.6.4     | y          |             |                  | CST 09:26:45 |
| k8s-worker-2 | y    | y    | y       | y        | y     | y     | y       | y         | y      |        | v1.6.4     | y          |             |                  | CST 09:26:45 |
| k8s-worker-3 | y    | y    | y       | y        | y     | y     | y       | y         | y      |        | v1.6.4     | y          |             |                  | CST 09:26:45 |
+--------------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+

Cluster nodes status:
NAME           STATUS   ROLES                  AGE     VERSION    INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
k8s-master-1   Ready    control-plane          6d21h   v1.24.14   192.168.9.91   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
k8s-master-2   Ready    control-plane,worker   6d21h   v1.24.14   192.168.9.92   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
k8s-master-3   Ready    control-plane,worker   6d21h   v1.24.14   192.168.9.93   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
k8s-worker-1   Ready    worker                 6d21h   v1.24.14   192.168.9.95   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
k8s-worker-2   Ready    worker                 6d21h   v1.24.14   192.168.9.96   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
k8s-worker-3   Ready    worker                 6d19h   v1.24.14   192.168.9.97   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4

Upgrade Confirmation:
kubernetes version: v1.24.14 to v1.26.5

Continue upgrading cluster? [yes/no]:

Note: The Upgrade information is confirmed, and Kubernetes prompts to upgrade from v1.24.14 to v1.26.5.

After clicking "yes", the final execution result of the abridged version is as follows:

09:47:18 CST [ProgressiveUpgradeModule 2/2] Set current k8s version
09:47:18 CST skipped: [LocalHost]
09:47:18 CST [ChownModule] Chown user $HOME/.kube dir
09:47:19 CST success: [k8s-worker-3]
09:47:19 CST success: [k8s-worker-1]
09:47:19 CST success: [k8s-worker-2]
09:47:19 CST success: [k8s-master-1]
09:47:19 CST success: [k8s-master-2]
09:47:19 CST success: [k8s-master-3]
09:47:19 CST Pipeline[UpgradeClusterPipeline] execute successfully

Note: Due to space limitations, I cannot paste the complete execution results. However, I strongly recommend that readers save and carefully analyze the upgrade process log in order to have a deeper understanding of the Kubekey upgrade process for Kubernetes. These log files provide important information about events that occurred during the upgrade process and can help you better understand the entire upgrade process and problems you may encounter, and troubleshoot if needed.

5.2 Observation instructions for the upgrade process

When observing task output results during the upgrade process, there are a few points to note:

  • The Master and Worker nodes will be upgraded one by one (Master takes about 2 minutes, Worker takes about 1 minute). During the upgrade process, when the Master node executes the kubectl command, the API cannot be connected.
[root@k8s-master-1 ~]# kubectl get nodes
The connection to the server lb.opsman.top:6443 was refused - did you specify the right host or port?

illustrate:

The occurrence of this phenomenon does not mean that the Kubernetes API is not highly available. In fact, it is pseudo-highly available.

Mainly because the built-in load balancing HAProxy deployed by KubeKey only acts on the Worker node, and the Master node will only connect to the local kube-apiserver ( therefore, it also shows that it is better to build a self-built load balancing if possible ).

# Master 节点
[root@k8s-master-1 ~]# ss -ntlup | grep 6443
tcp    LISTEN     0      32768  [::]:6443               [::]:*                   users:(("kube-apiserver",pid=11789,fd=7))


# Worker 节点
[root@k8s-worker-1 ~]# ss -ntlup | grep 6443
tcp    LISTEN     0      4000   127.0.0.1:6443                  *:*                   users:(("haproxy",pid=53535,fd=7))
  • The tested Nginx business service was not interrupted ( no abnormalities were found in ping, curl, and df )
  • Kubernetes core component upgrade sequence: first upgrade to v1.25.10 , then upgrade to v1.26.5 ( in line with the requirements of Kubekey design and Kubernetes minor version upgrade differences )
  • The kube-apiserver, kube-controller-manager, kube-proxy, and kube-scheduler images were upgraded from v1.24.14 to v1.25.10 and then to v1.26.5 .

List of binary software downloaded when upgrading to v1.25.10** (the description in bold has changed)**:

  • kubeadm v1.25.10

  • kubelet v1.25.10

  • kubectl v1.25.10

  • helm v3.9.0

  • cube v1.2.0

  • crictl v1.24.0

  • etcd v3.4.13

  • containerd 1.6.4

  • run v1.1.1

  • calicoctl v3.26.1

List of binary software downloaded when upgrading to v1.26.5** (the description in bold has changed)**:

  • kubeadm v1.26.5

  • kubelet v1.26.5

  • kubectl v1.26.5

  • Nothing else changed

List of images involved in the upgrade process** (the descriptions in bold have changed)**:

  • calico/cni:v3.26.1
  • calico/kube-controllers:v3.26.1
  • calico/node:v3.26.1
  • calico/pod2daemon-flexvol:v3.26.1
  • coredns/coredns:1.9.3
  • kubesphere/k8s-dns-node-cache:1.15.12
  • kubesphere/kube-apiserver:v1.25.10
  • kubesphere/kube-apiserver:v1.26.5
  • kubesphere/kube-controller-manager:v1.25.10
  • kubesphere/kube-controller-manager:v1.26.5
  • kubesphere/kube-proxy:v1.25.10
  • kubesphere/kube-proxy:v1.26.5
  • kubesphere/kube-scheduler:v1.25.10
  • kubesphere/kube-scheduler:v1.26.5
  • kubesphere/pause:3.8
  • library/haproxy:2.3

List of core images of all nodes after upgrade (verify the order of upgraded versions):

# Master 命令有筛选,去掉了 v1.24.12 和重复的 docker.io 开通的镜像
[root@k8s-master-1 ~]# crictl images | grep v1.2[4-6] | grep -v "24.12"| grep -v docker.io | sort
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-apiserver            v1.24.14            b651b48a617a5       34.3MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-apiserver            v1.25.10            4aafc4b1604b9       34.4MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-apiserver            v1.26.5             25c2ecde661fc       35.5MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-controller-manager   v1.24.14            d40212fa9cf04       31.5MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-controller-manager   v1.25.10            e446ea5ea9b1b       31.5MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-controller-manager   v1.26.5             a7403c147a516       32.4MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-proxy                v1.24.14            e57c0d007d1ef       39.7MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-proxy                v1.25.10            0cb798db55ff2       20.3MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-proxy                v1.26.5             08440588500d7       21.5MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-scheduler            v1.24.14            19bf7b80c50e5       15.8MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-scheduler            v1.25.10            de3c37c13188f       16MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-scheduler            v1.26.5             200132c1d91ab       17.7MB

# Worker
[root@k8s-worker-1 ~]# crictl images | grep v1.2[4-6] | grep -v "24.12"| grep -v docker.io | sort
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-proxy                      v1.24.14                       e57c0d007d1ef       39.7MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-proxy                      v1.25.10                       0cb798db55ff2       20.3MB
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-proxy                      v1.26.5                        08440588500d7       21.5MB

5.3 Verification after Kubernetes upgrade

  • View Nodes version ( VERSION updated to v1.26.5 )
[root@k8s-master-1 ~]# kubectl get nodes -o wide
NAME           STATUS   ROLES                  AGE     VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
k8s-master-1   Ready    control-plane          6d22h   v1.26.5   192.168.9.91   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
k8s-master-2   Ready    control-plane,worker   6d22h   v1.26.5   192.168.9.92   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
k8s-master-3   Ready    control-plane,worker   6d22h   v1.26.5   192.168.9.93   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
k8s-worker-1   Ready    worker                 6d22h   v1.26.5   192.168.9.95   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
k8s-worker-2   Ready    worker                 6d22h   v1.26.5   192.168.9.96   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
k8s-worker-3   Ready    worker                 6d20h   v1.26.5   192.168.9.97   <none>        CentOS Linux 7 (Core)   5.4.261-1.el7.elrepo.x86_64   containerd://1.6.4
  • View Kubernetes resources ( due to space limitations, the pod results are not shown, but the actual changes are all on the Pod )
[root@k8s-master-1 kubekey]# kubectl get pod,deployment,sts,ds -o wide -n kube-system
NAME                                          READY   UP-TO-DATE   AVAILABLE   AGE     CONTAINERS                     IMAGES                                                                    SELECTOR
deployment.apps/calico-kube-controllers       1/1     1            1           6d22h   calico-kube-controllers        registry.cn-beijing.aliyuncs.com/kubesphereio/kube-controllers:v3.26.1    k8s-app=calico-kube-controllers
deployment.apps/coredns                       2/2     2            2           6d22h   coredns                        registry.cn-beijing.aliyuncs.com/kubesphereio/coredns:1.8.6               k8s-app=kube-dns
deployment.apps/metrics-server                1/1     1            1           6d22h   metrics-server                 registry.cn-beijing.aliyuncs.com/kubesphereio/metrics-server:v0.4.2       k8s-app=metrics-server
deployment.apps/openebs-localpv-provisioner   1/1     1            1           6d22h   openebs-provisioner-hostpath   registry.cn-beijing.aliyuncs.com/kubesphereio/provisioner-localpv:3.3.0   name=openebs-localpv-provisioner,openebs.io/component-name=openebs-localpv-provisioner

NAME                                   READY   AGE     CONTAINERS            IMAGES
statefulset.apps/snapshot-controller   1/1     6d22h   snapshot-controller   registry.cn-beijing.aliyuncs.com/kubesphereio/snapshot-controller:v4.0.0

NAME                          DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE     CONTAINERS    IMAGES                                                             SELECTOR
daemonset.apps/calico-node    6         6         6       6            6           kubernetes.io/os=linux   6d22h   calico-node   registry.cn-beijing.aliyuncs.com/kubesphereio/node:v3.26.1         k8s-app=calico-node
daemonset.apps/kube-proxy     6         6         6       6            6           kubernetes.io/os=linux   6d22h   kube-proxy    registry.cn-beijing.aliyuncs.com/kubesphereio/kube-proxy:v1.26.5   k8s-app=kube-proxy
daemonset.apps/nodelocaldns   6         6         6       6            6           <none>                   6d22h   node-cache    kubesphere/k8s-dns-node-cache:1.15.12                              k8s-app=nodelocaldns
  • View the binary file ( compare the results before the upgrade to verify whether the components have changed )
[root@k8s-master-1 ~]# ll /usr/local/bin/
total 360880
-rwxr-xr-x 1 root root  65770992 Dec  6 09:41 calicoctl
-rwxr-xr-x 1 root root  23847904 Nov 29 13:50 etcd
-rwxr-xr-x 1 kube root  17620576 Nov 29 13:50 etcdctl
-rwxr-xr-x 1 root root  46182400 Dec  6 09:41 helm
-rwxr-xr-x 1 root root  46788608 Dec  6 09:41 kubeadm
-rwxr-xr-x 1 root root  48046080 Dec  6 09:41 kubectl
-rwxr-xr-x 1 root root 121277432 Dec  6 09:41 kubelet
drwxr-xr-x 2 kube root        71 Nov 29 13:51 kube-scripts

Note: Except for etcd and etcdctl, everything else has been updated, indicating that etcd is not within the scope of component updates.

  • Create test resources
kubectl create deployment nginx-upgrade-test --image=nginx:latest --replicas=6 -n upgrade-test

Note: This test is relatively simple, and more sufficient testing is recommended for production environments .

  • View the created test resources
# 查看 Deployment
[root@k8s-master-1 ~]# kubectl get deployment -n upgrade-test -o wide
NAME                 READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES         SELECTOR
nginx-upgrade-test   6/6     6            6           13s   nginx        nginx:latest   app=nginx-upgrade-test

# 查看 Pod
[root@k8s-master-1 ~]# kubectl get deployment,pod -n upgrade-test -o wide
NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES         SELECTOR
deployment.apps/nginx-upgrade-test   6/6     6            6           56s   nginx        nginx:latest   app=nginx-upgrade-test

NAME                                      READY   STATUS    RESTARTS   AGE   IP              NODE           NOMINATED NODE   READINESS GATES
pod/nginx-upgrade-test-7b668447c4-cglmh   1/1     Running   0          56s   10.233.80.85    k8s-master-1   <none>           <none>
pod/nginx-upgrade-test-7b668447c4-cmzqx   1/1     Running   0          56s   10.233.88.181   k8s-worker-1   <none>           <none>
pod/nginx-upgrade-test-7b668447c4-ggnjg   1/1     Running   0          56s   10.233.87.106   k8s-worker-3   <none>           <none>
pod/nginx-upgrade-test-7b668447c4-hh8h4   1/1     Running   0          56s   10.233.96.38    k8s-master-3   <none>           <none>
pod/nginx-upgrade-test-7b668447c4-hs7n7   1/1     Running   0          56s   10.233.74.130   k8s-worker-2   <none>           <none>
pod/nginx-upgrade-test-7b668447c4-s6wps   1/1     Running   0          56s   10.233.85.102   k8s-master-2   <none>           <none>
pod/test-nginx-0                          1/1     Running   0          85m   10.233.80.77    k8s-master-1   <none>           <none>
pod/test-nginx-1                          1/1     Running   0          85m   10.233.96.33    k8s-master-3   <none>           <none>
pod/test-nginx-2                          1/1     Running   0          84m   10.233.85.99    k8s-master-2   <none>           <none>
pod/test-nginx-3                          1/1     Running   0          84m   10.233.87.104   k8s-worker-3   <none>           <none>
pod/test-nginx-4                          1/1     Running   0          83m   10.233.88.180   k8s-worker-1   <none>           <none>
pod/test-nginx-5                          1/1     Running   0          83m   10.233.74.129   k8s-worker-2   <none>           <none>
  • Rebuild existing Pods ( not verified, production environment upgrade must be verified to prevent cross-version compatibility issues )
  • Check the cluster status in the KubeSphere management console

ksp-v124-to-v126-clusters-components

ksp-v124-to-v126-clusters-monitor-cluster-overview

After a series of operations, we successfully used KubeKey to complete the upgrade and test verification of the Kubernetes minor version. In this process, we have gone through many key links, including actual combat environment preparation, upgrade implementation, and testing and verification. In the end, we achieved the goal of version upgrade and verified that the upgraded system could ( basically ) operate normally.

6. FAQ

Question 1

  • Error message
[root@k8s-master-1 kubekey]# ./kk upgrade --with-kubesphere v3.4.1 -f sample.yaml
14:00:54 CST [FATA] The number of master/control-plane cannot be 0
  • solution

Modify the cluster deployment file sample.yamland fill in correctly roleGroups.master: master node information

Question 2

  • Error message
Continue upgrading cluster? [yes/no]: yes
14:07:02 CST success: [LocalHost]
14:07:02 CST [SetUpgradePlanModule 1/2] Set upgrade plan
14:07:02 CST success: [LocalHost]
14:07:02 CST [SetUpgradePlanModule 1/2] Generate kubeadm config
14:07:02 CST message: [k8s-master-1]
Failed to get container runtime cgroup driver.: Failed to exec command: sudo -E /bin/bash -c "docker info | grep 'Cgroup Driver'"
/bin/bash: docker: command not found: Process exited with status 1
14:07:02 CST retry: [k8s-master-1]
14:07:07 CST message: [k8s-master-1]
Failed to get container runtime cgroup driver.: Failed to exec command: sudo -E /bin/bash -c "docker info | grep 'Cgroup Driver'"
/bin/bash: docker: command not found: Process exited with status 1
14:07:07 CST retry: [k8s-master-1]
14:07:12 CST message: [k8s-master-1]
Failed to get container runtime cgroup driver.: Failed to exec command: sudo -E /bin/bash -c "docker info | grep 'Cgroup Driver'"
/bin/bash: docker: command not found: Process exited with status 1
14:07:12 CST skipped: [k8s-master-3]
14:07:12 CST skipped: [k8s-master-2]
14:07:12 CST failed: [k8s-master-1]
error: Pipeline[UpgradeClusterPipeline] execute failed: Module[SetUpgradePlanModule 1/2] exec failed:
failed: [k8s-master-1] [GenerateKubeadmConfig] exec failed after 3 retries: Failed to get container runtime cgroup driver.: Failed to exec command: sudo -E /bin/bash -c "docker info | grep 'Cgroup Driver'"
/bin/bash: docker: command not found: Process exited with status 1
  • solution

Modify the cluster deployment file sample.yamland fill it in correctly kubernetes.containerManager: containerd. Docker is used by default.

7. Summary

This article demonstrates through actual combat the detailed process of upgrading Kubernetes minor versions of KubeSphere and Kubernetes clusters deployed by KubeKey, as well as the problems encountered during the upgrade process and the corresponding solutions. At the same time, it also explains what verifications are required before and after the upgrade to ensure the success of the system upgrade.

Using KubeKey to upgrade KubeSphere and Kubernetes and complete test verification is a complex and important task. Through this actual upgrade, we verified the technical feasibility, updated and optimized the system, and improved the performance and security of the system. At the same time, we have also accumulated valuable experience and lessons, which provide reference for future production environment upgrades.

To summarize, the full text mainly involves the following contents:

  • Upgrade preparation for actual combat environment
  • Kubernetes upgrade preparation and upgrade process monitoring
  • Upgrade Kubernetes using KubeKey
  • Kubernetes post-upgrade verification

The practical content provided in this article can be directly applied to test and development environments. At the same time, it also has certain reference value for the production environment. However, be careful not to apply this directly to a production environment.

This article is published by OpenWrite, a blog that publishes multiple articles !

IntelliJ IDEA 2023.3 & JetBrains Family Bucket annual major version update new concept "defensive programming": make yourself a stable job GitHub.com runs more than 1,200 MySQL hosts, how to seamlessly upgrade to 8.0? Stephen Chow's Web3 team will launch an independent App next month. Will Firefox be eliminated? Visual Studio Code 1.85 released, floating window Yu Chengdong: Huawei will launch disruptive products next year and rewrite the history of the industry. The US CISA recommends abandoning C/C++ to eliminate memory security vulnerabilities. TIOBE December: C# is expected to become the programming language of the year. A paper written by Lei Jun 30 years ago : "Principle and Design of Computer Virus Determination Expert System"
{{o.name}}
{{m.name}}

Guess you like

Origin my.oschina.net/u/4197945/blog/10320906