Troubleshooting a Pod That Fails to Mount a StorageClass Volume in a K8S Cluster

Fault description:

Jenkins is deployed in a K8S cluster. All the resources Jenkins depends on have been created, but the Jenkins Pod still cannot start and stays in the Pending state.

Troubleshooting steps:

1) First find out why the Pod is stuck in Pending: inspect the Pod's details and pick out the key information.

Warning  FailedMount  34s (x3 over 98s)  kubelet  (combined from similar events): MountVolume.SetUp failed for volume "pvc-3ed2c605-b7da-4266-a882-4527ed949c34" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/a794f97d-7e67-47f5-9ab2-05f6352b9352/volumes/kubernetes.io~nfs/pvc-3ed2c605-b7da-4266-a882-4527ed949c34 --scope -- mount -t nfs 192.168.16.105:/data/k8s/storageclass/kube-system-prometheus-data-prometheus-0-pvc-3ed2c605-b7da-4266-a882-4527ed949c34 /var/lib/kubelet/pods/a794f97d-7e67-47f5-9ab2-05f6352b9352/volumes/kubernetes.io~nfs/pvc-3ed2c605-b7da-4266-a882-4527ed949c34
Output: Running scope as unit run-2832.scope.
mount.nfs: mounting 192.168.16.105:/data/k8s/storageclass/kube-system-prometheus-data-prometheus-0-pvc-3ed2c605-b7da-4266-a882-4527ed949c34 failed, reason given by server: No such file or directory
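
Events like the ones above normally appear in the Events section of `kubectl describe pod`. The sketch below shows one way to keep only the mount failures from that text. The helper name `failed_mount_events`, the Pod name `jenkins-master-0`, the node name `node-1`, and the Scheduled line in the sample are assumptions made for illustration and are not taken from the original incident.

```shell
#!/bin/sh
# Keep only FailedMount events from event text read on stdin.
failed_mount_events() {
    grep 'FailedMount'
}

# Sample input: the second line is the warning quoted above; the first is a
# made-up Scheduled event, added only so the filter has something to drop.
sample='Normal   Scheduled    2m   default-scheduler  Successfully assigned jenkins/jenkins-master-0 to node-1
Warning  FailedMount  34s (x3 over 98s)  kubelet  MountVolume.SetUp failed for volume "pvc-3ed2c605-b7da-4266-a882-4527ed949c34" : mount failed: exit status 32'

printf '%s\n' "$sample" | failed_mount_events

# Against a live cluster the filter would be fed by kubectl instead
# (Pod name and namespace are assumptions, adjust them to your deployment):
#   kubectl describe pod jenkins-master-0 -n jenkins | failed_mount_events
```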

2) The events contain many errors about mounting the PVC. Start with this line:

 MountVolume.SetUp failed for volume "pvc-3ed2c605-b7da-4266-a882-4527ed949c34" : mount failed: exit status 32

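This means the PVC volume pvc-3ed2c605-b7da-4266-a882-4527ed949c34, which already exists in the cluster, could not be mounted. Exit status 32 is the generic "mount failure" code of the mount command.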

Next, look at this line:

mounting 192.168.16.105:/data/k8s/storageclass/kube-system-prometheus-data-prometheus-0-pvc-3ed2c605-b7da-4266-a882-4527ed949c34 failed, reason given by server: No such file or directory

This means that the PVC volume pvc-3ed2c605-b7da-4266-a882-4527ed949c34 maps to the path /data/k8s/storageclass/kube-system-prometheus-data-prometheus-0-pvc-3ed2c605-b7da-4266-a882-4527ed949c34 on the NFS server, but that path no longer exists there.
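
To confirm this on the NFS server, the server address and export path can be pulled out of the error line and the directory checked directly. This is a minimal sketch: the helper name `nfs_target_from_error` and the variable names are hypothetical, and the final `ls` check is left as a comment because it only makes sense when run on the NFS server itself.

```shell
#!/bin/sh
# Extract "server:/path" from a mount.nfs error line read on stdin.
nfs_target_from_error() {
    sed -n 's/^mount\.nfs: mounting \([^ ]*\) failed.*$/\1/p'
}

# The error line quoted in the events above.
line='mount.nfs: mounting 192.168.16.105:/data/k8s/storageclass/kube-system-prometheus-data-prometheus-0-pvc-3ed2c605-b7da-4266-a882-4527ed949c34 failed, reason given by server: No such file or directory'

target=$(printf '%s\n' "$line" | nfs_target_from_error)
server=${target%%:*}       # text before the first colon
export_dir=${target#*:}    # text after the first colon

echo "server: $server"
echo "path:   $export_dir"

# On the NFS server, check whether the directory still exists:
#   ls -ld "$export_dir"
```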

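3) The events are enough to locate the problem: the Pod is trying to mount a PVC volume provisioned through the StorageClass. The volume was created at some earlier point, but its backing directory has since been deleted, so the mount fails.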

4) In most cases this situation has one of the following causes:

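  • Jenkins was previously deployed in the K8S cluster with a StorageClass for persistent storage. It was later removed from the cluster for some reason, and the PVC's persistent directory on the NFS server was deleted along with it.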

  • An operator deleted the storage path of the Jenkins PVC on the NFS server, so Jenkins cannot find its storage when it restarts.

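5) The most likely cause is the first one: Jenkins was deployed before, and the storage path was deleted when it was removed and redeployed.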

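6) If everything was deleted, why doesn't the StorageClass provision a new volume? The reason is simple: only the directory holding the data was deleted, not the PVC that was created through the StorageClass. When Jenkins is redeployed, the PVC created for it earlier is still there and still bound to its PV, so it is reused and no new volume is provisioned.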

7) The fix is simple: delete the PVC that was previously created for Jenkins, then redeploy Jenkins.

The PV list contains many Jenkins-related PVs. Find the PVC that corresponds to the affected PV, delete the PVC first, then delete the PV.

# kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                                      STORAGECLASS              REASON   AGE
nginx-conf-pv                              1Gi        RWX            Retain           Bound      default/nginx-conf-pvc                                                        30d
pvc-246774a7-11d8-4537-8b44-5c067ba80d04   10Gi       RWX            Retain           Released   grafana/grafana-data-grafana-0             grafana-storageclass               11d
pvc-3ed2c605-b7da-4266-a882-4527ed949c34   10Gi       RWX            Retain           Bound      kube-system/prometheus-data-prometheus-0   prometheus-storageclass            11d
pvc-7a013e82-e454-4883-a99e-81d23498b63e   10Gi       RWX            Retain           Released   jenkins/gitlab-data-gitlab-0               gitlab-storageclass                24d
pvc-94284330-69cd-48b5-908d-a6de8c41ffaa   10Gi       RWX            Retain           Released   grafana/grafana-data-grafana-0             grafana-storageclass               11d
pvc-991ee7c5-6752-44ba-9cd7-d55e5018f883   10Gi       RWX            Retain           Bound      jenkins/jenkins-data-jenkins-master-0      jenkins-storageclass               23d
pvc-99342491-daa6-43df-9c4c-827478926ced   10Gi       RWX            Retain           Bound      jenkins/gitlab-data-gitlab-0               gitlab-storageclass                24d
pvc-abc35446-059e-4916-a5f8-327e5cdc4954   1Gi        RWX            Retain           Released   jenkins/gitlab-config-gitlab-0             gitlab-storageclass                24d
pvc-cf1acd07-4a0e-4340-8626-01a857a9cee6   1Gi        RWX            Retain           Released   jenkins/gitlab-config-gitlab-0             gitlab-storageclass                24d
pvc-cf93289a-1cd3-4a3b-aac2-749eea30342a   10Gi       RWX            Retain           Released   jenkins/gitlab-data-gitlab-0               gitlab-storageclass                24d
pvc-ddc811c2-67ff-42cc-a163-c82c3dadc1be   1Gi        RWX            Retain           Bound      jenkins/gitlab-config-gitlab-0             gitlab-storageclass                24d
pvc-e3af12d8-a34e-4110-881c-e070cdb3e847   10Gi       RWX            Retain           Released   grafana/gitlab-data-grafana-0              gitlab-storageclass                11d
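
With this many rows it helps to filter the list by the namespace in the CLAIM column before deleting anything. The sketch below does that with awk. The helper name `pvs_for_namespace` is hypothetical, and the cleanup commands at the end assume that the affected claim is jenkins/jenkins-data-jenkins-master-0 from the listing, so substitute the claim and PV named in your own FailedMount event.

```shell
#!/bin/sh
# Print "NAME STATUS CLAIM" for each PV whose claim is in namespace $1.
# Reads `kubectl get pv --no-headers` style rows on stdin.
pvs_for_namespace() {
    awk -v ns="$1" 'index($6, ns "/") == 1 { print $1, $5, $6 }'
}

# Three rows copied from the listing above, used as sample input.
sample='pvc-3ed2c605-b7da-4266-a882-4527ed949c34   10Gi   RWX   Retain   Bound      kube-system/prometheus-data-prometheus-0   prometheus-storageclass   11d
pvc-7a013e82-e454-4883-a99e-81d23498b63e   10Gi   RWX   Retain   Released   jenkins/gitlab-data-gitlab-0   gitlab-storageclass   24d
pvc-991ee7c5-6752-44ba-9cd7-d55e5018f883   10Gi   RWX   Retain   Bound      jenkins/jenkins-data-jenkins-master-0   jenkins-storageclass   23d'

printf '%s\n' "$sample" | pvs_for_namespace jenkins

# Against a live cluster the same filter would be fed by kubectl:
#   kubectl get pv --no-headers | pvs_for_namespace jenkins
#
# Cleanup for the assumed Jenkins claim, PVC first and then the PV:
#   kubectl delete pvc jenkins-data-jenkins-master-0 -n jenkins
#   kubectl delete pv pvc-991ee7c5-6752-44ba-9cd7-d55e5018f883
```

Because the reclaim policy is Retain, deleting the PVC only moves the PV to Released, which is why the PV has to be deleted as a separate step.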

Reposted from blog.csdn.net/weixin_44953658/article/details/126802558