A node suddenly goes NotReady after a successful k8s cluster deployment: cause analysis and fix

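1. Problem description

The k8s cluster consists of one master and three worker nodes. It had been running normally since deployment, until one day the status of host node03 suddenly changed to NotReady.

On the master node, run:

# On the master node: check the status of all nodes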
kubectl get nodes

The output shows node03 in the NotReady state.
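
In a cluster with more nodes it can help to filter the listing down to the unhealthy ones. The following is only a minimal sketch, not part of the original steps: the helper name not_ready_nodes is made up for illustration, and it assumes the default column layout of kubectl get nodes --no-headers (NAME first, STATUS second).

```shell
# Hypothetical helper: print the names of nodes whose status is not "Ready".
# Reads `kubectl get nodes --no-headers` output on stdin
# (assumed columns: NAME STATUS ROLES AGE VERSION).
not_ready_nodes() {
  # STATUS can carry a suffix such as "Ready,SchedulingDisabled" on a
  # cordoned node, so only the part before the first comma is compared.
  awk '{ split($2, s, ","); if (s[1] != "Ready") print $1 }'
}

# Usage on the master node (needs a working kubectl):
#   kubectl get nodes --no-headers | not_ready_nodes
```

An empty result means every node reports Ready.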

2. Check the logs on node03

On node03, use the following command to view the error messages:

# On node03: follow the kubelet logs
journalctl -f -u kubelet.service 

The error means that the kubelet configuration file cannot be loaded!

-- Logs begin at 四 2023-12-21 15:25:07 CST. --
12月 22 01:01:00 tigerhhzz-node03-43 systemd[1]: Unit kubelet.service entered failed state.
12月 22 01:01:00 tigerhhzz-node03-43 systemd[1]: kubelet.service failed.
12月 22 01:01:10 tigerhhzz-node03-43 systemd[1]: kubelet.service holdoff time over, scheduling restart.
12月 22 01:01:10 tigerhhzz-node03-43 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
12月 22 01:01:10 tigerhhzz-node03-43 systemd[1]: Started kubelet: The Kubernetes Node Agent.
12月 22 01:01:10 tigerhhzz-node03-43 kubelet[121391]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
12月 22 01:01:10 tigerhhzz-node03-43 kubelet[121391]: F1222 01:01:10.301771  121391 server.go:198] failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file "/var/lib/kubelet/config.yaml", error: open /var/lib/kubelet/config.yaml: no such file or directory
12月 22 01:01:10 tigerhhzz-node03-43 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
12月 22 01:01:10 tigerhhzz-node03-43 systemd[1]: Unit kubelet.service entered failed state.
12月 22 01:01:10 tigerhhzz-node03-43 systemd[1]: kubelet.service failed.
12月 22 01:01:20 tigerhhzz-node03-43 systemd[1]: kubelet.service holdoff time over, scheduling restart.
12月 22 01:01:20 tigerhhzz-node03-43 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
12月 22 01:01:20 tigerhhzz-node03-43 systemd[1]: Started kubelet: The Kubernetes Node Agent.
12月 22 01:01:20 tigerhhzz-node03-43 kubelet[121400]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
12月 22 01:01:20 tigerhhzz-node03-43 kubelet[121400]: F1222 01:01:20.508883  121400 server.go:198] failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file "/var/lib/kubelet/config.yaml", error: open /var/lib/kubelet/config.yaml: no such file or directory
12月 22 01:01:20 tigerhhzz-node03-43 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
12月 22 01:01:20 tigerhhzz-node03-43 systemd[1]: Unit kubelet.service entered failed state.
12月 22 01:01:20 tigerhhzz-node03-43 systemd[1]: kubelet.service failed.
12月 22 01:01:30 tigerhhzz-node03-43 systemd[1]: kubelet.service holdoff time over, scheduling restart.
12月 22 01:01:30 tigerhhzz-node03-43 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
12月 22 01:01:30 tigerhhzz-node03-43 systemd[1]: Started kubelet: The Kubernetes Node Agent.
12月 22 01:01:30 tigerhhzz-node03-43 kubelet[121407]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
12月 22 01:01:30 tigerhhzz-node03-43 kubelet[121407]: F1222 01:01:30.820217  121407 server.go:198] failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file "/var/lib/kubelet/config.yaml", error: open /var/lib/kubelet/config.yaml: no such file or directory
12月 22 01:01:30 tigerhhzz-node03-43 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
12月 22 01:01:30 tigerhhzz-node03-43 systemd[1]: Unit kubelet.service entered failed state.
12月 22 01:01:30 tigerhhzz-node03-43 systemd[1]: kubelet.service failed.

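The log shows the cause of the failure: the kubelet cannot load its configuration from /var/lib/kubelet/config.yaml because the file does not exist, so systemd keeps restarting the service every 10 seconds and it fails again each time.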
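
Because the service restarts every 10 seconds, the one line that matters is easily buried among repeated systemd messages. As a rough sketch (the helper name kubelet_fatal_lines is an illustrative assumption, not something from the original post), the journal can be filtered down to the kubelet's fatal entries, which klog prefixes with "F" followed by the month and day:

```shell
# Hypothetical helper: keep only fatal kubelet log lines from journal output.
# klog marks severity with a leading letter (I, W, E, F) followed by MMDD,
# so a fatal line looks like "kubelet[PID]: F1222 01:01:10.301771 ...".
kubelet_fatal_lines() {
  grep -E 'kubelet\[[0-9]+\]: F[0-9]{4} '
}

# Usage on the affected node:
#   journalctl -u kubelet.service --no-pager -n 200 | kubelet_fatal_lines
```

On node03 this would reduce the excerpt above to the three "failed to load Kubelet config file" lines.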

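3. Root cause analysis

Most likely node03 went down for some reason of its own and rebooted, and it was never joined (or rejoined) to the cluster created by kubeadm init. On a worker node, /var/lib/kubelet/config.yaml is written when the node joins with kubeadm join, so the kubelet has no configuration file to load and reading /var/lib/kubelet/config.yaml fails.

Another possibility is that I had earlier run systemctl start kubelet before running kubeadm init.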
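
To confirm this diagnosis before changing anything, one can check whether the files that kubeadm normally writes for the kubelet are present on the node. This is a minimal sketch under stated assumptions: the function name missing_kubelet_files and its optional root-prefix argument are invented here for illustration, and the two paths are the default kubeadm locations.

```shell
# Hypothetical helper: list kubeadm-managed kubelet files that are missing.
# An optional first argument is prepended to each path, which allows the
# check to run against a mounted or copied root filesystem.
missing_kubelet_files() {
  mkf_root="${1:-}"
  for mkf_path in /var/lib/kubelet/config.yaml /etc/kubernetes/kubelet.conf; do
    [ -e "${mkf_root}${mkf_path}" ] || echo "$mkf_path"
  done
  return 0
}

# Usage on the affected node:
#   missing_kubelet_files
```

If /var/lib/kubelet/config.yaml is listed, the node has lost (or never received) its kubeadm-generated kubelet configuration, which matches the log above.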

4. Solution

On the master node, generate a new token, then use it on the problem node node03 to rejoin the cluster.

## On the master node
kubeadm token create --print-join-command

kubeadm join 192.168.162.31:6443 --token 6u1q3a.qxhb1wyjztsp34ty --discovery-token-ca-cert-hash sha256:967bbc3b30871241bbfd61e42ae5fa836e08111a5a43d63b319f028fdbc2241a
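
When the join command is copied between terminals, a truncated token or hash is an easy mistake to make. The check below is a small sketch, not something from the original post: the function name valid_join_args is hypothetical, and it only tests the documented shapes (a bootstrap token of 6 plus 16 lowercase alphanumeric characters separated by a dot, and a sha256: prefix followed by 64 hex digits). It does not verify that the token is still valid on the cluster.

```shell
# Hypothetical helper: sanity-check the shape of kubeadm join arguments.
# $1 = bootstrap token, $2 = value of --discovery-token-ca-cert-hash
valid_join_args() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$' || return 1
  printf '%s\n' "$2" | grep -Eq '^sha256:[a-f0-9]{64}$' || return 1
  return 0
}

# Usage with the values printed above:
#   valid_join_args 6u1q3a.qxhb1wyjztsp34ty \
#     sha256:967bbc3b30871241bbfd61e42ae5fa836e08111a5a43d63b319f028fdbc2241a \
#     && echo "arguments look well-formed"
```

Tokens created with kubeadm token create expire after 24 hours by default, so the join should be done soon after generating the command.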

On node03, run the command printed above to rejoin the cluster:

## On node03
kubeadm join 192.168.162.31:6443 --token 6u1q3a.qxhb1wyjztsp34ty --discovery-token-ca-cert-hash sha256:967bbc3b30871241bbfd61e42ae5fa836e08111a5a43d63b319f028fdbc2241a

If kubeadm reports that this node has joined the cluster, the join succeeded.
Next, check the kubelet status on node03:

systemctl status kubelet

The kubelet is now running on node03, and the node has rejoined the cluster.

Back on the master node, check the status of all nodes:

kubectl get nodes


All nodes are in the Ready state. Problem solved!


Reposted from blog.csdn.net/weixin_43025151/article/details/135184187