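Kubernetes Basics: A Troubleshooting Example Using kubectl get event

This article uses a problem encountered while deploying Metrics Server as an example to show how the kubectl get event command can be combined with other tools when troubleshooting.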

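Deploying Metrics Server

Metrics Server is the component used for resource metrics monitoring in Kubernetes 1.8 and later. The dashboard, kubectl top, and similar tools all depend on it. At the time of writing it has to be installed manually, which is straightforward and usually takes only the following commands.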

Step 1: git clone

Run: git clone https://github.com/kubernetes-incubator/metrics-server

Step 2: kubectl create

Run: cd metrics-server && kubectl create -f deploy/1.8+/
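
After the manifests are applied, it can help to confirm the rollout before relying on kubectl top. The following is a minimal sketch and not part of the original steps: the function name check_metrics_server and the 60s default timeout are assumptions made for illustration, and the kube-system namespace is taken from the output shown later in this article.

```shell
# Hypothetical helper: wait for the metrics-server Deployment to become ready.
# Assumes the manifests created it in the kube-system namespace.
check_metrics_server() {
  timeout="${1:-60s}"   # illustrative default, not from the original post
  if kubectl rollout status deployment/metrics-server -n kube-system --timeout="$timeout"; then
    echo "metrics-server is ready"
  else
    echo "metrics-server is not ready; inspect: kubectl get events -n kube-system" >&2
    return 1
  fi
}
```

It could be called as check_metrics_server 120s right after Step 2.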

Symptom

The metrics-server Deployment shows a READY status of 0/1.

[root@host131 1.8+]# kubectl get deployment -A
NAMESPACE     NAME             READY   UP-TO-DATE   AVAILABLE   AGE
kube-system   coredns          1/1     1            1           2m50s
kube-system   metrics-server   0/1     0            0           25s
[root@host131 1.8+]#

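This means no metrics-server pod is Ready. Listing pods across all namespaces shows that only the coredns pod is running; no metrics-server pod appears at all.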

[root@host131 1.8+]# kubectl get pod -A
NAMESPACE     NAME                       READY   STATUS    RESTARTS   AGE
kube-system   coredns-59db588569-gz6x8   1/1     Running   0          7m46s
[root@host131 1.8+]# 

Troubleshooting

Check with kubectl get event, adding either the -A option or a specific namespace (kube-system in this example). It returns the following errors:

5m54s       Warning   FailedCreate        replicaset/metrics-server-789c77976    Error creating: pods "metrics-server-789c77976-gzbl5" is forbidden: SecurityContext.RunAsUser is forbidden
5m53s       Warning   FailedCreate        replicaset/metrics-server-789c77976    Error creating: pods "metrics-server-789c77976-c5jxg" is forbidden: SecurityContext.RunAsUser is forbidden
5m53s       Warning   FailedCreate        replicaset/metrics-server-789c77976    Error creating: pods "metrics-server-789c77976-4t4wt" is forbidden: SecurityContext.RunAsUser is forbidden
5m53s       Warning   FailedCreate        replicaset/metrics-server-789c77976    Error creating: pods "metrics-server-789c77976-zv9hg" is forbidden: SecurityContext.RunAsUser is forbidden
5m53s       Warning   FailedCreate        replicaset/metrics-server-789c77976    Error creating: pods "metrics-server-789c77976-vqm9m" is forbidden: SecurityContext.RunAsUser is forbidden
5m53s       Warning   FailedCreate        replicaset/metrics-server-789c77976    Error creating: pods "metrics-server-789c77976-l9bdw" is forbidden: SecurityContext.RunAsUser is forbidden
5m53s       Warning   FailedCreate        replicaset/metrics-server-789c77976    Error creating: pods "metrics-server-789c77976-x7v2r" is forbidden: SecurityContext.RunAsUser is forbidden
5m53s       Warning   FailedCreate        replicaset/metrics-server-789c77976    Error creating: pods "metrics-server-789c77976-4p4v9" is forbidden: SecurityContext.RunAsUser is forbidden
5m52s       Warning   FailedCreate        replicaset/metrics-server-789c77976    Error creating: pods "metrics-server-789c77976-4rdtb" is forbidden: SecurityContext.RunAsUser is forbidden
26s         Warning   FailedCreate        replicaset/metrics-server-789c77976    (combined from similar events): Error creating: pods "metrics-server-789c77976-442xf" is forbidden: SecurityContext.RunAsUser is forbidden
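
The listing above repeats the same message many times. As a sketch, the two helpers below narrow it down: the first asks for Warning events only, and the second condenses event lines read from standard input into a count per reason and object. Both function names are invented for this illustration, and the column positions assume the per-namespace layout shown above (LAST SEEN, TYPE, REASON, OBJECT, MESSAGE); with -A an extra NAMESPACE column would shift them.

```shell
# Hypothetical helper: list only Warning events in a namespace, oldest first.
warning_events() {
  kubectl get events -n "${1:-kube-system}" \
    --field-selector type=Warning \
    --sort-by=.metadata.creationTimestamp
}

# Hypothetical helper: count event lines per (reason, object) pair.
# Expects the "LAST SEEN  TYPE  REASON  OBJECT  MESSAGE" layout on stdin;
# the header line is skipped because its second field is not an event type.
summarize_events() {
  awk '$2 == "Warning" || $2 == "Normal" { count[$3 " " $4]++ }
       END { for (k in count) print count[k], k }' | sort -rn
}
```

For example, warning_events kube-system | summarize_events would reduce the ten FailedCreate lines above to a single summary line for the metrics-server ReplicaSet.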

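With a clear error message like this, the cause can usually be determined. Here the key message is
pods "metrics-server-789c77976-442xf" is forbidden: SecurityContext.RunAsUser is forbidden
This means the pod was rejected by admission control during creation, and the rejection concerns its SecurityContext. The next step is to check the kube-apiserver arguments (systemctl status -l kube-apiserver):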

--enable-admission-plugins=NamespaceLifecycle,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota,NodeRestriction

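The SecurityContextDeny admission controller exists to restrict the use of SecurityContext, which lets a container define operating-system-level security settings such as uid, gid, and SELinux options. Because this admission controller is enabled, the pod was rejected before it ever reached the image pull stage. Creation failed outright, which is why no pod information appeared at all.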
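
Reading a long flag list by eye is error-prone, so the check can be scripted. The sketch below reads a kube-apiserver command line from standard input and reports whether a given plugin is listed in --enable-admission-plugins. The function name has_admission_plugin is hypothetical, and the sketch assumes the flag is passed in the --flag=value form shown above.

```shell
# Hypothetical helper: succeed if the named admission plugin appears in the
# --enable-admission-plugins flag of a command line read from stdin.
has_admission_plugin() {
  plugin="$1"
  tr ' ' '\n' \
    | sed -n 's/^--enable-admission-plugins=//p' \
    | tr ',' '\n' \
    | grep -x "$plugin" >/dev/null   # -x: whole-name match only
}
```

Assuming kube-apiserver runs as a host process, as the systemctl usage in this article suggests, one way to feed it is: ps -o args= -C kube-apiserver | has_admission_plugin SecurityContextDeny && echo enabled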

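Solution

Remove SecurityContextDeny from the admission plugins, restart the API server, and create the resources again. The pod now at least reaches the ContainerCreating status (if the image cannot be pulled, that is a separate problem and is not covered here).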

[root@host131 1.8+]# kubectl get pods -A
NAMESPACE     NAME                             READY   STATUS              RESTARTS   AGE
kube-system   coredns-59db588569-gz6x8         1/1     Running             0          26m
kube-system   metrics-server-789c77976-sfvsx   0/1     ContainerCreating   0          10s
[root@host131 1.8+]#
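
The edit to the plugin list can be sketched as a small text transformation. The function name remove_admission_plugin is hypothetical. Where the flag is actually stored depends on how kube-apiserver was installed, which this article does not state, so the sketch only computes the new value and leaves writing it back to the reader.

```shell
# Hypothetical helper: print a comma-separated plugin list without the named plugin.
remove_admission_plugin() {
  list="$1"
  plugin="$2"
  printf '%s\n' "$list" \
    | tr ',' '\n' \
    | grep -vx "$plugin" \
    | paste -sd, -      # join the remaining names with commas again
}
```

Called with the list shown earlier and SecurityContextDeny as the second argument, it prints the list without that plugin. After the flag is updated, the service presumably needs a restart, for example systemctl restart kube-apiserver if it is managed as the systemd unit used earlier.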

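Summary

The key to troubleshooting is obtaining the important diagnostic messages. In Kubernetes, logs, systemctl status output, and commands such as kubectl get event need to be combined sensibly to get accurate and complete information about the cause of a problem. It is also best to check the information from every component; in this example, similar error messages can also be seen in controller-manager.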
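
As a rough sketch of combining these sources, the helper below prints Warning events, unit status, and recent logs in one pass. The function name collect_diagnostics is invented here, and the sketch assumes the control-plane components run as systemd units named kube-apiserver and kube-controller-manager; only the first unit name is confirmed by this article.

```shell
# Hypothetical helper: gather events, unit status, and recent logs together.
# Assumes systemd units named kube-apiserver and kube-controller-manager;
# adjust the names for other installations.
collect_diagnostics() {
  ns="${1:-kube-system}"
  echo "== Warning events in $ns =="
  kubectl get events -n "$ns" --field-selector type=Warning
  for unit in kube-apiserver kube-controller-manager; do
    echo "== systemctl status $unit =="
    systemctl status -l --no-pager "$unit"
    echo "== last 50 log lines of $unit =="
    journalctl -u "$unit" -n 50 --no-pager
  done
}
```

Running collect_diagnostics kube-system > diag.txt would keep everything in one file for later comparison.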

Reposted from blog.csdn.net/liumiaocn/article/details/104140697