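This article uses a problem encountered while deploying Metrics Server as an example to show how to use the kubectl get event command as part of troubleshooting.
Deploying Metrics Server
Metrics Server is the component used for metrics monitoring from Kubernetes 1.8 onward; tools such as the dashboard and kubectl top depend on it. At the time of writing it still has to be installed manually, but installation is simple and usually takes only the following commands.
Step 1: git clone
Run: git clone https://github.com/kubernetes-incubator/metrics-server
Step 2: kubectl create
Run: cd metrics-server && kubectl create -f deploy/1.8+/
Symptom
The metrics-server deployment shows a READY status of 0/1: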
[root@host131 1.8+]# kubectl get deployment -A
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system coredns 1/1 1 1 2m50s
kube-system metrics-server 0/1 0 0 25s
[root@host131 1.8+]#
This means the pod is not Ready. Listing pods across all namespaces shows that only the coredns pod is running:
[root@host131 1.8+]# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-59db588569-gz6x8 1/1 Running 0 7m46s
[root@host131 1.8+]#
Troubleshooting
Check with kubectl get event, adding the -A option or specifying the namespace (kube-system in this case). This returns the following errors:
5m54s Warning FailedCreate replicaset/metrics-server-789c77976 Error creating: pods "metrics-server-789c77976-gzbl5" is forbidden: SecurityContext.RunAsUser is forbidden
5m53s Warning FailedCreate replicaset/metrics-server-789c77976 Error creating: pods "metrics-server-789c77976-c5jxg" is forbidden: SecurityContext.RunAsUser is forbidden
5m53s Warning FailedCreate replicaset/metrics-server-789c77976 Error creating: pods "metrics-server-789c77976-4t4wt" is forbidden: SecurityContext.RunAsUser is forbidden
5m53s Warning FailedCreate replicaset/metrics-server-789c77976 Error creating: pods "metrics-server-789c77976-zv9hg" is forbidden: SecurityContext.RunAsUser is forbidden
5m53s Warning FailedCreate replicaset/metrics-server-789c77976 Error creating: pods "metrics-server-789c77976-vqm9m" is forbidden: SecurityContext.RunAsUser is forbidden
5m53s Warning FailedCreate replicaset/metrics-server-789c77976 Error creating: pods "metrics-server-789c77976-l9bdw" is forbidden: SecurityContext.RunAsUser is forbidden
5m53s Warning FailedCreate replicaset/metrics-server-789c77976 Error creating: pods "metrics-server-789c77976-x7v2r" is forbidden: SecurityContext.RunAsUser is forbidden
5m53s Warning FailedCreate replicaset/metrics-server-789c77976 Error creating: pods "metrics-server-789c77976-4p4v9" is forbidden: SecurityContext.RunAsUser is forbidden
5m52s Warning FailedCreate replicaset/metrics-server-789c77976 Error creating: pods "metrics-server-789c77976-4rdtb" is forbidden: SecurityContext.RunAsUser is forbidden
26s Warning FailedCreate replicaset/metrics-server-789c77976 (combined from similar events): Error creating: pods "metrics-server-789c77976-442xf" is forbidden: SecurityContext.RunAsUser is forbidden
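When the same failure repeats many times, as above, it can help to list only Warning events and collapse the duplicates. The following is a minimal sketch, not part of the original walkthrough: the helper names warning_events and summarize_events are hypothetical, and the sed expressions assume the event line layout shown above.

```shell
#!/usr/bin/env bash
# Sketch only: both helper names are hypothetical illustrations.

# List only Warning events in a namespace (default: kube-system),
# sorted by the time they were last seen.
warning_events() {
  local ns="${1:-kube-system}"
  kubectl get events -n "$ns" --field-selector type=Warning --sort-by=.lastTimestamp
}

# Read event lines on stdin and group identical errors:
#   1. replace the generated pod name with a placeholder,
#   2. drop the leading age column (for example "5m54s"),
#   3. drop the "(combined from similar events): " prefix,
# then count how often each distinct message occurs.
summarize_events() {
  sed -E \
    -e 's/pods "[^"]+"/pods "<pod>"/' \
    -e 's/^[0-9dhms]+ +//' \
    -e 's/\(combined from similar events\): //' \
    | sort | uniq -c | sort -rn
}

# Usage (requires access to the cluster):
#   warning_events kube-system | summarize_events
```

Applied to the output above, all ten lines should collapse into a single entry with a count, which makes the one underlying cause easier to spot.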
A clear error message is usually enough to identify the cause. Here, for example, we can see:
pods "metrics-server-789c77976-442xf" is forbidden: SecurityContext.RunAsUser is forbidden
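This shows that the pod failed admission during creation and that the admission control related to SecurityContext is responsible. Next, check the kube-apiserver flags (systemctl status -l kube-apiserver):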
--enable-admission-plugins=NamespaceLifecycle,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota,NodeRestriction
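The SecurityContextDeny admission plugin exists to restrict the use of SecurityContext, which lets a pod define OS-level security settings for its containers such as uid, gid and SELinux options. Because this plugin is enabled, the pod is rejected at creation time, before it ever reaches the image-pull stage, which is why no pod information exists at all.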
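The check described above, whether a given admission plugin appears in --enable-admission-plugins, can be scripted. This is a rough sketch: has_admission_plugin is a hypothetical helper name, and it assumes the flag is written in the --enable-admission-plugins=a,b,c form shown above.

```shell
#!/usr/bin/env bash
# Sketch only: has_admission_plugin is a hypothetical helper name.
# Usage: has_admission_plugin "<text containing the apiserver flags>" <PluginName>
# Returns 0 if the plugin is in --enable-admission-plugins, 1 otherwise.
has_admission_plugin() {
  local flags="$1" plugin="$2" list
  # Extract the comma-separated value of --enable-admission-plugins.
  list="$(printf '%s\n' "$flags" \
    | sed -n -E 's/.*--enable-admission-plugins=([^ ]*).*/\1/p')"
  # Wrap in commas so only whole plugin names match.
  case ",$list," in
    *",$plugin,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage on the API server host:
#   if has_admission_plugin "$(systemctl status -l kube-apiserver)" SecurityContextDeny; then
#     echo "SecurityContextDeny is enabled"
#   fi
```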
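Fix
Remove SecurityContextDeny from the admission plugins and restart the API server. Create the resources again and the pod at least reaches the ContainerCreating status (if the image cannot be pulled, that is a separate problem and is not covered here).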
[root@host131 1.8+]# kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-59db588569-gz6x8 1/1 Running 0 26m
kube-system metrics-server-789c77976-sfvsx 0/1 ContainerCreating 0 10s
[root@host131 1.8+]#
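The edit to the plugin list can be sketched as follows. remove_admission_plugin is a hypothetical helper that only computes the new comma-separated value. Where the kube-apiserver flags are stored depends on how the cluster was installed, so writing the result back is left to the reader.

```shell
#!/usr/bin/env bash
# Sketch only: remove_admission_plugin is a hypothetical helper name.
# Usage: remove_admission_plugin "<comma-separated plugin list>" <PluginName>
# Prints the list with every occurrence of PluginName removed.
remove_admission_plugin() {
  local list="$1" plugin="$2" item out=""
  local IFS=','            # split the list on commas
  for item in $list; do
    if [ "$item" != "$plugin" ]; then
      out="${out:+$out,}$item"   # re-join the remaining names with commas
    fi
  done
  printf '%s\n' "$out"
}

# Example with the value from this article:
#   remove_admission_plugin \
#     "NamespaceLifecycle,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota,NodeRestriction" \
#     SecurityContextDeny
```

After updating the flag wherever it is defined on the host, restart the service (for a systemd-managed API server, as assumed here, systemctl restart kube-apiserver) and recreate the deployment.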
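Summary
The key to troubleshooting is getting hold of the important diagnostic messages. In Kubernetes, only a sensible combination of logs, systemctl status output and commands such as kubectl get event gives an accurate and complete picture of what is causing the problem. It is also best to check the information from every component; in this example, a similar error message also appears in controller-manager.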