【K8S】Common operations and common problems

1. Node troubleshooting process

1. Log in to the server
ssh user@host
2. Enter the password
3. Switch to the root user
sudo su root
The commands below assume you are logged in as root; if you are not, prefix each command with sudo.

View the status of all pods

kubectl get pods -n [namespace]

View all service objects

kubectl get svc -n [namespace]

View node details

kubectl describe node [node name]
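
Before describing a single node, it can help to find out which nodes are unhealthy in the first place. The following is a minimal sketch, not part of the original procedure: the helper name not_ready_nodes is hypothetical, and it assumes the default column layout of kubectl get nodes (NAME STATUS ROLES AGE VERSION).

```shell
# List nodes whose STATUS column is anything other than plain "Ready".
# Reads `kubectl get nodes --no-headers` output on stdin.
# Cordoned nodes ("Ready,SchedulingDisabled") are listed as well.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1, $2 }'
}

# Usage: kubectl get nodes --no-headers | not_ready_nodes
```

Each node name it prints can then be passed to kubectl describe node.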

Check whether kubelet and docker are running

systemctl status kubelet
systemctl status docker

View the kubelet and docker logs

journalctl -u kubelet
journalctl -u docker

View disk usage

du -sh *
df -h
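
To spot nearly full filesystems quickly, the df output can be filtered by usage. This is a rough sketch under stated assumptions: the helper name disks_over and the 85 percent default are illustrative choices, and the parsing assumes POSIX df -P output with mount points that contain no spaces.

```shell
# Print mount points whose usage is at or above a threshold (percent).
# Reads `df -P` output on stdin; the first line (header) is skipped.
disks_over() {
  threshold="${1:-85}"
  awk -v t="$threshold" 'NR > 1 {
    use = $5
    sub(/%/, "", use)            # "90%" -> "90"
    if (use + 0 >= t) print $6, $5
  }'
}

# Usage: df -P | disks_over 85
```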

View docker containers (running and stopped)

docker ps -a

2. Service troubleshooting process

View pod status

Filter for a specific pod
kubectl get pods -n [namespace] |grep xxx

View the service object

kubectl get svc -n [namespace]

View pod logs

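# Follow logs from now on (with a label selector, only the last 10 lines per pod are printed first)
kubectl logs -f -l app=[svc name] -c [container name] -n [namespace]

# Print logs from the beginning and keep following
kubectl logs -f [pod name] -c [container name] -n [namespace]

# Exclude irrelevant log lines
kubectl logs -f -l app=[svc name] -c [container name] -n [namespace] | grep -v actuator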
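
The grep -v filter can be combined with a second filter that keeps only warnings and errors. A small sketch follows; the helper name filter_logs is hypothetical, and the patterns (actuator, WARN, ERROR) are assumptions about the application's log format rather than anything Kubernetes defines.

```shell
# Drop noisy actuator lines, then keep only warnings and errors.
# Reads log text on stdin.
filter_logs() {
  grep -v 'actuator' | grep -E 'WARN|ERROR'
}

# Usage:
#   kubectl logs --tail=200 [pod name] -c [container name] -n [namespace] | filter_logs
```

When following logs with -f, adding --line-buffered to each grep avoids output being held back in a pipe buffer.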

View pod details and events

kubectl describe pod [pod name] -n [namespace]

View the configuration file

kubectl describe cm [svc name] -n [namespace]

View ConfigMap objects

kubectl get cm -n [namespace]

Adjust the number of replicas

kubectl scale --replicas=1 deployment/[deployment name] -n [namespace]

Enter a container for debugging

kubectl exec -it [pod name] -c [container name] -n [namespace] -- sh

3. Network troubleshooting process

Access other services from within the pod

Check that the service name exists
Check that the service name is correct
Use nslookup, dig, or ping to test domain name resolution
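
When the target service lives in a different namespace, the short service name alone does not resolve, so it is worth testing the fully qualified name. The sketch below builds that name; the helper svc_fqdn is hypothetical, and it assumes the cluster uses the default DNS domain cluster.local (pass a third argument if yours differs).

```shell
# Build the in-cluster DNS name of a service:
#   <service>.<namespace>.svc.<cluster domain>
svc_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

# Usage inside a pod (service and namespace names are placeholders):
#   nslookup "$(svc_fqdn orders shop)"
```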

Check Endpoints

kubectl get endpoints [service name]
  • Check whether any endpoints exist
  • Check whether the corresponding endpoint port is mapped correctly
kubectl get pods --selector=[service selector]
  • Check whether the corresponding pod is in the Running state
  • Check whether the service is actually available
  • Check whether the port the service listens on matches the one configured in the Endpoints object
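
The first check, whether a service has any endpoints at all, can be done for a whole namespace at once. A minimal sketch, assuming the default table output of kubectl get endpoints, where a service with no backing pods shows <none> in the ENDPOINTS column; the helper name is hypothetical.

```shell
# Print the names of services that currently have no endpoints.
# Reads `kubectl get endpoints --no-headers` output on stdin
# (columns: NAME ENDPOINTS AGE).
services_without_endpoints() {
  awk '$2 == "<none>" { print $1 }'
}

# Usage: kubectl get endpoints -n [namespace] --no-headers | services_without_endpoints
```

A service listed here usually has a selector that matches no ready pod, which is what the pod selector check then confirms.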

4. Frequently Asked Questions

If no pod of the service in the list shows 0/1 in the READY column or a STATUS other than Running, the service has started successfully.
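
The rule above can be applied mechanically to the pod list. The following is a sketch only: the helper name unhealthy_pods is hypothetical, and it assumes the default column layout of kubectl get pods (NAME READY STATUS RESTARTS AGE).

```shell
# Print pods that are not fully ready or whose STATUS is not Running.
# Reads `kubectl get pods --no-headers` output on stdin.
# Note: finished Job pods (STATUS Completed) are listed too.
unhealthy_pods() {
  awk '{
    split($2, r, "/")            # READY looks like "1/1"
    if ($3 != "Running" || r[1] != r[2]) print $1, $2, $3
  }'
}

# Usage: kubectl get pods -n [namespace] --no-headers | unhealthy_pods
```

Empty output means every listed pod is Running with all of its containers ready.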

1. The pod is in ContainerCreating

Check the cause of the error through describe:

kubectl describe pods [service name]

You can see a list similar to the following at the end:

Events:
  Type     Reason       Age                      From     Message
  ----     ------       ----                     ----     -------
  Warning  FailedMount  45m (x26 over 5h42m)     kubelet  Unable to attach or mount volumes: unmounted volumes=[storage], unattached volumes=[default-token-ccgd7 storage]: timed out waiting for the condition
  Warning  FailedMount  7m11s (x124 over 5h51m)  kubelet  Unable to attach or mount volumes: unmounted volumes=[storage], unattached volumes=[storage default-token-ccgd7]: timed out waiting for the condition
  Warning  FailedMount  2m12s (x118 over 5h52m)  kubelet  MountVolume.WaitForAttach failed for volume "pvc-936ff918-76af-4eaa-aef3-64ae5ee15bfd" : rbd image sata_pool/kubernetes-dynamic-pvc-d6d869fc-57b8-442b-95fc-a67c06aeea03 is still being used

These events show that the volume mount failed (here, the rbd image is still in use elsewhere); the underlying storage problem has to be resolved before the pod can start.
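
For pods with a long event list, a per-reason tally makes the dominant failure easier to see. This is an illustrative sketch: the helper name warning_reasons is hypothetical, and it assumes the event table layout shown above (Type in the first column, Reason in the second). It counts table rows, not the repeat counts shown in the Age column.

```shell
# Count Warning events by reason, most frequent first.
# Reads the Events table of `kubectl describe pod` on stdin.
warning_reasons() {
  awk '$1 == "Warning" { c[$2]++ } END { for (r in c) print c[r], r }' | sort -rn
}

# Usage: kubectl describe pod [pod name] -n [namespace] | warning_reasons
```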

2. The pod is in the CrashLoopBackOff state.
This means that Kubernetes keeps trying to start the pod, but one or more of its containers fail to start or exit shortly after starting. View the pod's event information with describe:

kubectl describe pods [service name]

3. The pod is in the RunContainerError state.
When a pod references a ConfigMap that has not been created, its status is displayed as RunContainerError (newer Kubernetes versions typically report CreateContainerConfigError instead). First use describe to view the event information; there will be an event message similar to: configmaps xxxxxx not found.

kubectl describe pods [service name]

4. The pod is in the Evicted state
Evicted means the pod was forcibly removed from its node. The kubelet proactively monitors node resources to prevent shortages of compute resources. When a resource runs short, the kubelet can terminate one or more Pods to reclaim it. When the kubelet evicts a Pod, it terminates all containers in the Pod and sets the Pod's phase to Failed. If the evicted Pod is managed by a Deployment, the Deployment creates a replacement Pod for Kubernetes to schedule. Evicted pods can be deleted directly.

sudo kubectl delete pods [pod name]
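
When many pods have been evicted, deleting them one by one is tedious. A possible sketch for collecting their names is shown below; the helper name evicted_pods is hypothetical, and it assumes the default kubectl get pods columns, with Evicted appearing in the STATUS column.

```shell
# Print the names of pods whose STATUS is Evicted.
# Reads `kubectl get pods --no-headers` output on stdin.
evicted_pods() {
  awk '$3 == "Evicted" { print $1 }'
}

# Usage (xargs -r skips the delete when nothing matches; GNU xargs):
#   kubectl get pods -n [namespace] --no-headers | evicted_pods \
#     | xargs -r kubectl delete pod -n [namespace]
```

Running the first half of the pipeline alone shows what would be deleted before anything is removed.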

5. The service in the pod cannot connect to the database
Check the logs of the pod

If the database connection is failing, check whether the configuration file used by the pod is correct

kubectl get configmap [configmap name] -n [namespace] -o yaml

If you cannot tell which IP the Kubernetes service host name in the configuration file points to, describe the service

kubectl describe service [service name]

6. Check pod resource usage

kubectl top pods | grep [podname]
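
To find memory-heavy pods rather than look up one pod by name, the kubectl top output can be filtered by a threshold. This is a rough sketch under stated assumptions: the helper name mem_over is hypothetical, it assumes the columns NAME CPU(cores) MEMORY(bytes), and it only understands memory values reported in Mi, skipping anything else.

```shell
# Print pods using at least the given amount of memory (in Mi).
# Reads `kubectl top pods --no-headers` output on stdin.
mem_over() {
  awk -v t="${1:-512}" '$3 ~ /Mi$/ {
    m = $3
    sub(/Mi$/, "", m)            # "900Mi" -> "900"
    if (m + 0 >= t) print $1, $3
  }'
}

# Usage: kubectl top pods -n [namespace] --no-headers | mem_over 512
```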

7. How to locate the problem if the service cannot start
Common steps:

  • Check the pod status and events
  • Check the pod logs
  • Diagnose the problem based on what they show
sudo kubectl describe pods [pod] -n [namespace]
sudo kubectl logs -f [pod] -n [namespace]

8. Exposing a service port externally
Note that under normal circumstances, a production environment does not expose service ports externally; services communicate through ClusterIP.

sudo kubectl get svc -n [namespace] # find the service name
sudo kubectl edit svc [svc] -n [namespace] # edit the service and change its type from ClusterIP to NodePort
sudo kubectl get svc [svc] -n [namespace] # view the exposed port; by default NodePorts fall in the 30000-32767 range
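
For a NodePort service, the PORT(S) column of kubectl get svc has the form port:nodePort/protocol, for example 8080:31234/TCP. The sketch below extracts just the node ports from such a value; the helper name node_ports is hypothetical and the sample ports are made up for illustration.

```shell
# Extract node ports from a PORT(S) value such as "8080:31234/TCP,9090:30001/TCP".
# Entries without a node port (e.g. "80/TCP" on a ClusterIP service) print nothing.
node_ports() {
  tr ',' '\n' | sed -n 's/^[0-9]*:\([0-9]*\)\/.*$/\1/p'
}

# Usage: echo "8080:31234/TCP" | node_ports
#
# Non-interactive alternative to `kubectl edit` for switching the type:
#   kubectl patch svc [svc] -n [namespace] -p '{"spec":{"type":"NodePort"}}'
```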

Origin blog.csdn.net/daohangtaiqian/article/details/131698342