This article is a guide to using kubectl for Kubernetes diagnostics. The author lists 100 kubectl commands that are useful for diagnosing issues in Kubernetes clusters. The topics covered include, but are not limited to:
- Cluster information
- Pod diagnostics
- Service diagnostics
- Deployment diagnostics
- Network diagnostics
- Persistent Volume and Persistent Volume Claim diagnostics
- Resource usage
- Security and authorization
- Node troubleshooting
- Other diagnostic commands: The article also covers many other commands, such as resource scaling and autoscaling, Job and CronJob diagnostics, Pod affinity and anti-affinity rules, RBAC and security, service account diagnostics, node draining and uncordoning, resource cleanup, etc.
Cluster information:
- Show Kubernetes version:
kubectl version
- Display cluster information:
kubectl cluster-info
- List all nodes in the cluster:
kubectl get nodes
- View details of a specific node:
kubectl describe node <node-name>
- List all namespaces:
kubectl get namespaces
- List all pods in all namespaces:
kubectl get pods --all-namespaces
Pod diagnostics:
- List pods in a specific namespace:
kubectl get pods -n <namespace>
- View details of a Pod:
kubectl describe pod <pod-name> -n <namespace>
- View Pod logs:
kubectl logs <pod-name> -n <namespace>
- Follow (stream) Pod logs:
kubectl logs -f <pod-name> -n <namespace>
- Execute a command in a Pod:
kubectl exec -it <pod-name> -n <namespace> -- <command>
Pod health check:
- Check Pod readiness:
kubectl get pods <pod-name> -n <namespace> -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
- Check Pod events:
kubectl get events -n <namespace> --field-selector involvedObject.name=<pod-name>
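When a namespace holds many pods, it can help to narrow the listing to the ones that need attention. The following is a minimal sketch of one way to do that, not part of the original command list: unhealthy_pods is a hypothetical helper name, and the filter assumes the default column order of kubectl get pods (NAME, READY, STATUS).

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints pods that are not Running or whose READY count is incomplete.
# Completed pods (finished Jobs) are treated as healthy and skipped.
unhealthy_pods() {
  awk '{
    split($2, ready, "/")            # READY column, e.g. "1/2"
    if ($3 == "Completed") next
    if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
  }'
}

# Usage against a live cluster:
#   kubectl get pods -n <namespace> --no-headers | unhealthy_pods
```

Each pod this prints is a candidate for the kubectl describe pod and kubectl logs commands above.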
Service diagnostics:
- List all services in the namespace:
kubectl get svc -n <namespace>
- View details of a service:
kubectl describe svc <service-name> -n <namespace>
Deployment diagnostics:
- List all Deployments in the namespace:
kubectl get deployments -n <namespace>
- View details of a Deployment:
kubectl describe deployment <deployment-name> -n <namespace>
- View rollout status:
kubectl rollout status deployment/<deployment-name> -n <namespace>
- View rollout history:
kubectl rollout history deployment/<deployment-name> -n <namespace>
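The two rollout commands are often used together: wait for a rollout for a bounded time and, if it does not finish, look at the revision history. The sketch below shows one possible combination; check_rollout is a hypothetical function name, and the 60s default timeout is an arbitrary assumption.

```shell
# Hypothetical helper: wait for a Deployment rollout, and show the
# revision history if it does not complete within the timeout.
# $1 = deployment name, $2 = namespace, $3 = timeout (assumed default: 60s)
check_rollout() {
  if kubectl rollout status "deployment/$1" -n "$2" --timeout="${3:-60s}"; then
    echo "rollout of $1 complete"
  else
    echo "rollout of $1 did not finish; revision history:"
    kubectl rollout history "deployment/$1" -n "$2"
    return 1
  fi
}

# Usage:
#   check_rollout <deployment-name> <namespace> 120s
```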
StatefulSet diagnostics:
- List all StatefulSets in the namespace:
kubectl get statefulsets -n <namespace>
- View the details of a StatefulSet:
kubectl describe statefulset <statefulset-name> -n <namespace>
ConfigMap and Secret diagnostics:
- List ConfigMaps in a namespace:
kubectl get configmaps -n <namespace>
- View details of a ConfigMap:
kubectl describe configmap <configmap-name> -n <namespace>
- List Secrets in a namespace:
kubectl get secrets -n <namespace>
- View the details of a Secret:
kubectl describe secret <secret-name> -n <namespace>
Namespace diagnostics:
- View details of a namespace:
kubectl describe namespace <namespace-name>
Resource usage:
- Check a pod's resource usage:
kubectl top pod <pod-name> -n <namespace>
- Check node resource usage:
kubectl top nodes
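The output of kubectl top can be filtered to find the heaviest consumers. The following sketch assumes that kubectl top pod reports memory in Mi in its third column, which is the usual format; pods_over_memory is a hypothetical helper name and the threshold is whatever value you pass in.

```shell
# Hypothetical helper: reads `kubectl top pod` output on stdin and prints
# pods whose memory usage is above the given threshold in Mi.
# $1 = threshold in Mi. Rows without an "<n>Mi" value (such as the
# header row) are ignored.
pods_over_memory() {
  awk -v limit="$1" '$3 ~ /^[0-9]+Mi$/ {
    mem = $3
    sub(/Mi$/, "", mem)
    if (mem + 0 > limit + 0) print $1, $3
  }'
}

# Usage (requires metrics server):
#   kubectl top pod -n <namespace> | pods_over_memory 500
```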
Network diagnostics:
- Display the IP addresses of Pods in the namespace:
kubectl get pods -n <namespace> -o custom-columns=POD:metadata.name,IP:status.podIP --no-headers
- List all network policies in the namespace:
kubectl get networkpolicies -n <namespace>
- View details of a network policy:
kubectl describe networkpolicy <network-policy-name> -n <namespace>
Persistent Volume (PV) and Persistent Volume Claim (PVC) diagnostics:
- List PVs:
kubectl get pv
- View details of a PV:
kubectl describe pv <pv-name>
- List PVCs in a namespace:
kubectl get pvc -n <namespace>
- View PVC details:
kubectl describe pvc <pvc-name> -n <namespace>
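A frequent storage problem is a volume or claim that never reaches the Bound phase. As a rough sketch, the listing can be reduced to name and phase with custom columns and then filtered; unbound_volumes is a hypothetical helper name.

```shell
# Hypothetical helper: reads "NAME PHASE" rows on stdin and prints the
# ones whose phase is anything other than Bound (for example Pending,
# Available, Released, or Failed).
unbound_volumes() {
  awk '$2 != "Bound" { print $1, $2 }'
}

# Usage:
#   kubectl get pv --no-headers \
#     -o custom-columns=NAME:.metadata.name,PHASE:.status.phase | unbound_volumes
#   kubectl get pvc -n <namespace> --no-headers \
#     -o custom-columns=NAME:.metadata.name,PHASE:.status.phase | unbound_volumes
```

Anything this prints can then be inspected with kubectl describe pv or kubectl describe pvc to read the related events.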
Node diagnostics:
- Get a list of Pods running on a specific node:
kubectl get pods --field-selector spec.nodeName=<node-name> -n <namespace>
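To see how pods are spread across nodes rather than inspecting one node at a time, the node name of every pod can be counted. This is only a sketch; count_pods_per_node is a hypothetical helper name, and it expects one node name per input line.

```shell
# Hypothetical helper: reads one node name per line on stdin and prints
# "<count> <node>" rows, busiest node first.
count_pods_per_node() {
  awk '{ count[$1]++ } END { for (node in count) print count[node], node }' | sort -rn
}

# Usage (pods that are not yet scheduled show up as <none>):
#   kubectl get pods --all-namespaces --no-headers \
#     -o custom-columns=NODE:.spec.nodeName | count_pods_per_node
```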
Resource quotas and limits:
- List resource quotas in a namespace:
kubectl get resourcequotas -n <namespace>
- View details of a resource quota:
kubectl describe resourcequota <resource-quota-name> -n <namespace>
Custom resource definition (CRD) diagnostics:
- List custom resources in a namespace:
kubectl get <custom-resource-name> -n <namespace>
- View custom resource details:
kubectl describe <custom-resource-name> <custom-resource-instance-name> -n <namespace>
When using these commands, replace placeholders such as <namespace>, <pod-name>, <service-name>, <deployment-name>, <statefulset-name>, <configmap-name>, <secret-name>, <namespace-name>, <pv-name>, <pvc-name>, <node-name>, <network-policy-name>, <resource-quota-name>, <custom-resource-name>, and <custom-resource-instance-name> with the actual values for your cluster.
These commands should help you diagnose your Kubernetes cluster and the applications running in it.
Resource scaling and autoscaling:
- Deployment scaling:
kubectl scale deployment <deployment-name> --replicas=<replica-count> -n <namespace>
- Set the automatic scaling of the Deployment:
kubectl autoscale deployment <deployment-name> --min=<min-pods> --max=<max-pods> --cpu-percent=<cpu-percent> -n <namespace>
- Check the Horizontal Pod Autoscaler (HPA) status:
kubectl get hpa -n <namespace>
Job and CronJob diagnostics:
- List all jobs in a namespace:
kubectl get jobs -n <namespace>
- View details of a job:
kubectl describe job <job-name> -n <namespace>
- List all cron jobs in the namespace:
kubectl get cronjobs -n <namespace>
- View details of a cron job:
kubectl describe cronjob <cronjob-name> -n <namespace>
Storage volume diagnostics:
- List persistent volumes (PVs) sorted by capacity:
kubectl get pv --sort-by=.spec.capacity.storage
- View the reclaim policy of a PV:
kubectl get pv <pv-name> -o=jsonpath='{.spec.persistentVolumeReclaimPolicy}'
- List all storage classes:
kubectl get storageclasses
Ingress and service mesh diagnostics:
- List all Ingresses in the namespace:
kubectl get ingress -n <namespace>
- View the details of an Ingress:
kubectl describe ingress <ingress-name> -n <namespace>
- List all VirtualServices (Istio) in the namespace:
kubectl get virtualservices -n <namespace>
- View details of a VirtualService (Istio):
kubectl describe virtualservice <virtualservice-name> -n <namespace>
Troubleshooting Pod Networking:
- Run a network diagnostic pod (such as busybox) to debug:
kubectl run -it --rm --restart=Never --image=busybox net-debug-pod -- /bin/sh
- Test connectivity from a Pod to a specific endpoint:
kubectl exec -it <pod-name> -n <namespace> -- curl <endpoint-url>
- Trace the network path from one Pod to another:
kubectl exec -it <source-pod-name> -n <namespace> -- traceroute <destination-pod-ip>
- Check the Pod's DNS resolution:
kubectl exec -it <pod-name> -n <namespace> -- nslookup <domain-name>
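The busybox image does not ship curl, so a throwaway debug pod based on it has to use wget instead. The sketch below probes a Service by its in-cluster DNS name; svc_url, probe_service, and the pod name net-probe are hypothetical, and the cluster domain is assumed to be the default cluster.local.

```shell
# Hypothetical helper: build the in-cluster URL of a Service.
# Assumes the default cluster domain "cluster.local".
# $1 = service name, $2 = namespace, $3 = port
svc_url() {
  printf 'http://%s.%s.svc.cluster.local:%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: fetch that URL from a temporary busybox pod, which
# exercises both DNS resolution and connectivity. The pod is removed when
# the command exits; wget gives up after 5 seconds.
probe_service() {
  kubectl run net-probe --rm -i --restart=Never --image=busybox -n "$2" \
    -- wget -qO- -T 5 "$(svc_url "$1" "$2" "$3")"
}

# Usage:
#   probe_service <service-name> <namespace> <port>
```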
Configuration and resource verification:
- Validate a Kubernetes YAML file without applying it:
kubectl apply --dry-run=client -f <yaml-file>
- Check whether a service account is allowed to list pods:
kubectl auth can-i list pods --as=system:serviceaccount:<namespace>:<serviceaccount-name>
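kubectl auth can-i answers one verb at a time, so checking a service account against several verbs means repeating the command. A minimal sketch of such a loop follows; check_sa_permissions is a hypothetical function name, and the verb list is only an illustrative selection.

```shell
# Hypothetical helper: report whether a service account may perform a few
# common verbs on pods in its own namespace.
# $1 = namespace, $2 = service account name
check_sa_permissions() {
  sa="system:serviceaccount:$1:$2"
  for verb in get list watch create delete; do
    printf '%s pods: ' "$verb"
    # can-i prints yes or no and exits non-zero on "no"; keep looping.
    kubectl auth can-i "$verb" pods --as="$sa" -n "$1" || true
  done
}

# Usage:
#   check_sa_permissions <namespace> <serviceaccount-name>
```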
RBAC and security:
- List roles and role bindings in a namespace:
kubectl get roles,rolebindings -n <namespace>
- View details of a role or role binding:
kubectl describe role <role-name> -n <namespace>
Service account diagnostics:
- List the service accounts in the namespace:
kubectl get serviceaccounts -n <namespace>
- View a service account details:
kubectl describe serviceaccount <serviceaccount-name> -n <namespace>
Node draining and uncordoning:
- Drain a node for maintenance:
kubectl drain <node-name> --ignore-daemonsets
- Uncordon a node so that pods can be scheduled on it again:
kubectl uncordon <node-name>
Resource cleanup:
- Forcefully delete a pod (not recommended):
kubectl delete pod <pod-name> -n <namespace> --grace-period=0 --force
Pod affinity and anti-affinity:
- List all affinity rules (node affinity, pod affinity, and pod anti-affinity) for a pod:
kubectl get pod <pod-name> -n <namespace> -o=jsonpath='{.spec.affinity}'
- List pod anti-affinity rules for a pod:
kubectl get pod <pod-name> -n <namespace> -o=jsonpath='{.spec.affinity.podAntiAffinity}'
Pod Security Policy (PSP):
- List all Pod security policies (if enabled; PSP was removed in Kubernetes 1.25):
kubectl get psp
Events:
- View recent cluster events:
kubectl get events --sort-by=.metadata.creationTimestamp
- Filter events by a specific namespace:
kubectl get events -n <namespace>
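In a busy namespace the event list is long, and grouping Warning events by reason gives a quicker overview. The following is a sketch under the assumption that events are printed as "TYPE REASON" rows via custom columns; warning_reasons is a hypothetical helper name.

```shell
# Hypothetical helper: reads "TYPE REASON" rows on stdin and prints how
# often each Warning reason occurs, most frequent first.
warning_reasons() {
  awk '$1 == "Warning" { count[$2]++ }
       END { for (reason in count) print count[reason], reason }' | sort -rn
}

# Usage:
#   kubectl get events -n <namespace> --no-headers \
#     -o custom-columns=TYPE:.type,REASON:.reason | warning_reasons
```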
Node troubleshooting:
- Check the node status:
kubectl describe node <node-name> | grep Conditions -A5
- List node capacity and allocatable resources:
kubectl describe node <node-name> | grep -E "Capacity|Allocatable"
Ephemeral containers (Kubernetes 1.18+):
- Run a temporary debug container:
kubectl debug -it <pod-name> -n <namespace> --image=<debug-image> -- /bin/sh
Resource metrics (requires metrics server):
- Get the CPU and memory usage of a Pod:
kubectl top pod -n <namespace>
kubelet diagnostics:
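- View the kubelet log on the node (this assumes the kubelet runs as a pod named kubelet-<node-name> in kube-system; on most clusters it is a systemd service instead, and its log is read on the node itself with journalctl -u kubelet):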
kubectl logs -n kube-system kubelet-<node-name>
Advanced debugging using Telepresence:
- Swap a Deployment for a local process using Telepresence (legacy Telepresence 1 syntax):
telepresence --namespace <namespace> --swap-deployment <deployment-name>
Kubeconfig and context:
- List available contexts:
kubectl config get-contexts
- Switch to a different context:
kubectl config use-context <context-name>
Pod security standard (PodSecurity admission controller):
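- List PodSecurityPolicy (PSP) objects without the header row (PSP was removed in Kubernetes 1.25, and this listing does not by itself report Pod Security admission violations):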
kubectl get psp -A | grep -vE 'NAME|REVIEWED'
Pod Disruption Budget (PDB) diagnostics:
- List all PDBs in the namespace:
kubectl get pdb -n <namespace>
- View details of a PDB:
kubectl describe pdb <pdb-name> -n <namespace>
Resource lock diagnostics (leader-election locks, which are stored as Lease objects):
- List Leases in a namespace:
kubectl get leases -n <namespace>
Service endpoints and DNS:
- List service endpoints for a service:
kubectl get endpoints <service-name> -n <namespace>
- Check the DNS configuration in the Pod:
kubectl exec -it <pod-name> -n <namespace> -- cat /etc/resolv.conf
Custom metrics (Prometheus, Grafana):
- Query Prometheus metrics: use kubectl port-forward to access the Prometheus and Grafana services and query custom metrics.
Pod priority and preemption:
- List priority classes:
kubectl get priorityclasses
Pod overhead (Kubernetes 1.18+):
- List the overhead in a pod:
kubectl get pod <pod-name> -n <namespace> -o=jsonpath='{.spec.overhead}'
Storage volume snapshot diagnostics (if using storage volume snapshots):
- List storage volume snapshots:
kubectl get volumesnapshot -n <namespace>
- View storage volume snapshot details:
kubectl describe volumesnapshot <snapshot-name> -n <namespace>
Resource serialization diagnostics:
- Print the full JSON representation of a Kubernetes resource:
kubectl get <resource-type> <resource-name> -n <namespace> -o=json
Node taint:
- List node taints:
kubectl describe node <node-name> | grep Taints
Mutating and validating webhook configurations:
- List mutating webhook configurations:
kubectl get mutatingwebhookconfigurations
- List validating webhook configurations:
kubectl get validatingwebhookconfigurations
Pod network policy:
- List pod network policies in a namespace:
kubectl get networkpolicies -n <namespace>
Node conditions (Kubernetes 1.17+):
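- Show the Ready condition of worker nodes using custom columns (the expression is quoted so that the shell does not interpret the parentheses and question mark; the label selector assumes worker nodes carry the node-role.kubernetes.io/worker label):
kubectl get nodes -o custom-columns='NODE:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status' -l 'node-role.kubernetes.io/worker='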
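For a quick pass over a large cluster, the plain node listing can be reduced to the nodes that are not simply Ready. This is a sketch only; nodes_needing_attention is a hypothetical helper name, and it assumes the default column order of kubectl get nodes (NAME, STATUS).

```shell
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on
# stdin and prints nodes whose STATUS is anything other than plain "Ready".
# Cordoned nodes ("Ready,SchedulingDisabled") are therefore included.
nodes_needing_attention() {
  awk '$2 != "Ready" { print $1, $2 }'
}

# Usage:
#   kubectl get nodes --no-headers | nodes_needing_attention
```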
Audit log:
- Retrieve audit logs (if enabled): Check the Kubernetes audit log configuration to learn the location of the audit logs.
Node operating system details:
- Get the operating system information of the node:
kubectl get node <node-name> -o jsonpath='{.status.nodeInfo.osImage}'
These commands should cover various diagnostic scenarios in Kubernetes. Make sure to replace placeholders like <namespace>, <pod-name>, <deployment-name>, etc. with actual values for your cluster and use case.