Problem: After installing Prometheus on Kubernetes, the kube-state-metrics container keeps reporting errors
Environment: Kubernetes 1.18
[root@k8s-master01 manifests]# kubectl logs -f kube-state-metrics-bdb8874fd-tnrrg -n monitoring -c kube-state-metrics
I0316 13:12:52.295699 1 main.go:86] Using default collectors
I0316 13:12:52.295788 1 main.go:98] Using all namespace
I0316 13:12:52.295798 1 main.go:139] metric white-blacklisting: blacklisting the following items:
W0316 13:12:52.295807 1 client_config.go:543] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0316 13:12:52.297186 1 main.go:184] Testing communication with server
F0316 13:13:22.298801 1 main.go:147] Failed to create client: error while trying to communicate with apiserver: Get https://10.96.0.1:443/version?timeout=32s: dial tcp 10.96.0.1:443: i/o timeout
Analysis:
First, the pod kube-state-metrics-bdb8874fd-tnrrg has three containers. The failing one is kube-state-metrics, and its failure causes the container to restart continuously.
According to the log, it cannot connect to the IP and port 10.96.0.1:443.
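To confirm which container in the pod is restarting, and which node the pod runs on, something like the following could be used. This is only a sketch: the helper names container_restarts and pod_node are made up for illustration, while the pod name and namespace are the ones from the log command above.

```shell
# Hypothetical helpers (the function names are illustrative, not part of kubectl).

# Print each container in a pod with its restart count, one per line.
# Arguments: pod name, namespace.
container_restarts() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\n"}{end}'
}

# Print the name of the node the pod is scheduled on.
# Arguments: pod name, namespace.
pod_node() {
  kubectl get pod "$1" -n "$2" -o jsonpath='{.spec.nodeName}'
}

# Usage, with the pod from the log above:
#   container_restarts kube-state-metrics-bdb8874fd-tnrrg monitoring
#   pod_node kube-state-metrics-bdb8874fd-tnrrg monitoring
```

A restart count that keeps growing for one container, while the other two stay at zero, would match the behavior described here.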
[root@k8s-master01 ~]# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
csi-metrics-cephfsplugin ClusterIP 10.96.218.238 <none> 8080/TCP 21h
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 83d
nginx ClusterIP 10.96.215.251 <none> 80/TCP 16d
As the svc output shows, 10.96.0.1:443 is the ClusterIP and port of the kubernetes Service, that is, the in-cluster address of the apiserver. Yet the container reported that it could not connect to it.
Test the connection:
The port is exposed by the Service, but the connection cannot be established. The "i/o timeout" in the error means the TCP connection attempt timed out before the apiserver answered.
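The original test command is not shown here. One way to reproduce such a check is a small TCP probe with a hard timeout, sketched below. It assumes bash (for its /dev/tcp redirection) and the coreutils timeout command are available on the machine where it runs, for example on the node hosting the pod. The function name check_tcp is hypothetical.

```shell
# Probe a TCP endpoint with a hard timeout.
# Arguments: host, port, timeout in seconds (default 5).
# Returns 0 if the connection opens, non-zero if it is refused or times out.
check_tcp() {
  local host="$1" port="$2" secs="${3:-5}"
  # bash treats /dev/tcp/HOST/PORT in a redirection as "open a TCP socket".
  if timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
    echo "reachable: $host:$port"
  else
    echo "unreachable: $host:$port"
    return 1
  fi
}

# Usage, with the address from the error message:
#   check_tcp 10.96.0.1 443 5
```

Running the same probe on k8s-node01 and k8s-node02 would show whether the timeout is specific to one node.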
Solution:
kube-state-metrics was originally scheduled on k8s-node02. Either remove it from that node so that it runs elsewhere, or rebuild k8s-node02.
On inspection, the CPU usage of node02 was somewhat high, so I assigned the pod directly to k8s-node01.
After testing again, the i/o timeout error no longer appears.
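How the pod was assigned to k8s-node01 is not shown. One possible way is to add a nodeSelector to the Deployment, as in the sketch below. The Deployment name kube-state-metrics is an assumption inferred from the pod name prefix, and both function names are hypothetical.

```shell
# Build a merge patch that adds a nodeSelector on the node's hostname label
# (kubernetes.io/hostname is a standard label set on every node).
# Argument: node name.
node_selector_patch() {
  printf '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/hostname":"%s"}}}}}' "$1"
}

# Apply the patch; the Deployment then rolls out a replacement pod on that node.
# Arguments: deployment name, namespace, node name.
pin_deployment_to_node() {
  kubectl patch deployment "$1" -n "$2" --type merge -p "$(node_selector_patch "$3")"
}

# Usage (Deployment name assumed from the pod name prefix):
#   pin_deployment_to_node kube-state-metrics monitoring k8s-node01
#   kubectl get pod -n monitoring -o wide   # confirm the node of the new pod
```

Pinning to a single node works around the symptom. If k8s-node02 still cannot reach 10.96.0.1:443, other pods scheduled there may hit the same timeout.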