[Cloud native • Monitoring] Prometheus-based cloud native cluster monitoring (theory + practice)-01

Foreword

"The author has built a temporary environment on the public cloud, you can log in to experience it first:"

http://124.222.45.207:17000/login
Account: root / root.2020

Cloud Native Monitoring Challenges

Prometheus is written in Go and has been open source from the start. In 2016 it became the second project to join CNCF, after Kubernetes. Its rise in recent years is inseparable from the growing adoption of cloud native, and it has become the de facto standard for monitoring in the cloud-native ecosystem.

Zabbix, long the dominant traditional monitoring tool, faces the following main challenges when it comes to cloud-native monitoring:

  1. A growing number of monitored objects: traditional monitoring works at the granularity of a single application, combined with compute, storage, and network infrastructure monitoring for operations support. Under a containerized microservice architecture, however, the monitoring granularity goes down to the container/Pod or even the microservice API level, so the number of monitored objects grows exponentially compared with a monolithic application.

  2. Massive monitoring indicators:

  • There are many types of resource objects, such as: Container, Pod, Service, Deployment, ReplicaSet, Endpoint, Ingress, PV, PVC, etc.;

  • A cloud-native cluster may have tens of thousands or even hundreds of thousands of Pods;

  • A cloud-native cluster may also have Services, Endpoints, Deployments, Ingresses, etc. on the order of tens of thousands;

  • Such a volume of monitoring indicators places high demands on write and query performance, as well as on storage space efficiency, in order to handle these monitored resources.

  3. Frequent dynamic changes of objects: resource objects scale in and out, their life cycles are greatly shortened, and some objects are created and destroyed within moments. This brings two problems:

  • Scrape targets can no longer be defined with traditional static configuration; the system must detect changes promptly through a service discovery mechanism and adjust quickly;

  • The greatly shortened object life cycles caused by scaling in and out easily lead to a rapid explosion in the number of time series, which degrades the performance of the whole monitoring system over time.

Zabbix appeared early, before containers were born, so its container support is naturally weak. Prometheus's TSDB time-series storage engine and rich service discovery mechanisms, on the other hand, are practically tailor-made for cloud native. Prometheus has become the standard for container monitoring and will remain widely used for the foreseeable future.

Cloud Native Monitoring Solution

Kubernetes cloud-native clusters are very complex. In summary, we mainly focus on the following five categories of indicators:

  1. Container basic resource indicators: performance indicators of the environment a component runs in. In traditional scenarios that environment is the host; in a cloud-native environment it is the container. Just as we watch a host's CPU, memory, storage, disk IO and network IO, we watch the corresponding container resource indicators: container CPU usage, memory usage, storage space, disk read/write IO, and network IO.

  2. K8s resource object indicators: containers are the lowest-level runtime components. As the platform that schedules and coordinates these containers, Kubernetes abstractly defines many resource objects, such as Pod, Service, Deployment, ReplicaSet, DaemonSet, Ingress, StatefulSet, ConfigMap, ServiceAccount, etc. Resource object indicators monitor the state and metadata of these Kubernetes objects.

  3. K8s service component indicators: Kubernetes itself is a complex cluster with many components of its own, such as the api-server, etcd, kube-scheduler, kube-controller-manager and coredns components running on the master nodes, and the kubelet and kube-proxy components on the worker nodes. Operators of a cloud-native cluster must watch the health of these core components; a performance bottleneck or crash in any of them can degrade or even take down the whole cluster.

  4. K8s cluster Node indicators: all cluster components and business containers ultimately run on Nodes, so the performance and failures of Nodes have a large impact on the whole cluster. We therefore also need to pay close attention to Node performance indicators.

  5. Cloud-native upper-layer business indicators: the above are mainly low-level cluster indicators. As a platform, a cloud-native cluster naturally also hosts many business components, such as PaaS middleware and business application services, and these upper-layer components need their own indicator monitoring as well.

Environmental preparation

To monitor the cloud-native indicators above, we first need to prepare the Prometheus and Grafana environment.

Prometheus deployment

1. For ease of management, we will install all monitoring-related resource objects under the monitoring namespace. If it does not exist yet, create it first:

[root@k8s-01 prometheus]# kubectl create ns monitoring 
namespace/monitoring created

2. To manage the configuration conveniently, we keep the prometheus.yml configuration file in a ConfigMap:

# prometheus-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 15s
    scrape_configs:
    - job_name: 'prometheus'
      static_configs:
      - targets: ['localhost:9090']

For now we only configure Prometheus to monitor itself, and create the resource object directly:

[root@k8s-01 prometheus]# kubectl apply -f prometheus-config.yaml 
configmap/prometheus-config created

The configuration file is created. If we have new resources to be monitored in the future, we only need to update the above ConfigMap object.
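
Once the Deployment below is running (it enables --web.enable-lifecycle and also carries a configmap-reload sidecar that reloads automatically), a manual reload after editing the ConfigMap is just an HTTP POST. A minimal sketch; the address placeholders stand for whatever can reach the Prometheus Pod, for example the NodePort Service created later:

# re-apply the edited ConfigMap (it takes a short while to propagate into the Pod)
kubectl apply -f prometheus-config.yaml

# then ask Prometheus to reload its configuration (requires --web.enable-lifecycle)
curl -X POST http://<node-ip>:<nodeport>/-/reload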

3. Now let's create the Prometheus Pod via a Deployment:

# prometheus-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    app: prometheus
spec:
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      containers:
        - image: prom/prometheus:v2.31.1
          name: prometheus
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus" # 指定tsdb数据路径
            - "--storage.tsdb.retention.time=24h"
            - "--web.enable-admin-api" # 控制对admin HTTP API的访问,其中包括删除时间序列等功能
            - "--web.enable-lifecycle" # 支持热更新,直接执行localhost:9090/-/reload立即生效
          ports:
            - containerPort: 9090
              name: http
          volumeMounts:
            - mountPath: "/etc/prometheus"
              name: config-volume
            - mountPath: "/prometheus"
              name: data
          resources:
            requests:
              cpu: 200m
              memory: 1024Mi
            limits:
              cpu: 200m
              memory: 1024Mi
        - image: jimmidyson/configmap-reload:v0.4.0  # sidecar: reloads Prometheus when the ConfigMap changes
          name: prometheus-reload
          securityContext:
            runAsUser: 0
          args:
            - "--volume-dir=/etc/config"
            - "--webhook-url=http://localhost:9090/-/reload"
          volumeMounts:
            - mountPath: "/etc/config"
              name: config-volume
          resources:
            requests:
              cpu: 100m
              memory: 50Mi
            limits:
              cpu: 100m
              memory: 50Mi   
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: prometheus-data
        - configMap:
            name: prometheus-config
          name: config-volume

Persistence

In addition, for better performance and data persistence, we use a LocalPV for the Prometheus data here. **Note: do not persist the data on NFS (the TSDB storage engine does not support NFS and there is a risk of data loss).** The data directory was set above with --storage.tsdb.path=/prometheus. Now create the PVC resource object as shown below; note that it is a LocalPV with node affinity to k8s-02:

#prometheus-storage.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-local
  labels:
    app: prometheus
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 20Gi
  storageClassName: local-storage
  local:
    path: /data/k8s/prometheus  # make sure this directory exists on the host node
  persistentVolumeReclaimPolicy: Retain
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8s-02
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: prometheus
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: local-storage
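
Before applying this manifest, make sure the local path actually exists on the node the PV is pinned to (k8s-02 here); a short sketch of the steps:

# on node k8s-02: create the directory backing the LocalPV
mkdir -p /data/k8s/prometheus

# back on the control node: create the StorageClass, PV and PVC
kubectl apply -f prometheus-storage.yaml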

The meaning of volumeBindingMode: WaitForFirstConsumer is delayed binding: even when a PV satisfying the PVC exists, it is not bound immediately. The reason is that a Pod references the PVC; if the PVC were bound first and the Pod were then scheduled onto a different node, that node would not have the PV and the Pod could not start. Conversely, even if a node does hold a suitable PV, scheduling constraints might prevent the Pod from running there. The advantage of delayed binding is that Pod scheduling can take the distribution of volumes into account: when a Pod using the PVC is about to be scheduled, the scheduler looks at where the required local PV lives, schedules the Pod onto that node, then binds the PVC and finally mounts it into the Pod, guaranteeing that the Pod lands on the node that holds the local PV. In short, binding of the PVC is postponed until a Pod that uses it shows up in the scheduler, and the binding decision is made as part of that scheduling.

[root@k8s-01 prometheus]# kubectl get pvc -n monitoring
NAME              STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS    AGE
prometheus-data   Pending                                      local-storage   4m9s
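
The Pending status is expected with WaitForFirstConsumer: the claim only binds once a Pod using it is scheduled. The PVC events confirm this:

kubectl describe pvc prometheus-data -n monitoring
# Events should show something like "waiting for first consumer to be created before binding"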

RBAC permissions

Since Prometheus needs to access some Kubernetes resource objects, RBAC authorization must be configured. Here we use a ServiceAccount named prometheus:

# rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
      - services
      - endpoints
      - pods
      - nodes/proxy
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "extensions"
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - configmaps
      - nodes/metrics
    verbs:
      - get
  - nonResourceURLs:
      - /metrics
    verbs:
      - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: monitoring

Since the resources we want to scrape may exist in any namespace, we use a ClusterRole here. It is worth noting the nonResourceURLs attribute in the permission rules: it declares permissions for non-resource URLs such as /metrics, something we have rarely encountered before. Now create the resource objects above:

[root@k8s-01 prometheus]# kubectl apply -f rbac.yaml 
serviceaccount/prometheus created
clusterrole.rbac.authorization.k8s.io/prometheus created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
[root@k8s-01 prometheus]# kubectl get sa -n monitoring
NAME         SECRETS   AGE
default      1         30m
prometheus   1         21s
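
To double-check that the ClusterRoleBinding grants what Prometheus needs, kubectl auth can-i can impersonate the service account; for example:

kubectl auth can-i list nodes --as=system:serviceaccount:monitoring:prometheus
kubectl auth can-i watch pods --as=system:serviceaccount:monitoring:prometheus
kubectl auth can-i get /metrics --as=system:serviceaccount:monitoring:prometheus
# each command should print "yes" if the RBAC objects above were created correctly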

Now we can create the Prometheus Deployment:

[root@k8s-01 prometheus]# kubectl apply -f prometheus-deploy.yaml
deployment.apps/prometheus created
[root@k8s-01 prometheus]# kubectl get pod -n monitoring                                
NAME                         READY   STATUS             RESTARTS   AGE
prometheus-5cdf864d9-z9bg4   0/1     CrashLoopBackOff   2          2m56s
[root@k8s-01 prometheus]# kubectl logs -f prometheus-5cdf864d9-z9bg4 -n monitoring
ts=2023-06-06T15:27:55.111Z caller=main.go:444 level=info msg="Starting Prometheus" version="(version=2.31.1, branch=HEAD, revision=411021ada9ab41095923b8d2df9365b632fd40c3)"
ts=2023-06-06T15:27:55.111Z caller=main.go:449 level=info build_context="(go=go1.17.3, user=root@9419c9c2d4e0, date=20211105-20:35:02)"
ts=2023-06-06T15:27:55.111Z caller=main.go:450 level=info host_details="(Linux 3.10.0-1127.10.1.el7.x86_64 #1 SMP Wed Jun 3 14:28:03 UTC 2020 x86_64 prometheus-5cdf864d9-z9bg4 (none))"
ts=2023-06-06T15:27:55.111Z caller=main.go:451 level=info fd_limits="(soft=1048576, hard=1048576)"
ts=2023-06-06T15:27:55.111Z caller=main.go:452 level=info vm_limits="(soft=unlimited, hard=unlimited)"
ts=2023-06-06T15:27:55.112Z caller=query_logger.go:87 level=error component=activeQueryTracker msg="Error opening query log file" file=/prometheus/queries.active err="open /prometheus/queries.active: permission denied"
panic: Unable to create mmap-ed active query log

goroutine 1 [running]:
github.com/prometheus/prometheus/promql.NewActiveQueryTracker({0x7ffc2fc58e01, 0xb}, 0x14, {0x34442c0, 0xc0001371d0})
        /app/promql/query_logger.go:117 +0x3d7
main.main()
        /app/cmd/prometheus/main.go:491 +0x6bbf
[root@k8s-01 prometheus]#

File Permissions

After creating the Pod we can see that it did not run successfully, and the error message open /prometheus/queries.active: permission denied appeared. This is because the Prometheus image runs as the user nobody, while the LocalPV directory we mounted from the host is owned by root:

[root@k8s-02 k8s]# ls -alh
total 0
drwxr-xr-x  4 root root  39 Jun 18 22:14 .
drwxr-xr-x. 5 root root  50 Jun  6 23:23 ..
drwxr-xr-x  6 root root 138 Jun 19 01:00 prometheus

So naturally there is a permission problem. We can fix it by setting a securityContext on the container and specifying runAsUser: 0 so that it runs as root:

containers:
  - image: prom/prometheus:v2.31.1
    name: prometheus
    securityContext:
      runAsUser: 0

You can also modify the data directory permissions by setting an initContainer:

......
initContainers:
- name: fix-permissions
  image: busybox
  command: [chown, -R, "nobody:nobody", /prometheus]
  volumeMounts:
  - name: data
    mountPath: /prometheus

Now we re-apply the Prometheus Deployment:

[root@k8s-01 prometheus]# kubectl apply -f prometheus-deploy.yaml 
deployment.apps/prometheus created
[root@k8s-01 prometheus]# kubectl get pod -n monitoring
NAME                          READY   STATUS    RESTARTS   AGE
prometheus-675dd5dc5b-ks9k4   1/1     Running   0          9s
[root@k8s-01 prometheus]# kubectl logs -f prometheus-675dd5dc5b-ks9k4 -n monitoring
ts=2023-06-06T15:33:41.415Z caller=main.go:444 level=info msg="Starting Prometheus" version="(version=2.31.1, branch=HEAD, revision=411021ada9ab41095923b8d2df9365b632fd40c3)"
ts=2023-06-06T15:33:41.415Z caller=main.go:449 level=info build_context="(go=go1.17.3, user=root@9419c9c2d4e0, date=20211105-20:35:02)"
ts=2023-06-06T15:33:41.418Z caller=web.go:542 level=info component=web msg="Start listening for connections" address=0.0.0.0:9090
ts=2023-06-06T15:33:41.500Z caller=main.go:839 level=info msg="Starting TSDB ..."
ts=2023-06-06T15:33:41.505Z caller=main.go:869 level=info msg="TSDB started"
ts=2023-06-06T15:33:41.505Z caller=main.go:996 level=info msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
ts=2023-06-06T15:33:41.506Z caller=main.go:1033 level=info msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml totalDuration=592.24µs db_storage=1.487µs remote_storage=6.209µs web_handler=900ns query_engine=1.736µs scrape=288.029µs scrape_sd=26.451µs notify=1.151µs notify_sd=1.487µs rules=3.679µs
ts=2023-06-06T15:33:41.506Z caller=main.go:811 level=info msg="Server is ready to receive web requests."

After the Pod is successfully created, in order to be able to access the prometheus Web UI service externally, we also need to create a Service object:

# prometheus-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    app: prometheus
spec:
  selector:
    app: prometheus
  type: NodePort
  ports:
    - name: web
      port: 9090
      targetPort: http

For convenience in testing, we create a NodePort-type Service here. Of course, we could also create an Ingress object and access it through a domain name (see the sketch after the Service is created):

[root@k8s-01 prometheus]# kubectl apply -f prometheus-svc.yaml 
service/prometheus created
[root@k8s-01 prometheus]# kubectl get svc -n monitoring  -owide 
NAME         TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE   SELECTOR
prometheus   NodePort   10.96.219.107   <none>        9090:32478/TCP   20s   app=prometheus
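
For reference, an Ingress for the same Service might look like the sketch below; the host name is a placeholder and an ingress controller (for example ingress-nginx) is assumed to be installed:

# prometheus-ingress.yaml (optional sketch)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus
  namespace: monitoring
spec:
  rules:
    - host: prometheus.example.com   # placeholder domain
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: prometheus
                port:
                  number: 9090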

Now we can access the Prometheus web UI at http://<any-node-IP>:32478:

[Figure: Prometheus web UI showing the prometheus scrape job]

This scrape job collects monitoring data from Prometheus itself. For example, if we select the process_resident_memory_bytes metric and click Execute, we see chart data similar to the following:

[Figure: graph of process_resident_memory_bytes in the Prometheus UI]

Grafana deployment

Prometheus has collected some monitoring metrics from the Kubernetes cluster, and we have tried querying data with PromQL statements and viewing it in the Prometheus dashboard. However, Prometheus's charting capability is clearly limited, so in general we still use Grafana for visualization; let's install Grafana into the cluster.

1. Create the manifest file:

# grafana-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: grafana-data
      containers:
        - name: grafana
          image: grafana/grafana:8.3.3
          imagePullPolicy: IfNotPresent
          securityContext:
            runAsUser: 0
          ports:
            - containerPort: 3000
              name: grafana
          env:
            - name: GF_SECURITY_ADMIN_USER
              value: admin
            - name: GF_SECURITY_ADMIN_PASSWORD
              value: admin
          readinessProbe:
            failureThreshold: 10
            httpGet:
              path: /api/health
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 60
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 30
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /api/health
              port: 3000
              scheme: HTTP
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          resources:
            limits:
              cpu: 400m
              memory: 1024Mi
            requests:
              cpu: 200m
              memory: 512Mi
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: storage
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitoring
spec:
  type: NodePort
  ports:
    - port: 3000
  selector:
    app: grafana
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: grafana-local
  labels:
    app: grafana
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 1Gi
  storageClassName: local-storage
  local:
    path: /data/k8s/grafana # make sure this directory exists on the node
  persistentVolumeReclaimPolicy: Retain
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8s-02
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-data
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: grafana
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: local-storage

Two important environment variables, GF_SECURITY_ADMIN_USER and GF_SECURITY_ADMIN_PASSWORD, configure Grafana's administrator user and password. Since Grafana stores dashboards and plugin data under the /var/lib/grafana directory, we declare a volume mount for that directory so the data is persisted.
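
Hard-coding the admin password as an environment value is fine for a demo; in a real cluster you may prefer to read it from a Secret instead. A sketch under that assumption (the Secret name and key below are made up for illustration):

# e.g. kubectl -n monitoring create secret generic grafana-admin --from-literal=password=<your-password>
env:
  - name: GF_SECURITY_ADMIN_USER
    value: admin
  - name: GF_SECURITY_ADMIN_PASSWORD
    valueFrom:
      secretKeyRef:
        name: grafana-admin   # hypothetical Secret created beforehand
        key: password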

2. Create and verify that the Pod starts normally:

[root@k8s-01 prometheus]# kubectl apply -f grafana-deploy.yaml
[root@k8s-01 prometheus]# kubectl get pod -n monitoring
NAME                          READY   STATUS    RESTARTS   AGE
grafana-7cfd74ccf5-crcnz      1/1     Running   0          3m57s
prometheus-675dd5dc5b-ks9k4   1/1     Running   0          11d
[root@k8s-01 prometheus]# kubectl get svc -n monitoring
NAME         TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
grafana      NodePort   10.96.57.144    <none>        3000:32052/TCP   13m
prometheus   NodePort   10.96.219.107   <none>        9090:32478/TCP   11d

3. Visit grafana: http://192.168.31.160:32052

[Figure: Grafana login page]

4. Add Prometheus data source:

[Figure: adding Prometheus as a Grafana data source]
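
If you prefer not to click through the UI, Grafana can also provision the data source from a file under /etc/grafana/provisioning/datasources (for example mounted from a ConfigMap); a minimal sketch, assuming the prometheus Service created above in the same monitoring namespace:

# datasource.yaml: Grafana data source provisioning file (sketch)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090   # Service DNS name within the monitoring namespace
    isDefault: true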

Container Basic Resource Metrics

When it comes to container monitoring, cAdvisor naturally comes to mind; I have previously shared how to deploy the cAdvisor component to monitor Docker containers. cAdvisor (Container Advisor) is an open-source container monitoring tool from Google that monitors the resource usage and performance of containers. It runs as a daemon that collects, aggregates, processes and exports information about running containers. Specifically, it records resource isolation parameters, historical resource usage, histograms of complete historical resource usage, and network statistics for each container.

cAdvisor monitors the container engine. Because it is so useful for monitoring, Kubernetes has built it into the kubelet component by default, so we do not need to deploy cAdvisor separately and can directly use the metrics endpoint that the kubelet exposes.

Through the API server, the cAdvisor data can be reached at /api/v1/nodes/<node>/proxy/metrics/cadvisor, but we do not recommend this method: it goes through the api-server proxy, which puts a lot of pressure on the api-server in large clusters. Instead, we can obtain the cAdvisor data by scraping the kubelet's /metrics/cadvisor endpoint directly.
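
Before wiring this into Prometheus you can spot-check the endpoints yourself; a quick sketch, assuming the node name k8s-01 and node IP 192.168.31.160 from this environment (kubectl create token requires Kubernetes >= 1.24; on older versions read the service account Secret instead):

# one-off check through the api-server proxy (fine for a single request, not for continuous scraping)
kubectl get --raw /api/v1/nodes/k8s-01/proxy/metrics/cadvisor | head

# or query the kubelet directly with the prometheus service account token
TOKEN=$(kubectl -n monitoring create token prometheus)
curl -sk -H "Authorization: Bearer ${TOKEN}" https://192.168.31.160:10250/metrics/cadvisor | head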

cAdvisor access

The service discovery role we use here is node: every node runs a kubelet, which naturally exposes the cAdvisor metrics. The configuration is as follows:

- job_name: "cadvisor"
  kubernetes_sd_configs:
    - role: node
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
      replacement: $1
    - replacement: /metrics/cadvisor # <nodeip>/metrics -> <nodeip>/metrics/cadvisor
      target_label: __metrics_path__
  # the approach below (going through the api-server proxy) is not recommended
  # - target_label: __address__
  #   replacement: kubernetes.default.svc:443
  # - target_label: __metrics_path__
  #   replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

The cadvisor monitoring access is successful as shown in the figure below:

[Figure: cadvisor targets shown as UP on the Prometheus targets page]
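
With cAdvisor data flowing in, the container basic resource indicators from section 1 can already be queried; a few example PromQL expressions (label names such as container and pod can vary slightly between kubelet/cAdvisor versions):

# per-Pod container CPU usage in cores (excluding the pause/aggregate series)
sum(rate(container_cpu_usage_seconds_total{container!="", container!="POD"}[5m])) by (namespace, pod)

# per-Pod working-set memory
sum(container_memory_working_set_bytes{container!="", container!="POD"}) by (namespace, pod)

# per-Pod network receive rate
sum(rate(container_network_receive_bytes_total[5m])) by (namespace, pod)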

K8s Node Metrics

Node Performance Key Indicators

1. Basic host information

node_uname_info: basic host information, including architecture, hostname, operating system type, etc.

2. CPU usage:

(1 - avg(rate(node_cpu_seconds_total{mode="idle"}[$interval])) by (instance)) * 100

3. CPU load:

node_load1: 1-minute load average
node_load5: 5-minute load average
node_load15: 15-minute load average

4. Memory usage:

100 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100

5. Disk indicators:

node_filesystem_size_bytes: total filesystem size
node_filesystem_avail_bytes: available filesystem space

6. Disk I/O:

sum(rate(node_disk_read_bytes_total[1m]))     # disk bytes read per second
sum(rate(node_disk_written_bytes_total[1m]))  # disk bytes written per second

7. Network I/O:

sum(rate(node_network_receive_bytes_total[1m]))   # network IO (inbound)
sum(rate(node_network_transmit_bytes_total[1m]))  # network IO (outbound)

Node Monitoring Deployment

Performance monitoring of physical nodes is generally done with node_exporter, and we also use node_exporter here to monitor the cloud-native cluster nodes. Since metrics must be collected from every node, we deploy the service with a DaemonSet controller, so each node automatically runs one node-exporter Pod, and when nodes are added to or removed from the cluster the Pods scale along with them.

1. Create the DaemonSet manifest file node-exporter-daemonset.yaml:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: kube-system
  labels:
    app: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostPID: true
      hostIPC: true
      hostNetwork: true
      nodeSelector:
        kubernetes.io/os: linux
      containers:
        - name: node-exporter
          image: prom/node-exporter:v1.3.1
          args:
            - --web.listen-address=$(HOSTIP):9100
            - --path.procfs=/host/proc
            - --path.sysfs=/host/sys
            - --path.rootfs=/host/root
            - --no-collector.hwmon # disable some collectors we do not need
            - --no-collector.nfs
            - --no-collector.nfsd
            - --no-collector.nvme
            - --no-collector.dmi
            - --no-collector.arp
            - --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/containerd/.+|/var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)
            - --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
          ports:
            - containerPort: 9100
          env:
            - name: HOSTIP
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
          resources:
            requests:
              cpu: 150m
              memory: 200Mi
            limits:
              cpu: 300m
              memory: 400Mi
          securityContext:
            runAsNonRoot: true
            runAsUser: 65534
          volumeMounts:
            - name: proc
              mountPath: /host/proc
            - name: sys
              mountPath: /host/sys
            - name: root
              mountPath: /host/root
              mountPropagation: HostToContainer
              readOnly: true
      tolerations: # tolerate all taints so a Pod runs on every node
        - operator: "Exists"
      volumes:
        - name: proc
          hostPath:
            path: /proc
        - name: dev
          hostPath:
            path: /dev
        - name: sys
          hostPath:
            path: /sys
        - name: root
          hostPath:
            path: /

Since the data we want is the host's monitoring metrics and node-exporter runs inside a container, we need to apply some Pod-level security settings: here we add hostPID: true, hostIPC: true and hostNetwork: true so that the Pod uses the host's PID, IPC and network namespaces, which are precisely the namespaces that container isolation relies on.

In addition, hostPath volumes are used to mount the host's /dev, /proc and /sys directories into the container, because much of the node data we collect comes from files under these directories. For example, the top command shows the current CPU usage using data from /proc/stat, and the free command shows the current memory usage using data from /proc/meminfo.
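
As a quick illustration, the host's files are visible inside the node-exporter container under the /host mount prefix (the node-exporter image is busybox-based, so basic utilities like head are assumed to be available):

# view the host's CPU counters and memory info through the hostPath mounts
kubectl -n kube-system exec ds/node-exporter -- head -1 /host/proc/stat
kubectl -n kube-system exec ds/node-exporter -- head -3 /host/proc/meminfo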

2. Create the Pods via the DaemonSet controller:

kubectl apply -f  node-exporter-daemonset.yaml

3. Check whether the Pod is running normally:

[root@k8s-01 prometheus]# kubectl get pod -n kube-system -l app=node-exporter -owide
NAME                  READY   STATUS    RESTARTS   AGE     IP               NODE     NOMINATED NODE   READINESS GATES
node-exporter-5brqt   1/1     Running   0          3m35s   192.168.31.162   k8s-03   <none>           <none>
node-exporter-hqzbp   1/1     Running   0          3m35s   192.168.31.161   k8s-02   <none>           <none>
node-exporter-tlc7r   1/1     Running   0          3m35s   192.168.31.160   k8s-01   <none>           <none>

After the deployment completes, we can see a node-exporter Pod running on each of the three nodes with status Running, and node metrics can be fetched at PodIP:9100:

curl http://192.168.31.160:9100/metrics
curl http://192.168.31.161:9100/metrics
curl http://192.168.31.162:9100/metrics

Since we specified hostNetwork: true, the Pod IP is in fact the node IP, and the declared containerPort 9100 is also bound as port 9100 on each node:

[root@k8s-01 ~]# netstat -antp|grep 9100
tcp        0      0 192.168.31.160:9100     0.0.0.0:*               LISTEN      39239/node_exporter

4. Prometheus access configuration:

- job_name: kubernetes-nodes
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - source_labels: [__address__]
    regex: '(.*):10250'
    replacement: '${1}:9100'
    target_label: __address__
    action: replace
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)

After access is configured, you can see on the Prometheus targets page that collection is working normally:

[Figure: kubernetes-nodes targets shown as UP on the Prometheus targets page]

5. Import dashboard 8919 into Grafana; the Kubernetes cloud-native cluster node performance indicators are displayed on the template, as shown in the following figure:

[Figure: Grafana dashboard 8919 showing node performance metrics]




Origin blog.csdn.net/god_86/article/details/131298048