
Node Performance Monitoring for Kubernetes Cloud Native Monitoring

Overview

Prometheus was originally designed as an open-source monitoring and alerting tool for cloud-native applications. Before analyzing the Kubernetes service discovery protocol, let's first sort out how Prometheus connects to a cloud-native Kubernetes cluster for monitoring.

Monitoring a cloud-native Kubernetes cluster mainly involves three types of metrics: physical node metrics, pod & container resource metrics, and Kubernetes cluster resource metrics. There are relatively mature solutions for each of these three types, as shown in the figure below:

Environment information

The Kubernetes cluster environment I built is shown in the figure below; the rest of this article is demonstrated on this cluster:

node-exporter deployment

Physical node performance metrics are generally collected with node_exporter, which is provided officially by Prometheus to gather all kinds of operating metrics from server nodes; it currently covers almost all common monitoring points.

On a Kubernetes cluster we can deploy the service through a DaemonSet controller, so that every node in the cluster automatically runs one such Pod; when nodes are removed from or added to the cluster, the Pods are scaled accordingly.

1. Create the DaemonSet orchestration file node-exporter-daemonset.yaml:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: kube-system
  labels:
    name: node-exporter
spec:
  selector:
    matchLabels:
      name: node-exporter
  template:
    metadata:
      labels:
        name: node-exporter
    spec:
      hostPID: true
      hostIPC: true
      hostNetwork: true
      containers:
      - name: node-exporter
        image: prom/node-exporter
        ports:
        - containerPort: 9100
        resources:
          requests:
            cpu: 0.15
        securityContext:
          privileged: true
        args:
        - --path.procfs
        - /host/proc
        - --path.sysfs
        - /host/sys
        - --collector.filesystem.ignored-mount-points
        - '"^/(sys|proc|dev|host|etc)($|/)"'
        volumeMounts:
        - name: dev
          mountPath: /host/dev
        - name: proc
          mountPath: /host/proc
        - name: sys
          mountPath: /host/sys
        - name: rootfs
          mountPath: /rootfs
      tolerations:
      - key: "node-role.kubernetes.io/master"
        operator: "Exists"
        effect: "NoSchedule"
      volumes:
        - name: proc
          hostPath:
            path: /proc
        - name: dev
          hostPath:
            path: /dev
        - name: sys
          hostPath:
            path: /sys
        - name: rootfs
          hostPath:
            path: /
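Note: newer prom/node-exporter releases (roughly 1.3 and later) have renamed the filesystem collector flag; if the container fails to start with an unknown-flag error, the last two args can be replaced along these lines (a sketch, assuming a recent node_exporter image):

        - --collector.filesystem.mount-points-exclude
        - '^/(sys|proc|dev|host|etc)($|/)'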

2. Create the Pods through the DaemonSet controller:

kubectl create -f node-exporter-daemonset.yaml

3. Check whether the Pods are running normally:

[root@master k8s-demo]# kubectl get pod -n kube-system -owide
NAME                                       READY   STATUS    RESTARTS   AGE     IP               NODE     NOMINATED NODE   READINESS GATES
calico-kube-controllers-6c89d944d5-hg47n   1/1     Running   0          15d     10.100.219.68    master   <none>           <none>
calico-node-247w2                          1/1     Running   0          15d     192.168.52.151   master   <none>           <none>
calico-node-pt848                          1/1     Running   0          15d     192.168.52.152   node1    <none>           <none>
calico-node-z65m2                          1/1     Running   0          15d     192.168.52.153   node2    <none>           <none>
coredns-59c898cd69-f9858                   1/1     Running   0          15d     10.100.219.65    master   <none>           <none>
coredns-59c898cd69-ghbdg                   1/1     Running   0          15d     10.100.219.66    master   <none>           <none>
etcd-master                                1/1     Running   0          15d     192.168.52.151   master   <none>           <none>
kube-apiserver-master                      1/1     Running   1          15d     192.168.52.151   master   <none>           <none>
kube-controller-manager-master             1/1     Running   10         15d     192.168.52.151   master   <none>           <none>
kube-proxy-5thg7                           1/1     Running   0          15d     192.168.52.152   node1    <none>           <none>
kube-proxy-659zl                           1/1     Running   0          15d     192.168.52.153   node2    <none>           <none>
kube-proxy-p2vvz                           1/1     Running   0          15d     192.168.52.151   master   <none>           <none>
kube-scheduler-master                      1/1     Running   9          15d     192.168.52.151   master   <none>           <none>
kube-state-metrics-5f84848c58-v7v9z        1/1     Running   0          15d     10.100.166.135   node1    <none>           <none>
kuboard-74c645f5df-zzwnm                   1/1     Running   0          15d     10.100.104.2     node2    <none>           <none>
metrics-server-7dbf6c4558-qhjw4            1/1     Running   0          15d     192.168.52.152   node1    <none>           <none>
node-exporter-57djg                        1/1     Running   0          3m13s   192.168.52.152   node1    <none>           <none>
node-exporter-5kcnx                        1/1     Running   0          3m13s   192.168.52.151   master   <none>           <none>
node-exporter-cz45t                        1/1     Running   0          3m13s   192.168.52.153   node2    <none>           <none>

As shown above, three Pods named node-exporter-xxx were created, one on each of the three cluster nodes, and all are in the Running state. We can now fetch the performance metrics of the three nodes through the following links:

curl http://192.168.52.151:9100/metrics

curl http://192.168.52.152:9100/metrics

curl http://192.168.52.153:9100/metrics
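As a quick sanity check (an illustrative command, not from the original post), you can confirm that a well-known node_exporter metric is exposed on any node:

curl -s http://192.168.52.151:9100/metrics | grep -m 5 '^node_cpu_seconds_total'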

Token creation

After node-exporter is deployed on the Kubernetes cluster, Prometheus will rely on the Kubernetes service discovery mechanism to discover and access node-exporter, and interacting with the Kubernetes API requires token-based authentication.

1. Define the ServiceAccount and RBAC rules in p8s_sa.yaml as follows:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - services
  - endpoints
  - pods
  - nodes/proxy
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - "extensions"
  resources:
    - ingresses
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: kube-system 

2. Create the ServiceAccount, ClusterRole, and ClusterRoleBinding:

[root@master k8s-demo]# kubectl apply -f p8s_sa.yaml 
serviceaccount/prometheus created
Warning: rbac.authorization.k8s.io/v1beta1 ClusterRole is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRole
clusterrole.rbac.authorization.k8s.io/prometheus created
Warning: rbac.authorization.k8s.io/v1beta1 ClusterRoleBinding is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRoleBinding
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
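As the warnings indicate, rbac.authorization.k8s.io/v1beta1 is removed in Kubernetes v1.22+; on a newer cluster the ClusterRole and ClusterRoleBinding in p8s_sa.yaml should be switched to the stable API, for example:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
# rules stay the same as in the v1beta1 manifest above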

3. View the ServiceAccount information:

[root@master k8s-demo]# kubectl get sa prometheus -n kube-system -o yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"name":"prometheus","namespace":"kube-system"}}
  creationTimestamp: "2021-07-21T04:47:10Z"
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:secrets:
        .: {}
        k:{"name":"prometheus-token-6hln9"}:
          .: {}
          f:name: {}
    manager: kube-controller-manager
    operation: Update
    time: "2021-07-21T04:47:10Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:kubectl.kubernetes.io/last-applied-configuration: {}
    manager: kubectl-client-side-apply
    operation: Update
    time: "2021-07-21T04:47:10Z"
  name: prometheus
  namespace: kube-system
  resourceVersion: "113843"
  selfLink: /api/v1/namespaces/kube-system/serviceaccounts/prometheus
  uid: cbfe8330-de8f-40fd-a9b3-5aa312bb9104
secrets:
- name: prometheus-token-6hln9

4. Obtain the token from the secret referenced by secrets.name:

[root@master k8s-demo]# kubectl describe secret prometheus-token-6hln9 -n kube-system
Name:         prometheus-token-6hln9
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: prometheus
              kubernetes.io/service-account.uid: cbfe8330-de8f-40fd-a9b3-5aa312bb9104

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1066 bytes
namespace:  11 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6Ikx6VHBOSXRwSmFCNmc2aXppS2tFeXFSTjlNVzJMNHhGX05fT3dLcXppSDQifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJwcm9tZXRoZXVzLXRva2VuLTZobG45Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6InByb21ldGhldXMiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiJjYmZlODMzMC1kZThmLTQwZmQtYTliMy01YWEzMTJiYjkxMDQiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06cHJvbWV0aGV1cyJ9.my_sEOjhx4hxApeGRhZmpFwK7snRKuYDyjlToYzXZSytdefPugMiHP1lA0bkDvxPiS0Pces2_hSJlB0pRDacqAgipE2_hqIx2GUO6t35mfbTthB7k4wbf9rQT4lag9XUzjdInOEV3SF4nfCG1DcbSM8a9COSXJUXkshXfollPYj1AGvAmTVYSSmK_b898z64WsDk9JNMjyM7VrI-kj20fKVgc0Ngi4kV3XKqRkCuKIZXKudmuUaqthbeVhaOKWhXzfBW2wDaVsNzsHMLqzwp8vVRIfZbudQ9gVGVZoskgRYiyNoNJcLjbphdxRN1hhWoBTITKHHFyQhwZGzTBo_f6g

5. Save the token to the token.k8s file:

eyJhbGciOiJSUzI1NiIsImtpZCI6Ikx6VHBOSXRwSmFCNmc2aXppS2tFeXFSTjlNVzJMNHhGX05fT3dLcXppSDQifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJwcm9tZXRoZXVzLXRva2VuLTZobG45Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6InByb21ldGhldXMiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiJjYmZlODMzMC1kZThmLTQwZmQtYTliMy01YWEzMTJiYjkxMDQiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06cHJvbWV0aGV1cyJ9.my_sEOjhx4hxApeGRhZmpFwK7snRKuYDyjlToYzXZSytdefPugMiHP1lA0bkDvxPiS0Pces2_hSJlB0pRDacqAgipE2_hqIx2GUO6t35mfbTthB7k4wbf9rQT4lag9XUzjdInOEV3SF4nfCG1DcbSM8a9COSXJUXkshXfollPYj1AGvAmTVYSSmK_b898z64WsDk9JNMjyM7VrI-kj20fKVgc0Ngi4kV3XKqRkCuKIZXKudmuUaqthbeVhaOKWhXzfBW2wDaVsNzsHMLqzwp8vVRIfZbudQ9gVGVZoskgRYiyNoNJcLjbphdxRN1hhWoBTITKHHFyQhwZGzTBo_f6g
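Alternatively, instead of copying the token out of the describe output, it can be extracted directly from the secret (a sketch using the secret name shown above; adjust it to your cluster):

kubectl -n kube-system get secret prometheus-token-6hln9 \
  -o jsonpath='{.data.token}' | base64 -d > token.k8s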

Prometheus access

Next, Prometheus uses the Kubernetes service discovery mechanism to discover the node-exporter instances deployed above and scrape their performance metrics.

1. Add a scrape job to the prometheus.yml configuration:

  - job_name: kubernetes-nodes
    kubernetes_sd_configs:
    - role: node
      api_server: https://apiserver.simon:6443
      bearer_token_file: /tools/token.k8s 
      tls_config:
        insecure_skip_verify: true
    bearer_token_file: /tools/token.k8s
    tls_config:
      insecure_skip_verify: true
    relabel_configs:
    - source_labels: [__address__]
      regex: '(.*):10250'
      replacement: '${1}:9100'
      target_label: __address__
      action: replace
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)

Remarks:

a. bearer_token_file points to the token file saved in the previous step;

b. The api_server address can be found in the /root/.kube/config file:


c. Map the API server hostname in the /etc/hosts file:

192.168.52.151    apiserver.simon
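Before restarting Prometheus, the token and the hosts entry can be verified with a direct call to the API server (an optional check, assuming curl is available on the Prometheus host; the ClusterRole created earlier allows listing nodes):

curl -k -H "Authorization: Bearer $(cat /tools/token.k8s)" \
  https://apiserver.simon:6443/api/v1/nodes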

2. Check whether the access is successful:


From the Prometheus UI we can see that the three nodes of the cluster appear as targets, and by querying the node memory metrics we can confirm that the performance metrics of all three nodes have indeed been scraped by Prometheus successfully:
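The original screenshots are not reproduced here; as an illustration, a memory-usage query along the following lines can be entered in the Prometheus expression browser (metric names come from node_exporter on Linux):

(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100

It should return one series per node, carrying the node labels added by the labelmap relabeling above.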


Grafana deployment

With node-exporter deployed on the cluster and the node performance metrics connected to and successfully scraped by Prometheus, the final step is to deploy Grafana and use PromQL to query the metric data from Prometheus and display it.

1. Before deploying Grafana, we need to install NFS and create a PV and PVC:

Execute on the master:

# Install the NFS service on the master
[root@master ~]# yum install nfs-utils -y

# Prepare a shared directory
[root@master ~]# mkdir /data/k8s -pv

# Expose the shared directory with read/write access to all hosts in the network segment
[root@master ~]# vi /etc/exports
[root@master ~]# more /etc/exports
/data/k8s *(rw,no_root_squash,no_all_squash,sync)

# Start the NFS service
[root@master ~]# systemctl start nfs
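Optionally, confirm that the export is visible before moving on (output omitted):

[root@master ~]# showmount -e 192.168.52.151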

Next, install nfs-utils on every node of the cluster so that the nodes can mount the NFS volume:

# Install nfs-utils on each node; the service does not need to be started there
[root@node1 ~]# yum install nfs-utils -y

2. Create the PV and PVC orchestration file grafana-volume.yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: grafana
spec:
  capacity:
    storage: 512Mi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    server: 192.168.52.151
    path: /data/k8s
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana
  namespace: kube-system
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 512Mi

3. View the PV and PVC:
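The original screenshot is omitted; assuming grafana-volume.yaml has been applied, the PV and PVC can be checked as follows (the PVC should show a Bound status):

kubectl apply -f grafana-volume.yaml
kubectl get pv grafana
kubectl get pvc grafana -n kube-system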


4. Create the Grafana Pod through a Deployment controller:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: kube-system
  labels:
    app: grafana
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
      - name: grafana
        image: grafana/grafana
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3000
          name: grafana
        env:
        - name: GF_SECURITY_ADMIN_USER
          value: admin
        - name: GF_SECURITY_ADMIN_PASSWORD
          value: admin
        readinessProbe:
          failureThreshold: 10
          httpGet:
            path: /api/health
            port: 3000
            scheme: HTTP
          initialDelaySeconds: 60
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 30
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /api/health
            port: 3000
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 100m
            memory: 256Mi
          requests:
            cpu: 100m
            memory: 256Mi
        volumeMounts:
        - mountPath: /var/lib/grafana
          subPath: grafana
          name: storage
      securityContext:
        fsGroup: 0
        runAsUser: 0
      volumes:
      - name: storage
        persistentVolumeClaim:
          claimName: grafana

Two important environment variables, GF_SECURITY_ADMIN_USER and GF_SECURITY_ADMIN_PASSWORD, configure the Grafana administrator user and password. Since Grafana stores its dashboard and plugin data under /var/lib/grafana, we mount a volume at that path for data persistence; the rest of the Deployment is nothing special.

5. Create the Service:

apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: kube-system
  labels:
    app: grafana
spec:
  type: NodePort
  ports:
    - port: 3000
  selector:
    app: grafana
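The Deployment and Service manifests above still need to be applied; assuming they are saved as grafana-deploy.yaml and grafana-svc.yaml (filenames assumed, not from the original), they can be created with:

kubectl apply -f grafana-deploy.yaml
kubectl apply -f grafana-svc.yaml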

6. Check the Service:

[root@master k8s-demo]# kubectl get svc -n kube-system -owide
NAME                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                  AGE     SELECTOR
grafana              NodePort    10.96.80.215   <none>        3000:30441/TCP           2m32s   app=grafana
kube-dns             ClusterIP   10.96.0.10     <none>        53/UDP,53/TCP,9153/TCP   16d     k8s-app=kube-dns
kube-state-metrics   ClusterIP   None           <none>        8080/TCP,8081/TCP        15d     app.kubernetes.io/name=kube-state-metrics
kuboard              NodePort    10.96.52.49    <none>        80:32567/TCP             15d     k8s.kuboard.cn/layer=monitor,k8s.kuboard.cn/name=kuboard
metrics-server       ClusterIP   10.96.162.56   <none>        443/TCP                  15d     k8s-app=metrics-server

Dashboard configuration

1. After Grafana is deployed, open any node's IP on port 30441 in a browser to reach the Grafana UI, and log in with admin/admin:

2. Create a Prometheus data source:


3. Import dashboard 8919; the Kubernetes cluster node performance metrics are then displayed on the template, as shown in the figure below:



Origin blog.csdn.net/god_86/article/details/120008904