Prometheus Monitoring for Kubernetes

All Prometheus-related services are deployed in the monitoring namespace.

Deploying the Prometheus service

  1. Create a namespace

    $ cd /opt/k8s/prometheus
    $ cat>1-namespace.yml<<EOF
    apiVersion: v1
    kind: Namespace
    metadata:
      name: monitoring
    EOF
    
    
  2. Create the Prometheus configuration file as a Kubernetes ConfigMap

    $ cd /opt/k8s/prometheus 
    $ cat>2-prom-cnfig.yml<<EOF
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: prom-config
      namespace: monitoring
    data:
      prometheus.yml: |
        global:
          scrape_interval: 15s
          scrape_timeout: 15s
        scrape_configs:
        - job_name: 'prometheus'
          static_configs:
          - targets: ['localhost:9090']
    EOF
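
    Before loading the ConfigMap it is worth a quick sanity check of the embedded prometheus.yml. promtool check config is the thorough option; the snippet below is only a minimal local sketch that greps an inline copy of the config for the mandatory top-level sections:

```shell
# Minimal sanity check of the embedded prometheus.yml. This works on an
# inline copy of the config from 2-prom-cnfig.yml; `promtool check config`
# is the proper validator when the promtool binary is available.
cat > /tmp/prometheus-check.yml <<'EOF'
global:
  scrape_interval: 15s
  scrape_timeout: 15s
scrape_configs:
- job_name: 'prometheus'
  static_configs:
  - targets: ['localhost:9090']
EOF

# Fail loudly if a mandatory section is missing.
grep -q '^global:' /tmp/prometheus-check.yml || { echo "missing global"; exit 1; }
grep -q '^scrape_configs:' /tmp/prometheus-check.yml || { echo "missing scrape_configs"; exit 1; }
echo "basic structure OK"
```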
    
    
  3. Create a PV and PVC to store Prometheus data (local storage is used here; it can be replaced with NFS, GlusterFS or another distributed file system)

    $ cd /opt/k8s/prometheus
    $ cat>3-prom-pv.yml<<EOF
    kind: PersistentVolume
    apiVersion: v1
    metadata:
      namespace: monitoring
      name: prometheus
      labels:
        type: local
        app: prometheus
    spec:
      capacity:
        storage: 10Gi
      accessModes:
        - ReadWriteOnce
      hostPath:
        path: /opt/k8s/prometheus/data
    ---
    
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      namespace: monitoring
      name: prometheus-claim
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi 
    EOF
    
    
  4. Create the Prometheus startup file; Prometheus is deployed as a Deployment and exposed externally through a Service of type NodePort

    $ cd /opt/k8s/prometheus
    $ cat>4-prometheus.yml<<EOF
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: prometheus
      namespace: monitoring
      labels:
        app: prometheus
    spec:
      selector:
        matchLabels:
          app: prometheus
      replicas: 1
      template:
        metadata:
          labels:
            app: prometheus
        spec:
          containers:
          - name: prometheus
            image: prom/prometheus:v2.16.0
            args:
            - '--config.file=/etc/prometheus/prometheus.yml'
            - '--storage.tsdb.path=/prometheus'
            - "--storage.tsdb.retention=7d"
            - "--web.enable-lifecycle"
            ports:
            - containerPort: 9090
            volumeMounts:
            - mountPath: "/prometheus"
              subPath: prometheus
              name: data
            - mountPath: "/etc/prometheus"
              name: config
            resources:
              requests:
                cpu: 500m
                memory: 2Gi
              limits:
                cpu: 500m
                memory: 2Gi
          volumes:
          - name: config
            configMap:
              name: prom-config
          - name: data
            persistentVolumeClaim:
              claimName: prometheus-claim
    ---
    apiVersion: v1
    kind: Service
    metadata:
      namespace: monitoring
      name: prometheus
    spec:
      type: NodePort
      ports:
        - port: 9090
          targetPort: 9090
          nodePort: 9090
      selector:
        app: prometheus
       
    EOF
    
    
    • The startup arguments storage.tsdb.path and storage.tsdb.retention specify where Prometheus stores its data and how long it is retained
    • With --web.enable-lifecycle enabled, configuration changes can be applied by POSTing to /-/reload, without restarting the service
    • The nodePort of 9090 lies outside the default NodePort range (30000-32767), so this assumes the apiserver's --service-node-port-range has been extended accordingly
  5. Start the Prometheus service

    $ cd /opt/k8s/prometheus
    $ mkdir data && chmod -R 777 data
    $ kubectl create -f 1-namespace.yml -f 2-prom-cnfig.yml -f 3-prom-pv.yml -f 4-prometheus.yml
    
    
  6. Check the component status to make sure all services started properly

    $ kubectl get all -n monitoring
    NAME                              READY   STATUS    RESTARTS   AGE
    pod/prometheus-57cf64764d-xqnvl   1/1     Running   0          51s
    
    NAME                 TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
    service/prometheus   NodePort   10.254.209.164   <none>        9090:9090/TCP   51s
    
    NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/prometheus   1/1     1            1           51s
    
    NAME                                    DESIRED   CURRENT   READY   AGE
    replicaset.apps/prometheus-57cf64764d   1         1         1       51s
    
    
  7. Access the web interface at http://192.168.0.107:9090 (any node IP works, since the service is exposed as a NodePort)

Kubernetes monitoring and related technologies

  • cAdvisor is a container-monitoring project open-sourced by Google. It collects resource-usage and performance information about containers and exposes it through a RESTful API. Kubernetes integrates cAdvisor's functionality into the kubelet, so it does not need to be deployed separately; the kubelet also aggregates the container data per pod in its Summary API for external callers.
  • metrics-server is a core Kubernetes monitoring component and the replacement for Heapster. It obtains metrics, mainly CPU and memory, from the kubelet API and exposes them through the API server, primarily for use by kubectl top, the HPA and other Kubernetes components. It only keeps the most recently collected metrics in memory and is not responsible for storing data.
  • kube-state-metrics listens to the API server and generates state metrics for resource objects such as Deployments and ReplicaSets. It likewise only keeps the latest metrics in memory and does not store data.
  • node-exporter is the official Prometheus exporter for collecting metrics about *NIX hosts themselves, including the underlying hardware.
  • kube-prometheus is a one-stop Kubernetes monitoring solution. It bundles node-exporter, prometheus, kube-state-metrics, Grafana, metrics-server and other components, and provides convenient scripts for users to quickly build a complete monitoring platform.

What to monitor in Kubernetes

  • The state of the cluster nodes themselves, such as each node's CPU, memory, IO and network
  • The Kubernetes system components, such as kube-controller-manager, kube-scheduler, kube-proxy, kubelet, etc.
  • The containers running in the cluster: CPU and memory information per container and per Pod
  • Metrics for the orchestration objects, such as Deployments, DaemonSets, etc.

This article deploys each component by itself to show, step by step, how a monitoring platform is built. If you need to build one quickly, refer to kube-prometheus instead.

Deploying node-exporter

Because every node needs to be monitored, node-exporter is deployed with a DaemonSet controller, which runs one Pod on each node.

  1. Create the startup file

    $ cd /opt/k8s/prometheus
    $ cat>5-node-exporter.yml<<EOF
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      labels:
        app: node-exporter
      name: node-exporter
      namespace: monitoring
    spec:
      selector:
        matchLabels:
          app: node-exporter
      template:
        metadata:
          labels:
            app: node-exporter
        spec:
          containers:
          - name: node-exporter
            image: 192.168.0.107/prometheus/node-exporter:v0.18.1
            args:
            - --web.listen-address=:9100
            - --path.procfs=/host/proc
            - --path.sysfs=/host/sys
            - --path.rootfs=/host/root
            - --no-collector.hwmon
            - --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
            - --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
            resources:
              limits:
                cpu: 250m
                memory: 180Mi
              requests:
                cpu: 102m
                memory: 180Mi
            ports:
            - containerPort: 9100
            volumeMounts:
            - mountPath: /host/proc
              name: proc
              readOnly: false
            - mountPath: /host/sys
              name: sys
              readOnly: false
            - mountPath: /host/root
              mountPropagation: HostToContainer
              name: root
              readOnly: true
          hostNetwork: true
          hostPID: true
          nodeSelector:
            kubernetes.io/os: linux
          securityContext:
            runAsNonRoot: true
            runAsUser: 65534
          
          tolerations:
          - operator: Exists
          volumes:
          - hostPath:
              path: /proc
            name: proc
          - hostPath:
              path: /sys
            name: sys
          - hostPath:
              path: /
            name: root
      
    EOF
    
    
  2. Start node-exporter

    $ cd /opt/k8s/prometheus
    $ kubectl create -f 5-node-exporter.yml 
    $ kubectl -n monitoring get pod | grep node
    node-exporter-854vr           1/1     Running   6          50m
    node-exporter-lv9pv           1/1     Running   0          50m
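
    node-exporter serves its metrics in the Prometheus text exposition format. The sketch below shows what an excerpt of :9100/metrics looks like and how a single value can be pulled out of it; the sample lines and values are illustrative, not real output from this cluster:

```shell
# Illustrative excerpt of node-exporter's /metrics output (values made up).
cat > /tmp/node-metrics-sample.txt <<'EOF'
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
# HELP node_memory_MemFree_bytes Memory information field MemFree_bytes.
# TYPE node_memory_MemFree_bytes gauge
node_memory_MemFree_bytes 1.073741824e+09
EOF

# Extract one metric value, the way real output from
# `curl -s http://<node-ip>:9100/metrics` could be filtered.
load1=$(awk '$1 == "node_load1" {print $2}' /tmp/node-metrics-sample.txt)
echo "node_load1=$load1"   # node_load1=0.42
```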
    
    
  3. Have Prometheus collect the node-exporter metrics

    Because nodes may be added to or removed from the cluster dynamically, a static target configuration is inconvenient. Prometheus provides Kubernetes service-discovery mechanisms that allow it to monitor Kubernetes dynamically; here, nodes are discovered with the node role. Append the following to the Prometheus configuration (the same addition also needs to be made in 2-prom-cnfig.yml, otherwise the change is lost if the ConfigMap is rebuilt):

    $ kubectl -n monitoring edit configmaps prom-config
    
    
    - job_name: "kubernetes-nodes"
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
        action: replace
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
    
    

    After appending, run the following command to reload the configuration. Why the job is written this way is explained in detail in the configuration-principles section below.

    $ curl -XPOST http://192.168.0.107:9090/-/reload
    
    

    At this point Prometheus will try to fetch the cluster's node information, but looking at the Prometheus logs you will find the following error:

    level=error ts=2020-03-22T10:37:13.856Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:333: Failed to list *v1.Node: nodes is forbidden: User \"system:serviceaccount:monitoring:default\" cannot list resource \"nodes\" in API group \"\" at the cluster scope"
    
    

    This means the default serviceaccount is not allowed to list *v1.Node, so we need to create a dedicated serviceaccount for Prometheus and grant it the appropriate permissions.

  4. Create a serviceaccount for Prometheus and grant it the appropriate permissions

    $ cd /opt/k8s/prometheus
    $ cat>6-prometheus-serivceaccount-role.yaml<<EOF
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: prometheus-k8s
      namespace: monitoring
    
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: prometheus-k8s
    rules:
    - apiGroups: [""]
      resources:
      - nodes/proxy
      - nodes
      - namespaces
      - endpoints
      - pods
      - services
      verbs: ["get","list","watch"]
    - apiGroups: [""]
      resources:
      - nodes/metrics
      verbs: ["get"]
    - nonResourceURLs:
      - /metrics
      verbs: ["get"]
    - apiGroups:
      - extensions
      resources:
      - ingresses
      verbs: ["get", "list", "watch"]
    
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: prometheus-k8s
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: prometheus-k8s
    subjects:
    - kind: ServiceAccount
      name: prometheus-k8s
      namespace: monitoring
    
    EOF
    
    
    $ cd /opt/k8s/prometheus
    $ kubectl create -f 6-prometheus-serivceaccount-role.yaml
     
    

    Modify the Prometheus startup yaml to add the serviceAccountName, then redeploy Prometheus:

    ...
    spec:
      serviceAccountName: prometheus-k8s
      containers:
      - name: prometheus                 
    ...
    
    

    Check the Prometheus target list to confirm the nodes are now being scraped

How kubernetes_sd_config works in detail

  1. Finding the target addresses

    When kubernetes_sd_config is configured with role: node, Prometheus calls the Kubernetes LIST Node API on startup to obtain the node objects, and builds a scrape address from each node object's IP and port.

    The IP is looked up in the following order: InternalIP, ExternalIP, LegacyHostIP, HostName.

    The port defaults to the kubelet's HTTP port.

    The IP and port information returned by the list-node API can be inspected with the following command:

    $ kubectl get node -o=jsonpath='{range .items[*]}{.status.addresses}{"\t"}{.status.daemonEndpoints}{"\n"}{end}'
    [map[address:192.168.0.107 type:InternalIP] map[address:master type:Hostname]]  map[kubeletEndpoint:map[Port:10250]]
    [map[address:192.168.0.114 type:InternalIP] map[address:slave type:Hostname]]   map[kubeletEndpoint:map[Port:10250]]
    
    

    The result shows the two nodes of this cluster, which yield the following target addresses:

    192.168.0.107:10250
    192.168.0.114:10250
    
    
  2. relabel_configs

    Relabeling lets Prometheus dynamically rewrite label values before it scrapes. Prometheus sets a number of default labels; the ones we deal with below are:

    • __address__: initialized to the target address in the form <host>:<port>
    • instance: after the relabel stage, the instance label is set to the final value of __address__
    • __scheme__: defaults to http
    • __metrics_path__: defaults to /metrics

    Prometheus assembles the URL it pulls metrics from by joining these labels: __scheme__://<__address__><__metrics_path__>
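
    The splicing just described can be reproduced with plain shell variables. For one of the nodes in this cluster the defaults combine as follows (a sketch; the values mirror the defaults listed above):

```shell
# How Prometheus composes a scrape URL from its built-in labels:
# __scheme__://<__address__><__metrics_path__>, with instance taken from __address__.
scheme="http"                   # __scheme__ default
address="192.168.0.107:10250"   # __address__ = <host>:<port> from service discovery
metrics_path="/metrics"         # __metrics_path__ default
url="${scheme}://${address}${metrics_path}"
echo "$url"   # http://192.168.0.107:10250/metrics
```

    This is exactly why the relabel rule in the next step rewrites the :10250 port: without it, Prometheus would try to scrape the kubelet port instead of node-exporter's :9100.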

  3. node-exporter, started above, exposes node metrics on :9100/metrics of every node, so we add the following segment to the Prometheus configuration

    - job_name: "kubernetes-nodes"
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
        action: replace
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
          
    

    The first rule in relabel_configs is:

      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
        action: replace
    
    
    • regex matches the __address__ value and captures the IP address
    • replacement: sets the new value to ${IP}:9100
    • target_label: writes the replacement back into __address__, i.e. ${IP}:9100
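
    Since the rule is just an anchored regex replace on the __address__ value, its effect can be imitated locally with sed (a sketch of the transformation, not how Prometheus executes it):

```shell
# Simulate the relabel rule: regex '(.*):10250' -> replacement '${1}:9100'.
# Prometheus fully anchors its regexes, hence the ^ and $ here.
relabel() { printf '%s\n' "$1" | sed -E 's/^(.*):10250$/\1:9100/'; }

relabel "192.168.0.107:10250"   # -> 192.168.0.107:9100
relabel "192.168.0.114:10250"   # -> 192.168.0.114:9100
```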

    After these steps, the assembled scrape addresses become [http://192.168.0.107:9100/metrics, http://192.168.0.114:9100/metrics], which match the addresses our node-exporter pods expose, so Prometheus can pull the node metrics.

    In addition, because Prometheus renames the node labels to __meta_kubernetes_node_label_<labelname>, a labelmap action is added to restore those labels to their original names.

  4. For a complete configuration example, refer to prometheus-kubernetes

Additionally collecting the metrics exposed by the kubelet

The kubelet itself also exposes metrics, such as information about its communication with the API server, which can be viewed with the following command:

$ kubectl get --raw /api/v1/nodes/master/proxy/metrics

  • The request is proxied through the API server to the kubelet on the node named master; the kubelet's default listening port is 10250

Then append the following job to the Prometheus configuration so that Prometheus pulls these metrics:

- job_name: "kubernetes-kubelet"
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)

Additionally collecting cAdvisor metrics to monitor the cluster's containers

The kubelet integrates cAdvisor by default to collect container information for the cluster. Since Kubernetes 1.7.3, however, the cAdvisor metrics (those beginning with container_) have been removed from the kubelet's own /metrics endpoint, so an extra job is required to collect them. The cAdvisor metrics can be viewed with the following command:

$ kubectl get --raw /api/v1/nodes/master/proxy/metrics/cadvisor

  • The request goes to the API server (listening on 6443 in this cluster), which proxies it to the kubelet's cAdvisor endpoint on the node named master

Then append the following job to the Prometheus configuration so that Prometheus pulls these metrics as well:

- job_name: "kubernetes-cadvisor"
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - target_label: __address__
    replacement: kubernetes.default.svc:443
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
         

Reload the configuration after the additions are complete, then check the Prometheus target list.

The full configuration file now looks like this:

$ cd /opt/k8s/prometheus
$ cat 2-prom-cnfig.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prom-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 15s
    scrape_configs:
    - job_name: 'prometheus'
      static_configs:
      - targets: ['localhost:9090']
    - job_name: "kubernetes-nodes"
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
        action: replace
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
    - job_name: "kubernetes-kubelet"
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
    - job_name: "kubernetes-cadvisor"
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

Deploying Grafana

  1. Create a PV and PVC to store Grafana data (local storage is used here; it can be replaced with NFS, GlusterFS or another distributed file system)

    $ cd /opt/k8s/prometheus
    $ cat>7-grafana-pv.yml<<EOF
    kind: PersistentVolume
    apiVersion: v1
    metadata:
      namespace: monitoring
      name: grafana
      labels:
        type: local
        app: grafana
    spec:
      capacity:
        storage: 10Gi
      accessModes:
        - ReadWriteOnce
      hostPath:
        path: /opt/k8s/prometheus/grafana-pvc
    ---
    
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      namespace: monitoring
      name: grafana-claim
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi 
    EOF
    
    
    
  2. Create the Grafana deployment file

    $ cd /opt/k8s/prometheus
    $ cat>8-grafana.yml<<EOF
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: grafana
      name: grafana
      namespace: monitoring
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: grafana
      template:
        metadata:
          labels:
            app: grafana
        spec:
          containers:
          - image: grafana/grafana:6.6.2
            name: grafana
            ports:
            - containerPort: 3000
              name: http
            readinessProbe:
              httpGet:
                path: /api/health
                port: http
            resources:
              limits:
                cpu: 200m
                memory: 400Mi
              requests:
                cpu: 100m
                memory: 200Mi
            volumeMounts:
            - mountPath: /var/lib/grafana
              name: grafana-pvc
              readOnly: false
              subPath: data
            - mountPath: /etc/grafana/provisioning/datasources
              name: grafana-pvc
              readOnly: false
              subPath: datasources
            - mountPath: /etc/grafana/provisioning/dashboards
              name: grafana-pvc
              readOnly: false
              subPath: dashboards-pro
            - mountPath: /grafana-dashboard-definitions/0
              name: grafana-pvc
              readOnly: false
              subPath: dashboards
          nodeSelector:
            beta.kubernetes.io/os: linux
          securityContext:
            runAsNonRoot: true
            runAsUser: 65534
          volumes:
          - name: grafana-pvc
            persistentVolumeClaim:
              claimName: grafana-claim
    
    ---
    apiVersion: v1
    kind: Service
    metadata:
      namespace: monitoring
      name: grafana
    spec:
      type: NodePort
      ports:
        - port: 3000
          targetPort: 3000
          nodePort: 3000
      selector:
        app: grafana
      
    EOF
    
    
  3. Start grafana

    1. Create the directories Grafana will mount

      $ cd /opt/k8s/prometheus
      $ mkdir -p grafana-pvc/data
      $ mkdir -p grafana-pvc/datasources
      $ mkdir -p grafana-pvc/dashboards-pro
      $ mkdir -p grafana-pvc/dashboards
      
      $ chmod -R 777 grafana-pvc
      
      
      • data stores Grafana's data
      • datasources stores the predefined data sources
      • dashboards-pro stores the dashboard provider configuration, whose path option points at /grafana-dashboard-definitions/0, i.e. the container mount point of grafana-pvc/dashboards
      • dashboards stores the actual dashboard definition files (json)
    2. Create a default data source file

      $ cd /opt/k8s/prometheus/grafana-pvc/datasources
      $ cat > datasource.yaml<<EOF
      apiVersion: 1
      datasources:
      - name: Prometheus
        type: prometheus
        access: proxy
        url: http://prometheus.monitoring.svc:9090
      
      EOF
      
      
    3. Create the default dashboard provider file

      $ cd /opt/k8s/prometheus/grafana-pvc/dashboards-pro
      $ cat >dashboards.yaml<<EOF
      apiVersion: 1
      providers:
      - name: '0'
        orgId: 1
        folder: ''
        type: file
        editable: true
        updateIntervalSeconds: 10
        allowUiUpdates: false
        options:
          path: /grafana-dashboard-definitions/0
      EOF
      
      
    4. Create a default dashboard definition file

      From Grafana's collection of shared dashboards, find the dashboard template you need, download the corresponding json file, and store it in /opt/k8s/prometheus/grafana-pvc/dashboards. As an example, '1 Node Exporter for Prometheus Dashboard CN v20191102' is downloaded here; its ID is 8919.

      $ cd /opt/k8s/prometheus/grafana-pvc/dashboards
      $ wget https://grafana.com/api/dashboards/8919/revisions/11/download -O node-exporter-k8s.json
      
      

      Because the template's default data source is ${DS_PROMETHEUS_111}, a placeholder that would normally be filled in when importing through the UI, and we provision the downloaded json file directly, we edit the file and replace the placeholder with the data source configured under /opt/k8s/prometheus/grafana-pvc/datasources:

      $ cd /opt/k8s/prometheus/grafana-pvc/dashboards
      $ sed -i "s/\${DS_PROMETHEUS_111}/Prometheus/g" node-exporter-k8s.json
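
      A quick way to confirm the substitution worked is to grep for the old placeholder afterwards. Demonstrated here against a minimal stand-in file, since the real dashboard json is large:

```shell
# Minimal stand-in for the downloaded dashboard json (the real file is large).
cat > /tmp/dash-sample.json <<'EOF'
{ "datasource": "${DS_PROMETHEUS_111}", "title": "sample" }
EOF

# Same substitution as applied to node-exporter-k8s.json above.
sed -i 's/\${DS_PROMETHEUS_111}/Prometheus/g' /tmp/dash-sample.json

# The placeholder must be gone and the concrete datasource name present.
if grep -q 'DS_PROMETHEUS_111' /tmp/dash-sample.json; then
  echo "placeholder still present"
else
  echo "substitution OK"
fi
```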
      
      

      Also modify the title:

      ...
      "timezone": "browser",
      "title": "k8s-node-monitoring",
      ...
      
      
    5. Start Grafana

      $ cd /opt/k8s/prometheus/
      $ kubectl create -f 7-grafana-pv.yml -f 8-grafana.yml
      
      
  4. View it through the web interface; because the data source, dashboards and other settings were provisioned as defaults, the corresponding dashboards can be opened directly

When deploying Grafana we provisioned the default data source, dashboards and other settings so that these default metrics can be viewed right after the system is deployed, with no manual configuration needed.
Other monitoring, such as Deployment, StatefulSet, container and Pod monitoring built on kube-state-metrics and cAdvisor metrics, can be implemented in the same way. For example, the '1. Kubernetes Deployment Statefulset Daemonset metrics' dashboard can be used as a template and slightly modified to meet your monitoring needs; the specific steps are not shown here, and readers can try them on their own.


Origin www.cnblogs.com/gaofeng-henu/p/12555116.html