Kubernetes ecosystem - delivering Prometheus monitoring and Grafana dashboards to a k8s cluster

Because of the special nature of Docker containers, traditional Zabbix cannot monitor the state of the containers inside a k8s cluster, so Prometheus is used for monitoring instead:

What is Prometheus?

Prometheus is an open-source monitoring and alerting system with a built-in time series database (TSDB), originally developed at SoundCloud. It is written in Go and was inspired by Google's Borgmon monitoring system.
In 2016 Prometheus joined the Cloud Native Computing Foundation (CNCF), the Linux Foundation project that Google helped found, as its second hosted project after Kubernetes.
Prometheus is currently very active in the open source community.
Compared with Heapster (a k8s subproject for collecting cluster performance data), Prometheus is more complete and more capable. Its performance is sufficient for clusters with tens of thousands of machines.

Prometheus features

    • Multi-dimensional data model.
    • Flexible query language.
    • No dependence on distributed storage; each single server node is autonomous.
    • Time series data is collected with a pull model over HTTP.
    • Time series data can also be pushed through an intermediate gateway.
    • Targets are found through static configuration or service discovery.
    • Supports a variety of graphing and dashboard front ends, such as Grafana.
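
To make the first two features more concrete, here is a minimal Python sketch of a label-based data model with an equality match on labels, loosely similar to what a selector such as http_requests_total{method="GET"} does in PromQL. The metric name, labels and values are invented for illustration, and this is not how Prometheus stores data internally.

```python
from typing import Dict, List, Tuple

# A series is identified by a metric name plus a set of key/value labels.
Series = Tuple[str, Dict[str, str]]

# Hypothetical samples: (series, latest value).
SAMPLES: List[Tuple[Series, float]] = [
    (("http_requests_total", {"method": "GET", "code": "200"}), 1027.0),
    (("http_requests_total", {"method": "POST", "code": "200"}), 3.0),
    (("http_requests_total", {"method": "GET", "code": "500"}), 12.0),
]


def select(name: str, matchers: Dict[str, str]) -> List[Tuple[Series, float]]:
    """Return samples whose name matches and whose labels contain every matcher."""
    return [
        (series, value)
        for series, value in SAMPLES
        if series[0] == name
        and all(series[1].get(k) == v for k, v in matchers.items())
    ]


def total(name: str, matchers: Dict[str, str]) -> float:
    """Sum the values of all matching series, similar in spirit to sum()."""
    return sum(value for _, value in select(name, matchers))
```

Each extra label adds a dimension that can be filtered or aggregated without changing the metric name, which is the point of the multi-dimensional model.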

 

 

 

Basic principle

Prometheus works by periodically scraping the state of monitored components over HTTP. Any component can be monitored as long as it exposes a suitable HTTP endpoint; no SDK or other integration work is required. This makes it a good fit for monitoring virtualized environments such as VMs, Docker and Kubernetes. The HTTP endpoint that outputs a monitored component's information is called an exporter. Exporters already exist for most components commonly used at Internet companies and can be used directly, for example Varnish, HAProxy, Nginx, MySQL and Linux system information (disk, memory, CPU, network and so on).
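
As a rough illustration of how little an exporter needs, the following Python sketch serves two made-up gauges at /metrics in the Prometheus text exposition format, using only the standard library. The metric names, values and port are assumptions for the example; real exporters such as node-exporter are far more complete.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict


def render_metrics(metrics: Dict[str, float]) -> str:
    """Render gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    # Hypothetical values; a real exporter would read them from the system.
    metrics = {"demo_queue_length": 4.0, "demo_workers_busy": 2.0}

    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(self.metrics).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block and serve /metrics; the port is an arbitrary choice."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() and then fetching http://localhost:8000/metrics returns the plain-text page that a Prometheus server would scrape on each interval.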

Service process

  • The Prometheus daemon periodically scrapes metrics from its targets; each target must expose an HTTP endpoint for it to scrape. Targets can be specified through configuration files, text files, Zookeeper, Consul, DNS SRV lookup and other mechanisms. Prometheus monitors in PULL mode: the server pulls data directly from the targets, or indirectly from an intermediate gateway to which the targets push their data.
  • Prometheus stores all scraped data locally, cleans up and aggregates it according to configured rules, and records the results as new time series.
  • Prometheus exposes the collected data for visualization through PromQL and other APIs. It supports many kinds of charting, for example Grafana, its own Promdash, and its built-in template engine. It also provides an HTTP query API for producing custom output.
  • Clients can actively push metrics to a PushGateway, and Prometheus then scrapes the gateway on its regular schedule.
  • Alertmanager is a component independent of Prometheus; it supports alerts defined by Prometheus query expressions and provides very flexible alerting.

The three main components

  • Server: responsible for data collection and storage, and provides the PromQL query language.
  • Alertmanager: the alert manager, used for alerting.
  • Push Gateway: an intermediate gateway that lets short-lived jobs actively push their metrics.

Unlike Zabbix, Prometheus has no agent; it uses a different exporter for each kind of service:

prometheus official website: official website address

Under normal circumstances, four exporters are commonly used to monitor a k8s cluster and its nodes and pods:

  • kube-state-metrics - collects basic state information about the k8s cluster, including the master and etcd
  • node-exporter - collects information about the k8s cluster nodes
  • cadvisor - collects resource usage of the docker containers inside the k8s cluster
  • blackbox-exporter - checks whether the services in the cluster's docker containers are alive

Next, create these exporters one by one.

The usual routine: pull the docker image, prepare the resource manifests, then apply them:

I. kube-state-metrics

# docker pull quay.io/coreos/kube-state-metrics:v1.5.0
# docker tag 91599517197a harbor.od.com/public/kube-state-metrics:v1.5.0
# docker push harbor.od.com/public/kube-state-metrics:v1.5.0

Prepare the resource manifests:

1. rbac.yaml

# mkdir /data/k8s-yaml/kube-state-metrics && cd /data/k8s-yaml/kube-state-metrics
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/cluster-service: "true"
  name: kube-state-metrics
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/cluster-service: "true"
  name: kube-state-metrics
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  - nodes
  - pods
  - services
  - resourcequotas
  - replicationcontrollers
  - limitranges
  - persistentvolumeclaims
  - persistentvolumes
  - namespaces
  - endpoints
  verbs:
  - list
  - watch
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - list
  - watch
- apiGroups:
  - extensions
  resources:
  - daemonsets
  - deployments
  - replicasets
  verbs:
  - list
  - watch
- apiGroups:
  - apps
  resources:
  - statefulsets
  verbs:
  - list
  - watch
- apiGroups:
  - batch
  resources:
  - cronjobs
  - jobs
  verbs:
  - list
  - watch
- apiGroups:
  - autoscaling
  resources:
  - horizontalpodautoscalers
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/cluster-service: "true"
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: kube-system

2. dp.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "2"
  labels:
    grafanak8sapp: "true"
    app: kube-state-metrics
  name: kube-state-metrics
  namespace: kube-system
spec:
  selector:
    matchLabels:
      grafanak8sapp: "true"
      app: kube-state-metrics
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        grafanak8sapp: "true"
        app: kube-state-metrics
    spec:
      containers:
      - name: kube-state-metrics
        image: harbor.od.com/public/kube-state-metrics:v1.5.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
          name: http-metrics
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
      serviceAccountName: kube-state-metrics

Apply the resource manifests:

# kubectl apply -f http://k8s-yaml.od.com/kube-state-metrics/rbac.yaml
# kubectl apply -f http://k8s-yaml.od.com/kube-state-metrics/dp.yaml

Test it:

# kubectl get pod -n kube-system -o wide

# curl http://172.7.22.10:8080/healthz

It is running successfully.

II. node-exporter

Because node-exporter monitors nodes, one instance must run on every node, so a DaemonSet (ds) controller is used.

# docker pull prom/node-exporter:v0.15.0
# docker tag 12d51ffa2b22 harbor.od.com/public/node-exporter:v0.15.0
# docker push harbor.od.com/public/node-exporter:v0.15.0

Prepare the resource manifest:

1. ds.yaml

# mkdir node-exporter && cd node-exporter
kind: DaemonSet
apiVersion: extensions/v1beta1
metadata:
  name: node-exporter
  namespace: kube-system
  labels:
    daemon: "node-exporter"
    grafanak8sapp: "true"
spec:
  selector:
    matchLabels:
      daemon: "node-exporter"
      grafanak8sapp: "true"
  template:
    metadata:
      name: node-exporter
      labels:
        daemon: "node-exporter"
        grafanak8sapp: "true"
    spec:
      volumes:
      - name: proc
        hostPath: 
          path: /proc
          type: ""
      - name: sys
        hostPath:
          path: /sys
          type: ""
      containers:
      - name: node-exporter
        image: harbor.od.com/public/node-exporter:v0.15.0
        imagePullPolicy: IfNotPresent
        args:
        - --path.procfs=/host_proc
        - --path.sysfs=/host_sys
        ports:
        - name: node-exporter
          hostPort: 9100
          containerPort: 9100
          protocol: TCP
        volumeMounts:
        - name: sys
          readOnly: true
          mountPath: /host_sys
        - name: proc
          readOnly: true
          mountPath: /host_proc
      hostNetwork: true

Apply the resource manifest:

# kubectl apply -f http://k8s-yaml.od.com/node-exporter/ds.yaml
# kubectl get pod -n kube-system -o wide

We have two nodes, and one pod is started on each of them.

III. cadvisor

# docker pull google/cadvisor:v0.28.3
# docker tag 75f88e3ec333 harbor.od.com/public/cadvisor:v0.28.3
# docker push harbor.od.com/public/cadvisor:v0.28.3

Prepare the resource manifest:

# mkdir cadvisor && cd cadvisor

1. ds.yaml (the part marked in red is an important advanced attribute of k8s resource manifests; the next blog post focuses on it)

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cadvisor
  namespace: kube-system
  labels:
    app: cadvisor
spec:
  selector:
    matchLabels:
      name: cadvisor
  template:
    metadata:
      labels:
        name: cadvisor
    spec:
      hostNetwork: true
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: cadvisor
        image: harbor.od.com/public/cadvisor:v0.28.3
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: rootfs
          mountPath: /rootfs
          readOnly: true
        - name: var-run
          mountPath: /var/run
        - name: sys
          mountPath: /sys
          readOnly: true
        - name: docker
          mountPath: /var/lib/docker
          readOnly: true
        ports:
          - name: http
            containerPort: 4194
            protocol: TCP
        readinessProbe:
          tcpSocket:
            port: 4194
          initialDelaySeconds: 5
          periodSeconds: 10
        args:
          - --housekeeping_interval=10s
          - --port=4194
      terminationGracePeriodSeconds: 30
      volumes:
      - name: rootfs
        hostPath:
          path: /
      - name: var-run
        hostPath:
          path: /var/run
      - name: sys
        hostPath:
          path: /sys
      - name: docker
        hostPath:
          path: /data/docker

Make a few adjustments to the mounted resources:

# mount -o remount,rw /sys/fs/cgroup/
# ln -s /sys/fs/cgroup/cpu,cpuacct /sys/fs/cgroup/cpuacct,cpu

Apply the resource manifest:

# kubectl apply -f http://k8s-yaml.od.com/cadvisor/ds.yaml

Check that the pods are running.

IV. blackbox-exporter

# docker pull prom/blackbox-exporter:v0.15.1
# docker tag 81b70b6158be  harbor.od.com/public/blackbox-exporter:v0.15.1
# docker push harbor.od.com/public/blackbox-exporter:v0.15.1

Create the resource manifests:

1. cm.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: blackbox-exporter
  name: blackbox-exporter
  namespace: kube-system
data:
  blackbox.yml: |-
    modules:
      http_2xx:
        prober: http
        timeout: 2s
        http:
          valid_http_versions: ["HTTP/1.1", "HTTP/2"]
          valid_status_codes: [200,301,302]
          method: GET
          preferred_ip_protocol: "ip4"
      tcp_connect:
        prober: tcp
        timeout: 2s

2. dp.yaml

kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: blackbox-exporter
  namespace: kube-system
  labels:
    app: blackbox-exporter
  annotations:
    deployment.kubernetes.io/revision: "1"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: blackbox-exporter
  template:
    metadata:
      labels:
        app: blackbox-exporter
    spec:
      volumes:
      - name: config
        configMap:
          name: blackbox-exporter
          defaultMode: 420
      containers:
      - name: blackbox-exporter
        image: harbor.od.com/public/blackbox-exporter:v0.15.1
        imagePullPolicy: IfNotPresent
        args:
        - --config.file=/etc/blackbox_exporter/blackbox.yml
        - --log.level=info
        - --web.listen-address=:9115
        ports:
        - name: blackbox-port
          containerPort: 9115
          protocol: TCP
        resources:
          limits:
            cpu: 200m
            memory: 256Mi
          requests:
            cpu: 100m
            memory: 50Mi
        volumeMounts:
        - name: config
          mountPath: /etc/blackbox_exporter
        readinessProbe:
          tcpSocket:
            port: 9115
          initialDelaySeconds: 5
          timeoutSeconds: 5
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 3

3. svc.yaml

kind: Service
apiVersion: v1
metadata:
  name: blackbox-exporter
  namespace: kube-system
spec:
  selector:
    app: blackbox-exporter
  ports:
    - name: blackbox-port
      protocol: TCP
      port: 9115

4. ingress.yaml

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: blackbox-exporter
  namespace: kube-system
spec:
  rules:
  - host: blackbox.od.com
    http:
      paths:
      - path: /
        backend:
          serviceName: blackbox-exporter
          servicePort: blackbox-port

A domain name is used here, so add a DNS record for it:

# vi /var/named/od.com.zone
blackbox       A    10.4.7.10

Apply the resource manifests:

# kubectl apply -f http://k8s-yaml.od.com/blackbox-exporter/cm.yaml
# kubectl apply -f http://k8s-yaml.od.com/blackbox-exporter/dp.yaml
# kubectl apply -f http://k8s-yaml.od.com/blackbox-exporter/svc.yaml
# kubectl apply -f http://k8s-yaml.od.com/blackbox-exporter/ingress.yaml

Test by visiting the domain name:

If the blackbox exporter page appears, blackbox is running successfully.
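
Besides opening the page, the exporter can be exercised directly through its /probe endpoint, which takes a module and a target as query parameters. The sketch below only builds such a URL; the target address is a placeholder chosen for illustration and should be replaced with something reachable from the exporter.

```python
from urllib.parse import urlencode


def probe_url(base: str, module: str, target: str) -> str:
    """Build a blackbox-exporter /probe URL for one module and target."""
    query = urlencode({"module": module, "target": target})
    return f"{base.rstrip('/')}/probe?{query}"


# Hypothetical target; the module name comes from cm.yaml above.
url = probe_url("http://blackbox.od.com", "tcp_connect", "10.4.7.21:2379")
print(url)
```

Fetching the resulting URL returns a metrics page in which probe_success is 1 when the check passed and 0 otherwise.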

 

Next, deploy the Prometheus server:

# docker pull prom/prometheus:v2.14.0
# docker tag 7317640d555e harbor.od.com/infra/prometheus:v2.14.0
# docker push harbor.od.com/infra/prometheus:v2.14.0

Prepare the resource manifests:

1. rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/cluster-service: "true"
  name: prometheus
  namespace: infra
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/cluster-service: "true"
  name: prometheus
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - nodes/metrics
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/cluster-service: "true"
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: infra

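2. dp.yaml

Adding --web.enable-lifecycle enables remote hot reloading of the configuration file.
The command that triggers a reload is: curl -X POST http://localhost:9090/-/reload

storage.tsdb.min-block-duration=10m # keep only 10 minutes of data in memory

storage.tsdb.retention=72h # retain 72 hours of data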
 
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "5"
  labels:
    name: prometheus
  name: prometheus
  namespace: infra
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 7
  selector:
    matchLabels:
      app: prometheus
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
      - name: prometheus
        image: harbor.od.com/infra/prometheus:v2.14.0
        imagePullPolicy: IfNotPresent
        command:
        - /bin/prometheus
        args:
        - --config.file=/data/etc/prometheus.yml
        - --storage.tsdb.path=/data/prom-db
        - --storage.tsdb.min-block-duration=10m
        - --storage.tsdb.retention=72h
        - --web.enable-lifecycle
        ports:
        - containerPort: 9090
          protocol: TCP
        volumeMounts:
        - mountPath: /data
          name: data
        resources:
          requests:
            cpu: "1000m"
            memory: "1.5Gi"
          limits:
            cpu: "2000m"
            memory: "3Gi"
      imagePullSecrets:
      - name: harbor
      securityContext:
        runAsUser: 0
      serviceAccountName: prometheus
      volumes:
      - name: data
        nfs:
          server: hdss7-200
          path: /data/nfs-volume/prometheus
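
Because the deployment passes --web.enable-lifecycle, the configuration can be reloaded with an HTTP POST to /-/reload after prometheus.yml is edited. The following Python sketch is one possible equivalent of the curl command, using only the standard library; the function names are made up for this example, and the base URL must point at wherever Prometheus is reachable.

```python
import urllib.request


def build_reload_request(base_url: str) -> urllib.request.Request:
    """Build the POST request that asks Prometheus to reload its config."""
    url = base_url.rstrip("/") + "/-/reload"
    return urllib.request.Request(url, data=b"", method="POST")


def reload_prometheus(base_url: str, timeout: float = 5.0) -> int:
    """Send the reload request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for non-2xx answers, for example
    when the lifecycle API is not enabled.
    """
    request = build_reload_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

For example, reload_prometheus("http://prometheus.od.com") would target the ingress host configured in this post, assuming the DNS record resolves from where the script runs.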

3. svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: infra
spec:
  ports:
  - port: 9090
    protocol: TCP
    targetPort: 9090
  selector:
    app: prometheus

4. ingress.yaml

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: traefik
  name: prometheus
  namespace: infra
spec:
  rules:
  - host: prometheus.od.com
    http:
      paths:
      - path: /
        backend:
          serviceName: prometheus
          servicePort: 9090

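A domain name is used here as well, so add a DNS record:

prometheus         A    10.4.7.10

Remember to restart the named service.

Create the required directories:

# mkdir -p /data/nfs-volume/prometheus/{etc,prom-db}

Edit the Prometheus configuration file (don't ask why it is written this way; the honest answer is that I don't fully understand it):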

# vi /data/nfs-volume/prometheus/etc/prometheus.yml
global:
  scrape_interval:     15s
  evaluation_interval: 15s
scrape_configs:
- job_name: 'etcd'
  tls_config:
    ca_file: /data/etc/ca.pem
    cert_file: /data/etc/client.pem
    key_file: /data/etc/client-key.pem
  scheme: https
  static_configs:
  - targets:
    - '10.4.7.12:2379'
    - '10.4.7.21:2379'
    - '10.4.7.22:2379'
- job_name: 'kubernetes-apiservers'
  kubernetes_sd_configs:
  - role: endpoints
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    action: keep
    regex: default;kubernetes;https
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
- job_name: 'kubernetes-kubelet'
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __address__
    replacement: ${1}:10255
- job_name: 'kubernetes-cadvisor'
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __address__
    replacement: ${1}:4194
- job_name: 'kubernetes-kube-state'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
  - source_labels: [__meta_kubernetes_pod_label_grafanak8sapp]
    regex: .*true.*
    action: keep
  - source_labels: ['__meta_kubernetes_pod_label_daemon', '__meta_kubernetes_pod_node_name']
    regex: 'node-exporter;(.*)'
    action: replace
    target_label: nodename
- job_name: 'blackbox_http_pod_probe'
  metrics_path: /probe
  kubernetes_sd_configs:
  - role: pod
  params:
    module: [http_2xx]
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_blackbox_scheme]
    action: keep
    regex: http
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_blackbox_port,  __meta_kubernetes_pod_annotation_blackbox_path]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+);(.+)
    replacement: $1:$2$3
    target_label: __param_target
  - action: replace
    target_label: __address__
    replacement: blackbox-exporter.kube-system:9115
  - source_labels: [__param_target]
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
- job_name: 'blackbox_tcp_pod_probe'
  metrics_path: /probe
  kubernetes_sd_configs:
  - role: pod
  params:
    module: [tcp_connect]
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_blackbox_scheme]
    action: keep
    regex: tcp
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_blackbox_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __param_target
  - action: replace
    target_label: __address__
    replacement: blackbox-exporter.kube-system:9115
  - source_labels: [__param_target]
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
- job_name: 'traefik'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
    action: keep
    regex: traefik
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
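
The relabel rules are the least obvious part of this file. For a rule with action: replace, Prometheus joins the values of source_labels with ";", matches the fully anchored regex against the joined string, and writes the expanded replacement into target_label; if the regex does not match, nothing changes. The Python sketch below imitates that behaviour for numeric references such as $1 or ${1} only. It is an approximation for study purposes, not the Prometheus implementation, and the pod address and port in the usage note are made up.

```python
import re
from typing import Dict, List


def relabel_replace(labels: Dict[str, str], source_labels: List[str],
                    regex: str, replacement: str,
                    target_label: str) -> Dict[str, str]:
    """Apply one 'replace' rule; labels come back unchanged on no match."""
    # Missing source labels count as empty strings; values are joined by ';'.
    joined = ";".join(labels.get(name, "") for name in source_labels)
    match = re.fullmatch(regex, joined)  # the regex is anchored at both ends
    if match is None:
        return dict(labels)
    # Translate $1 / ${1} references into Python's \g<1> syntax.
    template = re.sub(r"\$\{?(\d+)\}?", r"\\g<\1>", replacement)
    result = dict(labels)
    result[target_label] = match.expand(template)
    return result


PORT = "__meta_kubernetes_pod_annotation_prometheus_io_port"
```

With labels {"__address__": "172.7.21.5:80", PORT: "9102"} and the rule from the kubernetes-pods job (regex ([^:]+)(?::\d+)?;(\d+), replacement $1:$2, target __address__), the scrape address becomes 172.7.21.5:9102: the pod IP is kept and the port is taken from the annotation.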

 

Copy the certificates referenced in the configuration file:

# cd /data/nfs-volume/prometheus/etc/
# cp /opt/certs/ca.pem ./
# cp /opt/certs/client.pem ./
# cp /opt/certs/client-key.pem ./

Apply the resource manifests:

# kubectl apply -f http://k8s-yaml.od.com/prometheus-server/rbac.yaml
# kubectl apply -f http://k8s-yaml.od.com/prometheus-server/dp.yaml
# kubectl apply -f http://k8s-yaml.od.com/prometheus-server/svc.yaml
# kubectl apply -f http://k8s-yaml.od.com/prometheus-server/ingress.yaml

 

 

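Verify in a browser: prometheus.od.com

Click Status > Targets. The page lists the job names we configured in prometheus.yml; these targets basically cover our data collection needs.

Click Status > Configuration to see our configuration file.

In the configuration file, every job except etcd, which uses static configuration, relies on automatic discovery.

Static configuration: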

global:
  scrape_interval:     15s
  evaluation_interval: 15s
scrape_configs:
- job_name: 'etcd'
  tls_config:
    ca_file: /data/etc/ca.pem
    cert_file: /data/etc/client.pem
    key_file: /data/etc/client-key.pem
  scheme: https
  static_configs:
  - targets:
    - '10.4.7.12:2379'
    - '10.4.7.21:2379'
    - '10.4.7.22:2379'

Automatic discovery (the discovered resource is the pod):

- job_name: 'blackbox_http_pod_probe'
  metrics_path: /probe
  kubernetes_sd_configs:
  - role: pod
  params:
    module: [http_2xx]
  relabel_configs:
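
With pod discovery, whether a pod is scraped by a given job depends on its annotations: each pod-based job in prometheus.yml starts with a keep rule on a __meta_kubernetes_pod_annotation_* label, and Prometheus derives that label name from the annotation key by replacing unsupported characters with underscores. The Python sketch below reproduces only those annotation-based keep rules to show which jobs would pick up a pod. It leaves out the kubernetes-kube-state job, which filters on a pod label instead, and the helper names are invented for this example.

```python
import re
from typing import Dict, List

# (job name, sanitized annotation suffix, keep regex), copied from the
# annotation-based 'keep' rules in prometheus.yml.
KEEP_RULES = [
    ("kubernetes-pods", "prometheus_io_scrape", "true"),
    ("blackbox_http_pod_probe", "blackbox_scheme", "http"),
    ("blackbox_tcp_pod_probe", "blackbox_scheme", "tcp"),
    ("traefik", "prometheus_io_scheme", "traefik"),
]


def sanitize(key: str) -> str:
    """Mimic how an annotation key becomes part of a label name."""
    return re.sub(r"[^a-zA-Z0-9_]", "_", key)


def jobs_for_pod(annotations: Dict[str, str]) -> List[str]:
    """Return the jobs whose keep rule matches the pod's annotations."""
    meta = {sanitize(key): value for key, value in annotations.items()}
    return [
        job
        for job, suffix, regex in KEEP_RULES
        if re.fullmatch(regex, meta.get(suffix, "")) is not None
    ]
```

For instance, a pod annotated with prometheus.io/scrape: "true" and blackbox_scheme: "http" would be kept by both kubernetes-pods and blackbox_http_pod_probe, while a pod without any of these annotations is dropped by all four jobs.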

 


Origin www.cnblogs.com/slim-liu/p/12056414.html