Monitoring Kubernetes with Prometheus Operator

1.  Prometheus basic architecture

Prometheus is a complete open-source monitoring solution that covers the whole monitoring pipeline: data collection, querying, alerting, and visualization. The Prometheus architecture diagram is shown below:

 

Official documentation: https://prometheus.io/docs/introduction/overview/

 

2.  Component Description

The Prometheus ecosystem consists of several components, many of which are optional.

  • Prometheus server

Required. It is essentially a time-series database: it pulls, stores, and analyzes data, and provides the PromQL query language.

  • Push Gateway

Optional. An intermediate gateway that lets short-lived jobs actively push their metrics.

  • exporters

Agents deployed on the client side, such as node_exporter and mysql_exporter.

A component that exposes monitoring information over an HTTP interface is called an exporter. Exporters already exist for most of the components commonly used by Internet companies and can be used directly, for example Varnish, HAProxy, Nginx, MySQL, and Linux system information (disk, memory, CPU, network, and so on). See: https://prometheus.io/docs/instrumenting/exporters/

  • alertmanager

Handles alerting. The Prometheus server evaluates its rules and sends triggered alerts to the alertmanager component, which then sends notifications (e-mail or webhook) according to its own rules.
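To make the exporter idea more concrete, here is a minimal Python sketch of the kind of plain-text payload an exporter returns on its HTTP metrics endpoint (the Prometheus text exposition format). The function name render_metrics, the metric name demo_disk_free_bytes, and the sample values are made up for illustration; real exporters such as node_exporter are far more complete.

```python
def render_metrics(name, help_text, samples):
    """Render one gauge metric in the Prometheus text exposition format.

    `samples` maps a tuple of (label, value) pairs to a numeric sample.
    All names used here are hypothetical and exist only for this sketch.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples.items():
        # Labels are rendered as key="value" pairs inside curly braces.
        label_str = ",".join(f'{key}="{val}"' for key, val in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


text = render_metrics(
    "demo_disk_free_bytes",
    "Free disk space in bytes (made-up example metric).",
    {(("mountpoint", "/"),): 1024, (("mountpoint", "/data"),): 2048},
)
print(text)
```

The Prometheus server pulls text like this on every scrape and stores each line as a sample of a time series.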

3. Prometheus-Operator

Prometheus-Operator architecture diagram:

 

The figure is the official Prometheus-Operator architecture diagram. The Operator is the core part: acting as a controller, it creates four CRD resource types (Prometheus, ServiceMonitor, Alertmanager, and PrometheusRule) and then continuously watches and maintains the state of these four kinds of resource objects.

The Prometheus resource object represents the Prometheus server, while a ServiceMonitor is an abstraction over exporters. As described earlier, an exporter is a tool dedicated to exposing a metrics data interface, and Prometheus pulls data from the metrics endpoints that ServiceMonitors describe. Likewise, the Alertmanager resource is the abstraction of the alertmanager component, and a PrometheusRule holds the alerting rule files used by Prometheus instances.

Deciding what to monitor in the cluster therefore becomes a matter of operating on Kubernetes resource objects directly, which is much more convenient. In the figure above, Service and ServiceMonitor are both Kubernetes resources: a ServiceMonitor uses a labelSelector to match a class of Services, and a Prometheus resource can likewise use a labelSelector to match multiple ServiceMonitors.
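The label-matching idea can be sketched in a few lines of Python. This is only an illustration of equality-based matching (the matchLabels style); the Service names and labels below are invented, and the real selection is done by the Operator and the Kubernetes API, not by code like this.

```python
def selector_matches(match_labels, labels):
    """Return True if every key/value pair of the selector is present in `labels`."""
    return all(labels.get(key) == value for key, value in match_labels.items())


# Hypothetical Services and their labels.
services = {
    "orders": {"app": "shop", "tier": "backend"},
    "payments": {"app": "shop", "tier": "backend"},
    "blog": {"app": "cms"},
}

# Hypothetical selector of a ServiceMonitor: match every Service labeled app=shop.
service_monitor_selector = {"app": "shop"}

selected = sorted(
    name
    for name, labels in services.items()
    if selector_matches(service_monitor_selector, labels)
)
print(selected)
```

The same kind of matching applies one level up, where a Prometheus resource selects ServiceMonitors by their labels.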

4.  Prometheus-Operator deployment

The official chart Address: https://github.com/helm/charts/tree/master/stable/prometheus-operator

Search for the latest chart and download it locally:

# Search

helm search prometheus-operator

NAME                            CHART VERSION   APP VERSION     DESCRIPTION                                
stable/prometheus-operator      6.4.0           0.31.0          Provides easy monitoring definitions for Kubernetes servi...

# Download the chart and unpack it into ./prometheus-operator

helm fetch stable/prometheus-operator --untar

Installation

# Create the monitoring namespace

kubectl create ns monitoring

# Install

helm install -f ./prometheus-operator/values.yaml --name prometheus-operator --namespace=monitoring ./prometheus-operator

# Update

helm upgrade -f prometheus-operator/values.yaml prometheus-operator ./prometheus-operator

# Uninstall prometheus-operator

helm delete prometheus-operator --purge

# Delete the CRDs

kubectl delete customresourcedefinitions prometheuses.monitoring.coreos.com prometheusrules.monitoring.coreos.com servicemonitors.monitoring.coreos.com
kubectl delete customresourcedefinitions alertmanagers.monitoring.coreos.com
kubectl delete customresourcedefinitions podmonitors.monitoring.coreos.com

Modify the configuration file values.yaml

4.1.  E-mail alerts

  config:
    global:
      resolve_timeout: 5m
      smtp_smarthost: 'smtp.qq.com:465'
      smtp_from: '[email protected]'
      smtp_auth_username: '[email protected]'
      smtp_auth_password: 'xreqcqffrxtnieff'
      smtp_hello: '163.com'
      smtp_require_tls: false
    route:
      group_by: ['job','severity']
      group_wait: 30s
      group_interval: 1m
      repeat_interval: 12h
      receiver: default
      routes:
      - receiver: webhook
        match:
          alertname: TargetDown
    receivers:
    - name: default
      email_configs:
      - to: '[email protected]'
        send_resolved: true
    - name: webhook
      email_configs:
      - to: '[email protected]'
        send_resolved: true

There is a pitfall here; see: https://www.cnblogs.com/Dev0ps/p/11320177.html
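As a rough illustration of how the route section of the e-mail alert configuration picks a receiver, the following Python sketch walks a routing tree shaped like that configuration. It is a deliberate simplification under stated assumptions: only exact match rules are handled, and grouping, timing, and other Alertmanager routing options are ignored. The function name pick_receiver is hypothetical.

```python
def pick_receiver(route, alert_labels):
    """Return the receiver name for an alert, checking child routes first.

    Simplified sketch: only exact `match` rules are evaluated.
    """
    for child in route.get("routes", []):
        matchers = child.get("match", {})
        if all(alert_labels.get(key) == value for key, value in matchers.items()):
            return pick_receiver(child, alert_labels)
    # No child route matched, so the route's own receiver is used.
    return route["receiver"]


# Same shape as the route section in the configuration shown earlier.
route = {
    "receiver": "default",
    "routes": [
        {"receiver": "webhook", "match": {"alertname": "TargetDown"}},
    ],
}

print(pick_receiver(route, {"alertname": "TargetDown", "severity": "warning"}))
print(pick_receiver(route, {"alertname": "KubePodCrashLooping"}))
```

With this tree, TargetDown alerts go to the webhook receiver and every other alert falls back to default.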

4.2.  Prometheus persistent storage

  storage:
      volumeClaimTemplate:
        spec:
          storageClassName: nfs-client
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 50Gi

4.3.  Grafana persistence

Path: prometheus-operator/charts/grafana/values.yaml

persistence:
  enabled: true
  storageClassName: "nfs-client"
  accessModes:
    - ReadWriteOnce
  size: 10Gi

4.4.  Auto Discovery Service

     - job_name: 'kubernetes-service-endpoints'

       kubernetes_sd_configs:

         - role: endpoints

       relabel_configs:

       - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]

         action: keep

         regex: true

       - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]

         action: replace

         target_label: __scheme__

         regex: (https?)

       - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]

         action: replace

         target_label: __metrics_path__

         regex: (.+)

       - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]

         action: replace

         target_label: __address__

         regex: ([^:]+)(?::\d+)?;(\d+)

         replacement: $1:$2

       - action: labelmap

         regex: __meta_kubernetes_service_label_(.+)

       - source_labels: [__meta_kubernetes_namespace]

         action: replace

         target_label: kubernetes_namespace

       - source_labels: [__meta_kubernetes_service_name]

         action: replace

         target_label: kubernetes_name

     - job_name: 'kubernetes-pod'

       kubernetes_sd_configs:

         - role: pod

       relabel_configs:

       - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]

         action: keep

         regex: true

       - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]

         action: replace

         target_label: __metrics_path__

         regex: (.+)

       - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]

         action: replace

         regex: ([^:]+)(?::\d+)?;(\d+)

         replacement: $1:$2

         target_label: __address__

       - action: labelmap

         regex: __meta_kubernetes_pod_label_(.+)

       - source_labels: [__meta_kubernetes_namespace]

         action: replace

         target_label: kubernetes_namespace

       - source_labels: [__meta_kubernetes_pod_name]

         action: replace

         target_label: kubernetes_pod_name

     - job_name: istio-mesh

       scrape_interval: 15s

       scrape_timeout: 10s

       metrics_path: /metrics

       scheme: http

       kubernetes_sd_configs:

       - api_server: null

         role: endpoints

         namespaces:

           names:

           - istio-system

       relabel_configs:

       - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]

         separator: ;

         regex: istio-telemetry;prometheus

         replacement: $1

         action: keep

     - job_name: envoy-stats

       scrape_interval: 15s

       scrape_timeout: 10s

       metrics_path: /stats/prometheus

       scheme: http

       kubernetes_sd_configs:

       - api_server: null

         role: pod

         namespaces:

           names: []

       relabel_configs:

       - source_labels: [__meta_kubernetes_pod_container_port_name]

         separator: ;

         regex: .*-envoy-prom

         replacement: $1

         action: keep

       - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]

         separator: ;

         regex: ([^:]+)(?::\d+)?;(\d+)

         target_label: __address__

         replacement: $1:15090

         action: replace

       - separator: ;

         regex: __meta_kubernetes_pod_label_(.+)

         replacement: $1

         action: labelmap

       - source_labels: [__meta_kubernetes_namespace]

         separator: ;

         regex: (.*)

         target_label: namespace

         replacement: $1

         action: replace

       - source_labels: [__meta_kubernetes_pod_name]

         separator: ;

         regex: (.*)

         target_label: pod_name

         replacement: $1

         action: replace

       metric_relabel_configs:

       - source_labels: [cluster_name]

         separator: ;

         regex: (outbound|inbound|prometheus_stats).*

         replacement: $1

         action: drop

       - source_labels: [tcp_prefix]

         separator: ;

         regex: (outbound|inbound|prometheus_stats).*

         replacement: $1

         action: drop

       - source_labels: [listener_address]

         separator: ;

         regex: (.+)

         replacement: $1

         action: drop

       - source_labels: [http_conn_manager_listener_prefix]

         separator: ;

         regex: (.+)

         replacement: $1

         action: drop

       - source_labels: [http_conn_manager_prefix]

         separator: ;

         regex: (.+)

         replacement: $1

         action: drop

       - source_labels: [__name__]

         separator: ;

         regex: envoy_tls.*

         replacement: $1

         action: drop

       - source_labels: [__name__]

         separator: ;

         regex: envoy_tcp_downstream.*

         replacement: $1

         action: drop

       - source_labels: [__name__]

         separator: ;

         regex: envoy_http_(stats|admin).*

         replacement: $1

         action: drop

       - source_labels: [__name__]

         separator: ;

         regex: envoy_cluster_(lb|retry|bind|internal|max|original).*

         replacement: $1

         action: drop

     - job_name: istio-policy

       scrape_interval: 15s

       scrape_timeout: 10s

       metrics_path: /metrics

       scheme: http

       kubernetes_sd_configs:

       - api_server: null

         role: endpoints

         namespaces:

           names:

           - istio-system

       relabel_configs:

       - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]

         separator: ;

         regex: istio-policy;http-monitoring

         replacement: $1

         action: keep

     - job_name: istio-telemetry

       scrape_interval: 15s

       scrape_timeout: 10s

       metrics_path: /metrics

       scheme: http

       kubernetes_sd_configs:

       - api_server: null

         role: endpoints

         namespaces:

           names:

           - istio-system

       relabel_configs:

       - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]

         separator: ;

         regex: istio-telemetry;http-monitoring

         replacement: $1

         action: keep

     - job_name: pilot

       scrape_interval: 15s

       scrape_timeout: 10s

       metrics_path: /metrics

       scheme: http

       kubernetes_sd_configs:

       - api_server: null

         role: endpoints

         namespaces:

           names:

           - istio-system

       relabel_configs:

       - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]

         separator: ;

         regex: istio-pilot;http-monitoring

         replacement: $1

         action: keep

     - job_name: galley

       scrape_interval: 15s

       scrape_timeout: 10s

       metrics_path: /metrics

       scheme: http

       kubernetes_sd_configs:

       - api_server: null

         role: endpoints

         namespaces:

           names:

           - istio-system

       relabel_configs:

       - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]

         separator: ;

         regex: istio-galley;http-monitoring

         replacement: $1

         action: keep

     - job_name: citadel

       scrape_interval: 15s

       scrape_timeout: 10s

       metrics_path: /metrics

       scheme: http

       kubernetes_sd_configs:

       - api_server: null

         role: endpoints

         namespaces:

           names:

           - istio-system

       relabel_configs:

       - source_labels: [__meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]

         separator: ;

         regex: istio-citadel;http-monitoring

         replacement: $1

         action: keep

     - job_name: kubernetes-pods-istio-secure

       scrape_interval: 15s

       scrape_timeout: 10s

       metrics_path: /metrics

       scheme: https

       kubernetes_sd_configs:

       - api_server: null

         role: pod

         namespaces:

           names: []

       tls_config:

         ca_file: /etc/istio-certs/root-cert.pem

         cert_file: /etc/istio-certs/cert-chain.pem

         key_file: /etc/istio-certs/key.pem

         insecure_skip_verify: true

       relabel_configs:

       - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]

         separator: ;

         regex: "true"

         replacement: $1

         action: keep

       - source_labels: [__meta_kubernetes_pod_annotation_sidecar_istio_io_status, __meta_kubernetes_pod_annotation_istio_mtls]

         separator: ;

         regex: (([^;]+);([^;]*))|(([^;]*);(true))

         replacement: $1

         action: keep

       - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]

         separator: ;

         regex: (http)

         replacement: $1

         action: drop

       - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]

         separator: ;

         regex: (.+)

         target_label: __metrics_path__

         replacement: $1

         action: replace

       - source_labels: [__address__]

         separator: ;

         regex: ([^:]+):(\d+)

         replacement: $1

         action: keep

       - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]

         separator: ;

         regex: ([^:]+)(?::\d+)?;(\d+)

         target_label: __address__

         replacement: $1:$2

         action: replace

       - separator: ;

         regex: __meta_kubernetes_pod_label_(.+)

         replacement: $1

         action: labelmap

       - source_labels: [__meta_kubernetes_namespace]

         separator: ;

         regex: (.*)

         target_label: namespace

         replacement: $1

         action: replace

       - source_labels: [__meta_kubernetes_pod_name]

         separator: ;

         regex: (.*)

         target_label: pod_name

         replacement: $1

         action: replace
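Several of the jobs above rewrite __address__ with the regex ([^:]+)(?::\d+)?;(\d+) and the replacement $1:$2. The Python sketch below mimics a single replace step to show what that rule does. It is an approximation, not Prometheus code: the helper name relabel_replace and the sample addresses are invented, and Python's re module stands in for the RE2 engine that Prometheus uses.

```python
import re


def relabel_replace(labels, source_labels, regex, replacement, target_label, separator=";"):
    """Mimic one Prometheus `replace` relabel step on a dict of labels."""
    # Source label values are joined with the separator before matching.
    joined = separator.join(labels.get(name, "") for name in source_labels)
    # Prometheus anchors the regex at both ends, hence fullmatch.
    match = re.fullmatch(regex, joined)
    if match is None:
        return dict(labels)  # no match: the target label is left untouched
    # Expand $1, $2, ... in the replacement with the captured groups.
    value = re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))) or "", replacement)
    result = dict(labels)
    result[target_label] = value
    return result


# Hypothetical pod that asks to be scraped on port 9102 via its annotation.
pod = {
    "__address__": "10.0.0.5:8080",
    "__meta_kubernetes_pod_annotation_prometheus_io_port": "9102",
}
rule = dict(
    source_labels=["__address__", "__meta_kubernetes_pod_annotation_prometheus_io_port"],
    regex=r"([^:]+)(?::\d+)?;(\d+)",
    replacement="$1:$2",
    target_label="__address__",
)

relabeled = relabel_replace(pod, **rule)
print(relabeled["__address__"])

# Without the port annotation the regex does not match and the address stays as discovered.
unchanged = relabel_replace({"__address__": "10.0.0.5:8080"}, **rule)
print(unchanged["__address__"])
```

In other words, the rule keeps the host part of the discovered address and swaps in the port from the prometheus.io/port annotation when one is set.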

4.5. etcd

etcd clusters generally enable HTTPS certificate authentication for security, so for Prometheus to access the etcd monitoring data we need to provide the appropriate certificates.

Because the demo environment here is a cluster built with kubeadm, we can use kubectl to look up the certificate paths that etcd was started with:

[root@cn-hongkong ~]# kubectl get pod etcd-cn-hongkong.i-j6caps6av1mtyxyofmrw -n kube-system -o yaml

 

We can see that the etcd certificates are located under /etc/kubernetes/pki/etcd on the corresponding node, so first we need to store the required certificates in the cluster as a Secret object (run the following on the etcd node):

1)  Manually fetch the etcd metrics

curl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/healthcheck-client.crt --key /etc/kubernetes/pki/etcd/healthcheck-client.key https://172.31.182.152:2379/metrics

2)  Scrape with Prometheus: create the Secret that holds the certificates

kubectl -n monitoring create secret generic etcd-certs --from-file=/etc/kubernetes/pki/etcd/healthcheck-client.crt --from-file=/etc/kubernetes/pki/etcd/healthcheck-client.key --from-file=/etc/kubernetes/pki/etcd/ca.crt

3)  Add the kubeEtcd configuration to values.yaml

## Component scraping etcd
##
kubeEtcd:
  enabled: true

  ## If your etcd is not deployed as a pod, specify IPs it can be found on
  ##
  endpoints: []

  ## Etcd service. If using kubeEtcd.endpoints only the port and targetPort are used
  ##
  service:
    port: 2379
    targetPort: 2379
    selector:
      component: etcd

  ## Configure secure access to the etcd cluster by loading a secret into prometheus and
  ## specifying security configuration below. For example, with a secret named etcd-client-cert
  ##
  serviceMonitor:
    scheme: https
    insecureSkipVerify: true
    serverName: localhost
    caFile: /etc/prometheus/secrets/etcd-certs/ca.crt
    certFile: /etc/prometheus/secrets/etcd-certs/healthcheck-client.crt
    keyFile: /etc/prometheus/secrets/etcd-certs/healthcheck-client.key

4)  Add the etcd-certs Secret created above to the Prometheus configuration (especially important)

    ## Secrets is a list of Secrets in the same namespace as the Prometheus object, which shall be mounted into the Prometheus Pods.
    ## The Secrets are mounted into /etc/prometheus/secrets/. Secrets changes after initial creation of a Prometheus object are not
    ## reflected in the running Pods. To change the secrets mounted into the Prometheus Pods, the object must be deleted and recreated
    ## with the new list of secrets.
    ##
    secrets:
    - etcd-certs

After installation, the certificates appear inside the Prometheus pod, under /etc/prometheus/secrets/.

 

4.6.  Scraping a custom service

We need to create a ServiceMonitor. Under namespaceSelector, any: true means that it matches Services in all namespaces that carry the label app=sscp-transaction.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: sscp-transaction
    release: prometheus-operator
  name: springboot
  namespace: monitoring
spec:
  endpoints:
  - interval: 15s
    path: /actuator/prometheus
    port: health
    scheme: http
  namespaceSelector:
    any: true
#    matchNames:
#    - sscp-dev
  selector:
    matchLabels:
      app: sscp-transaction
#      release: sscp

Renderings:

 

 

 

 

 

Origin www.cnblogs.com/Dev0ps/p/11465819.html