How to use Prometheus-Operator for K8s cluster monitoring

This article is shared from the Huawei Cloud Community post "Prometheus-Operator Getting Started Guide" by the author "You can make a friend".

1. Background

Without the operator, we monitor a k8s cluster by maintaining a ConfigMap that holds the Prometheus service-discovery and scrape configuration. Switching to prometheus-operator inevitably changes this workflow: many users are accustomed to the automatic discovery driven by that hand-written configuration and find ServiceMonitor or PodMonitor unfamiliar at first. This article therefore introduces Prometheus-Operator and shows how to use a ServiceMonitor.

2. Introduction to Prometheus-Operator

Prometheus Operator provides Kubernetes-native deployment and management of Prometheus and related monitoring components. The purpose of the project is to simplify and automate the configuration of a Prometheus-based monitoring stack. It mainly provides the following features:

  • Kubernetes custom resources: use Kubernetes CRDs to deploy and manage Prometheus, Alertmanager and related components

  • Simplified deployment configuration: configure Prometheus directly through Kubernetes resource manifests, such as version, persistence, replicas, retention policy, etc. (a minimal sketch follows this list)

  • Prometheus target configuration: scrape target configuration is generated automatically from familiar Kubernetes label queries, with no need to learn Prometheus-specific configuration syntax
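
As an illustration of the second point, the following is a minimal sketch of a Prometheus custom resource. The field names come from the monitoring.coreos.com/v1 API, but the concrete values (name, replicas, version, retention, storage size, service account) are illustrative assumptions rather than values taken from this article:

# A minimal, hypothetical Prometheus custom resource; the operator turns
# this manifest into a running Prometheus StatefulSet.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example                  # example name, not used elsewhere in this article
  namespace: monitoring
spec:
  replicas: 2                    # number of Prometheus replicas
  version: v2.41.0               # Prometheus version the operator should run
  retention: 15d                 # how long samples are kept
  serviceAccountName: prometheus # a ServiceAccount with RBAC to discover targets (assumption)
  serviceMonitorSelector: {}     # empty selector: pick up all ServiceMonitors
  storage:                       # optional persistence via a volumeClaimTemplate
    volumeClaimTemplate:
      spec:
        resources:
          requests:
            storage: 10Gi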

2.1 Architecture

The figure below is the architecture diagram provided by the Prometheus-Operator project. Each component runs in the Kubernetes cluster in its own way. The Operator is the core: acting as a controller, it creates the Prometheus, ServiceMonitor, Alertmanager and PrometheusRule custom resource objects, and then continuously watches and maintains the state of these resources.

[Figure: Prometheus-Operator architecture diagram]

The following three YAML manifests illustrate how a Prometheus object is associated with a ServiceMonitor, and how the ServiceMonitor in turn selects the target Service.

[Figure: how Prometheus, ServiceMonitor and the target Service are associated through label selectors]

To let Prometheus monitor an application running in k8s, Prometheus-Operator uses a ServiceMonitor to match the Endpoints that are automatically populated by a Service object, and configures Prometheus to scrape the pods behind those Endpoints. The endpoints section of ServiceMonitor.spec specifies which ports of the Endpoints will be scraped for metrics.

The ServiceMonitor object is a clever abstraction: it decouples the "monitoring requirement" from whoever implements it. A ServiceMonitor only needs a label selector, a simple and universal mechanism, to declare that requirement: which Endpoints should be scraped and how. Users only state what they need, which is a clean separation of concerns. Of course, the operator ultimately converts the ServiceMonitor into the original, more complex scrape config, but that complexity is entirely hidden by the operator. (A sketch of this label-selector chain follows.)
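
The same association can also be expressed as a hedged sketch. The resource names and label values below are illustrative assumptions; only the field names (serviceMonitorSelector, selector.matchLabels, endpoints, and the Service's selector) come from the actual APIs:

# 1) The Prometheus object selects ServiceMonitors by label.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example
spec:
  serviceMonitorSelector:
    matchLabels:
      team: frontend             # picks up ServiceMonitors labeled team=frontend
---
# 2) The ServiceMonitor selects Services (and thus their Endpoints) by label.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend               # must match the serviceMonitorSelector above
spec:
  selector:
    matchLabels:
      app: example-app           # picks up Services labeled app=example-app
  endpoints:
  - port: web                    # named Service port to scrape
---
# 3) The Service's own selector determines which pods back its Endpoints.
apiVersion: v1
kind: Service
metadata:
  name: example-app
  labels:
    app: example-app             # must match the ServiceMonitor's selector
spec:
  selector:
    app: example-app             # pods carrying this label are ultimately scraped
  ports:
  - name: web
    port: 8080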

The figure below shows which resources are involved when configuring alerting with Prometheus, and the role each resource plays.

[Figure: resources involved in configuring Prometheus alerting]

First, the application's metrics are collected by configuring a ServiceMonitor or PodMonitor;

Next, the Prometheus.spec.alerting field points Prometheus at the matching Alertmanager instances;

Then, alerting rules for the collected metrics are defined through a PrometheusRule;

Finally, the alert receivers are configured: an AlertmanagerConfig defines how alerts are handled, including how they are received, routed, inhibited and sent (a hedged example of a PrometheusRule and an AlertmanagerConfig follows).
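
To make the last two steps concrete, here is a minimal sketch of a PrometheusRule and an AlertmanagerConfig. The alert name, threshold, labels and webhook URL are illustrative assumptions rather than values from this article; the field names follow the monitoring.coreos.com/v1 and monitoring.coreos.com/v1alpha1 APIs:

# Hypothetical alerting rule: fire when the sample app receives more than
# 10 requests per second for 5 minutes.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: sample-metrics-rules
  labels:
    role: alert-rules            # label expected by the Prometheus ruleSelector (verify in your setup)
spec:
  groups:
  - name: sample-metrics-app.rules
    rules:
    - alert: HighRequestRate
      expr: rate(http_requests_total[5m]) > 10
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Request rate is unusually high"
---
# Hypothetical routing and receiver configuration for those alerts.
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: sample-metrics-alerting
spec:
  route:
    receiver: team-webhook
    groupBy: ['alertname']
  receivers:
  - name: team-webhook
    webhookConfigs:
    - url: "http://example.internal/alert-hook"   # placeholder receiver endpoint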

2.2 Common CRDs

Prometheus: defines the desired Prometheus deployment.

ServiceMonitor: declaratively specifies how a group of Kubernetes Services should be monitored. The operator automatically generates the Prometheus scrape configuration based on the current state of the objects in the API server.

PodMonitor: declaratively specifies how a group of pods should be monitored. The operator automatically generates the Prometheus scrape configuration based on the current state of the objects in the API server.

PrometheusRule: defines a set of Prometheus alerting and/or recording rules. The operator generates a rule file that can be consumed by Prometheus instances.

Alertmanager: defines the desired Alertmanager deployment.

AlertmanagerConfig: declaratively specifies a subsection of the Alertmanager configuration, allowing alerts to be routed to custom receivers and inhibition rules to be defined.

Probe: declaratively specifies how a group of ingresses or static targets should be monitored; the operator automatically generates the Prometheus scrape configuration from the definition. Used together with the blackbox exporter (a sketch follows this list of CRDs).

ThanosRuler: defines the desired Thanos Ruler deployment.
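
As an example of the Probe CRD mentioned above, here is a hedged sketch that probes a static URL through a blackbox exporter. The prober address, module name and target are assumptions that depend on how the blackbox exporter is deployed in your cluster:

# Hypothetical Probe: asks a blackbox exporter to check a static URL.
apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: example-website-probe
spec:
  prober:
    url: blackbox-exporter.monitoring.svc:19115   # address of the blackbox exporter service (assumption)
  module: http_2xx                                # module defined in the blackbox exporter configuration
  targets:
    staticConfig:
      static:
      - https://example.com                       # target URL to probe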

3. Prometheus-Operator installation

Prometheus-Operator has requirements on the Kubernetes cluster version. Consult the compatibility matrix in the kube-prometheus repository and choose the release that matches your cluster version: https://github.com/prometheus-operator/kube-prometheus

[Figure: kube-prometheus compatibility matrix]

The environment used in this article is a Kubernetes 1.25 cluster, which corresponds to kube-prometheus release 0.12: https://github.com/prometheus-operator/kube-prometheus/archive/refs/heads/release-0.12.zip

3.1 Installation

wget https://github.com/prometheus-operator/kube-prometheus/archive/refs/heads/release-0.12.zip
unzip release-0.12.zip
cd kube-prometheus-release-0.12
kubectl apply --server-side -f manifests/setup
kubectl wait \
  --for condition=Established \
  --all CustomResourceDefinition \
  --namespace=monitoring
kubectl apply -f manifests/


Note: the kube-state-metrics and prometheus-adapter images are hosted on Google's official registry, which may not be reachable from mainland China. If these pods cannot start because the image cannot be pulled, replace the image with a mirror address your cluster can reach.
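
As a sketch of what such a replacement looks like, the image field of the affected Deployment manifest can be pointed at a mirror. The mirror registry and tag below are made-up placeholders; check the actual manifest file names, image names and tags in the manifests/ directory of your release:

# Excerpt of a kube-state-metrics Deployment manifest with the image swapped
# for a hypothetical mirror; only the image line changes.
      containers:
      - name: kube-state-metrics
        # keep the same image tag that the manifest originally pins; only change the registry prefix
        image: registry.example.com/mirror/kube-state-metrics:v2.8.0   # mirror address and tag here are assumptions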

3.2 Uninstall

Note: this step uninstalls the stack. If you want to keep Prometheus-Operator, do not run it.

kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup

4. Use servicemonitor to monitor the indicators exposed by the application

Create a Deployment and a Service. The application exposes its own metrics on container port 8080, which the Service publishes as the named port web.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: sample-metrics-app
  name: sample-metrics-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sample-metrics-app
  template:
    metadata:
      labels:
        app: sample-metrics-app
    spec:
      tolerations:
      - key: beta.kubernetes.io/arch
        value: arm
        effect: NoSchedule
      - key: beta.kubernetes.io/arch
        value: arm64
        effect: NoSchedule
      - key: node.alpha.kubernetes.io/unreachable
        operator: Exists
        effect: NoExecute
        tolerationSeconds: 0
      - key: node.alpha.kubernetes.io/notReady
        operator: Exists
        effect: NoExecute
        tolerationSeconds: 0
      containers:
      - image: luxas/autoscale-demo:v0.1.2
        name: sample-metrics-app
        ports:
        - name: web
          containerPort: 8080
        readinessProbe:
          httpGet:
            path: /
            port: 8080
          initialDelaySeconds: 3
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /
            port: 8080
          initialDelaySeconds: 3
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: sample-metrics-app
  labels:
    app: sample-metrics-app
spec:
  ports:
  - name: web
    port: 80
    targetPort: 8080
  selector:
    app: sample-metrics-app

Create a ServiceMonitor object to collect the application's metrics:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: sample-metrics-app
  labels:
    service-monitor: sample-metrics-app
spec:
  selector:
    matchLabels:
      app: sample-metrics-app   # select the Endpoints of Services labeled app: sample-metrics-app
  endpoints:
  - port: web                   # the port Prometheus scrapes; refers to the named port (web) in the Service

View the newly created Service, then access the application through the Service IP from a node in the cluster:

kubectl get service


You can view the metrics exposed by the application by querying the /metrics endpoint at the Service IP:

curl 10.247.227.116/metrics


The metric exposed by the application is http_requests_total, and its value at the time of this request is 805.

Open the Prometheus UI in a browser to inspect the metric: access prometheus-server via its IP and port to check the ServiceMonitor target and the metric's collection status.

[Figure: Prometheus UI showing the ServiceMonitor target and the collected http_requests_total metric]

You can see that the metric exposed by the application has been collected successfully. Because of the scrape interval, the value recorded by Prometheus (800) lags slightly behind the value currently reported by the application's /metrics endpoint (805).

 

Click to follow and learn about Huawei Cloud’s new technologies as soon as possible~

