Use Prometheus to monitor Spring Cloud


Prometheus

Prometheus, a Cloud Native Computing Foundation project, is a system and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays results, and triggers alerts when specified conditions are observed.

The features that distinguish Prometheus from other metrics and monitoring systems are:

  • Multidimensional data model (time series defined by metric name and set of key/value dimensions)
  • PromQL, a powerful and flexible query language that leverages this dimensionality
  • Does not rely on distributed storage; individual server nodes are autonomous
  • HTTP pull model for time series collection
  • Support for pushing time series through an intermediate gateway for batch jobs
  • Discover targets through service discovery or static configuration
  • Multiple modes of graphing and dashboarding support
  • Support for hierarchical and horizontal federation

Architecture Overview

Install

Download the latest release of Prometheus for your platform, then extract it:

tar xvfz prometheus-*.tar.gz
cd prometheus-*

Before starting Prometheus, let's configure it.

prometheus.yml

# my global config
global:
  scrape_interval:     10s # Set the scrape interval to every 10 seconds. The default is every 1 minute.
  evaluation_interval: 10s # Evaluate rules every 10 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
       - 127.0.0.1:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules/*.yml"
  # - "second_rules.yml"

# Scrape configurations: each job below scrapes one Spring Boot service
# through its Actuator endpoint.
scrape_configs:
  # Configure jobs; several services can be configured at the same time
  - job_name: 'name1'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 5s
    scheme: "https"
    static_configs:
      - targets: ['xxx.com']

  - job_name: 'name2'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 5s
    scheme: "http"
    static_configs:
      - targets: ['abc.com']
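
Before the first start it can help to validate the file. The sketch below is not from the original post: it assumes it is run from the extracted prometheus-* directory, where the release archive places the promtool and prometheus binaries, and that prometheus.yml there has been replaced with the file shown above. The function name is only illustrative.

```shell
# Hypothetical helper: validate prometheus.yml, then start the server.
# Assumes the current directory is the extracted prometheus-* directory.
check_and_start_prometheus() {
  # promtool also checks the rule files referenced under rule_files.
  ./promtool check config prometheus.yml &&
    ./prometheus --config.file=prometheus.yml
}
```

Calling check_and_start_prometheus starts the server only if validation passes. By default the web UI then listens on port 9090.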

Alerting rules

rules/xx_rules.yml

groups:
- name: nqi-down
  rules:
  - alert: gx-node-down
    expr: up{instance="xxx.com:443"} == 0
    for: 10s
    labels:
      status: High
      team: xxx
    annotations:
      description: "xxx is down!!!"
      summary:  "The xxx service has stopped, please take note!!!"
      
  - alert: test-node-down
    expr: up{instance="abc.com:80"} == 0
    for: 5s
    labels:
      status: Warn
      team: test
    annotations:
      description: "abc is down!!!"
      summary:  "The abc service has stopped, please take note!!!"
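
A rule file can be checked before Prometheus loads it, and the alert expression can be tried by hand against the HTTP API. The following is a sketch under assumptions that are not in the original post: Prometheus runs at its default address http://localhost:9090, check_rules is called from the extracted prometheus-* directory, and the names PROM_URL, check_rules, and prom_query are illustrative rather than part of Prometheus.

```shell
# Assumption: Prometheus is reachable at its default address.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Hypothetical helper: validate the syntax of the rule file shown above.
check_rules() {
  ./promtool check rules rules/xx_rules.yml
}

# Hypothetical helper: run an instant PromQL query via /api/v1/query.
# -G sends the data as a GET query string; --data-urlencode escapes
# the braces and quotes in the label selector.
prom_query() {
  curl -s -G "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}
```

For example, prom_query 'up{instance="xxx.com:443"}' returns JSON in which the sample value is 1 while the last scrape succeeded and 0 when it failed. The == 0 comparison in the alert expression keeps only the failing targets.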

Install Alertmanager: download it and extract it

tar xvfz alertmanager-*.tar.gz
cd alertmanager-*

Configure

alertmanager.yml

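global: 
  resolve_timeout: 5m # timeout after which an alert that is no longer updated is treated as resolved
  smtp_smarthost: 'smtp.xxx.com:465' # SMTP server address
  smtp_from: '[email protected]' # sender address
  smtp_auth_username: '[email protected]' # SMTP user name
  smtp_auth_password: 'W7CKmqD2x0iGXM9R' # the mailbox authorization code, not the login password
  smtp_require_tls: false # whether to require TLS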

templates:
  - ./email.html

route:
  group_by: ['abc']
  group_wait: 3s
  group_interval: 10s
  repeat_interval: 3h
  receiver: 'mail'
receivers:
- name: 'mail'
  email_configs: # email configuration
  - to: '[email protected], [email protected]' # email addresses of the alert recipients
    send_resolved: true  # also send a notification when the alert is resolved
    html: '{{ template "email.html" . }}'
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
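
The configuration can be validated and Alertmanager started in one step. This is a minimal sketch, not part of the original post: it assumes the current directory is the extracted alertmanager-* directory, which contains the alertmanager and amtool binaries, and that alertmanager.yml is the file shown above. The function name is illustrative.

```shell
# Hypothetical helper: validate alertmanager.yml, then start Alertmanager.
# Assumes the current directory is the extracted alertmanager-* directory.
check_and_start_alertmanager() {
  # amtool ships in the same release archive as the alertmanager binary.
  ./amtool check-config alertmanager.yml &&
    ./alertmanager --config.file=alertmanager.yml
}
```

Alertmanager listens on port 9093 by default, which matches the 127.0.0.1:9093 target in prometheus.yml.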

Code

Add the pom dependencies

<!-- Monitoring -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

Add the following configuration to application.properties

# Monitoring
management.endpoints.web.exposure.include=prometheus
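
Once the application is running, the endpoint can be checked by hand before pointing Prometheus at it. The sketch below assumes the application listens on localhost:8080 (the Spring Boot default port); APP_URL and show_app_metrics are illustrative names that do not appear in the original post.

```shell
# Assumption: the Spring Boot application runs locally on port 8080.
APP_URL="${APP_URL:-http://localhost:8080}"

# Hypothetical helper: print the first lines of the metrics endpoint,
# the same path that prometheus.yml sets as metrics_path.
show_app_metrics() {
  curl -s "${APP_URL}/actuator/prometheus" | head -n 20
}
```

The response is plain text in the Prometheus exposition format: comment lines starting with # HELP and # TYPE, each followed by the samples of that metric.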

Install Grafana

wget https://dl.grafana.com/enterprise/release/grafana-enterprise-8.4.6.linux-amd64.tar.gz
tar -zxvf grafana-enterprise-8.4.6.linux-amd64.tar.gz
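
After extraction, Grafana can be started from the unpacked directory. This is a sketch that assumes the archive unpacks into a single grafana-* directory next to the downloaded file; the function name is illustrative.

```shell
# Hypothetical helper: start Grafana from the extracted archive.
# The trailing slash makes the glob match only the directory,
# not the downloaded .tar.gz file. The subshell keeps the caller's
# working directory unchanged.
start_grafana() (
  cd grafana-*/ && ./bin/grafana-server
)
```

By default Grafana listens on port 3000 and the initial login is admin / admin.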

Grafana reference configuration

Add data source

[Screenshot: pro.png]

[Screenshot: gra.png]
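
As an alternative to the UI, the data source can also be created through Grafana's HTTP API. The sketch below is not from the original post and rests on several assumptions: Grafana runs at http://localhost:3000 with the default admin / admin login, Prometheus runs at http://localhost:9090, and the data source name "Prometheus" and the helper name are chosen only for illustration.

```shell
# Assumptions: default local addresses for Grafana and Prometheus.
GRAFANA_URL="${GRAFANA_URL:-http://localhost:3000}"
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Hypothetical helper: register Prometheus as a Grafana data source
# by POSTing a JSON definition to /api/datasources with basic auth.
add_prometheus_datasource() {
  curl -s -X POST "${GRAFANA_URL}/api/datasources" \
    -u admin:admin \
    -H 'Content-Type: application/json' \
    -d "{\"name\":\"Prometheus\",\"type\":\"prometheus\",\"url\":\"${PROM_URL}\",\"access\":\"proxy\"}"
}
```

The "proxy" access mode means the Grafana server, not the browser, sends the queries to Prometheus.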

Import dashboard

[Screenshot: import.png]

Dashboard template marketplace

[Screenshot: dash.png]

[Screenshot: dashb.png]

Reference project


Origin juejin.im/post/7087849471849005092