Alertmanager alert routing configuration (part 10)


1. Introduction to the route configuration file

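route:
  group_by: ['alertname']       # labels used to group alerts
  group_wait: 10s               # how long to wait before sending the first notification for a new group
  group_interval: 10s           # how long to wait before notifying about new alerts added to an existing group
  repeat_interval: 10m          # how long to wait before re-sending a notification for alerts that are still firing
  receiver: 'yunwei'            # default receiver
  routes:                       # child routes
  - receiver: 'dba'             # receiver is dba
    group_wait: 10s             # group wait time for this route
    match_re:                   # match label values against a regular expression
      service: mysql|db         # alerts whose service label is mysql or db are sent to the dba mailbox
  - receiver: 'yunwei'          # receiver is yunwei
    group_wait: 10s             # group wait time for this route
    match_re:
      serverity: error          # alerts whose serverity label is error are sent to the yunwei mailbox (label spelled as in the rule file in 3.2.2)

receivers:                      # receiver definitions
- name: 'yunwei'                # receiver name, must match the receiver used in the routes
  email_configs:
  - to: '[email protected]'    # yunwei mailbox address
- name: 'dba'                   # receiver name, must match the receiver used in the routes
  email_configs:
  - to: '[email protected]'    # dba mailbox address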
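
Alertmanager anchors the regular expressions given in match_re, so mysql|db matches a service value that is exactly mysql or db, not any value that merely contains those strings. The following shell sketch imitates that decision for the service label only. The function name pick_receiver is hypothetical, and the script is an illustration of the routing tree above rather than anything Alertmanager itself runs.

```shell
# Rough imitation of the routing tree above (illustration only).
# pick_receiver is a hypothetical helper; $1 is the value of the service label.
pick_receiver() {
  case "$1" in
    # case patterns match the whole string, like an anchored match_re.
    mysql|db) echo dba ;;
    # Any other value ends up at yunwei, either through the second
    # child route or through the top-level default receiver.
    *) echo yunwei ;;
  esac
}

pick_receiver mysql     # prints: dba
pick_receiver db        # prints: dba
pick_receiver mysqld    # prints: yunwei (the anchored pattern does not match "mysqld")
```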

2. Requirements description

Requirement: alerts whose service label is db or mysql are sent to the dba mailbox, and all other alerts are sent to the operations (yunwei) mailbox.

Expected result: alerts about mysql and other databases go to the dba team, and everything else goes to operations.


3. Sending different alerts to different receivers with Alertmanager

3.1. Modify the configuration file

1. Edit the configuration file
[root@prometheus-server /data/AlertManager]# vim AlertManager.yml  
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_from: '[email protected]'
  smtp_auth_username: '[email protected]'
  smtp_auth_password: 'yzjqxhsranbpdijd'
  smtp_require_tls: false

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 10m
  receiver: 'yunwei'
  routes:
  - receiver: 'dba'
    group_wait: 10s
    match_re:
      service: mysql|db
  - receiver: 'yunwei'
    group_wait: 10s
    match_re:
      serverity: error

receivers:
- name: 'yunwei'
  email_configs:
  - to: '[email protected]'
- name: 'dba'
  email_configs:
  - to: '[email protected]'

2. Reload Alertmanager so the changes take effect (SIGHUP makes it re-read the configuration file)
[root@prometheus-server /data/AlertManager]# ps aux | grep alert | grep -v grep | awk '{print $2}' |xargs kill -HUP


Configuration has taken effect
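
The routing can also be checked from the command line without waiting for a real alert. The sketch below assumes amtool (shipped with Alertmanager) is on PATH and that the configuration lives at /data/AlertManager/AlertManager.yml. The function name check_routes is hypothetical. It validates the file with amtool check-config and then asks amtool config routes test which receiver a given label set would reach.

```shell
# Hypothetical helper: validate the Alertmanager config and test two label sets.
# $1 is the config path (assumed default: the file edited above).
check_routes() {
  config="${1:-/data/AlertManager/AlertManager.yml}"
  if ! command -v amtool >/dev/null 2>&1 || [ ! -f "$config" ]; then
    echo "skipped: amtool or $config not available"
    return 0
  fi
  # Stop early if the file does not parse.
  amtool check-config "$config" || return 1
  # Each call prints the receiver(s) the given labels would be routed to.
  amtool config routes test --config.file="$config" service=mysql
  amtool config routes test --config.file="$config" serverity=error
}

check_routes
```

With the configuration from this section, the first test is expected to name the dba receiver and the second the yunwei receiver.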


3.2. Define mysql alarm rules

3.2.1. Enable mysql monitoring in node_exporter

Add the mariadb unit to the systemd collector whitelist in the startup command

[root@192_168_81_220 ~]# vim /usr/lib/systemd/system/node_exporter.service
ExecStart=/data/node_exporter/node_exporter --collector.systemd --collector.systemd.unit-whitelist=(docker|sshd|node_exporter|mariadb).service

[root@192_168_81_220 ~]# systemctl daemon-reload 
[root@192_168_81_220 ~]# systemctl restart node_exporter.service 

Check on the Prometheus page that the mariadb unit metric is now present

node_systemd_unit_state{name="mariadb.service",state="active"}
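
The same check can be made against the exporter directly. This is a minimal sketch that assumes node_exporter listens on its default port 9100 on 192.168.81.220. The helper name mariadb_active is hypothetical. It filters the metrics text for the active-state series of mariadb.service and prints its value.

```shell
# Hypothetical helper: reads node_exporter metrics on stdin and prints the value
# (1 = active, 0 = not active) of the "active" state series for mariadb.service.
mariadb_active() {
  # Match on the series prefix so extra labels after state="active" do not matter.
  grep -F 'node_systemd_unit_state{name="mariadb.service",state="active"' | awk '{print $NF}'
}

# Assumed usage against the monitored host:
#   curl -s http://192.168.81.220:9100/metrics | mariadb_active
```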


3.2.2. Write alarm rules

1. Write the rules
[root@prometheus-server /data/prometheus]# vim rules/hostdown.yml 
groups:
- name: general.rules
  rules:
  - alert: HostDown
    expr: up == 0
    for: 1m
    labels:
      serverity: error
    annotations:
      summary: "Host {{ $labels.instance }} has stopped working"
      description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute!"

  - alert: MysqlServerAbnormal
    expr: up{job="mysql"} == 0
    labels:
      service: mysql
    annotations:
      summary: "mysql on host {{ $labels.instance }} has stopped working"
      description: "The mysql service on {{ $labels.instance }} is down!"

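2. Reload the configuration (the /-/reload endpoint is only available when Prometheus is started with --web.enable-lifecycle)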
[root@prometheus-server /data/prometheus]# curl -XPOST 192.168.81.210:9090/-/reload
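
A broken rule file makes the reload fail, so it can help to validate it first. The sketch below assumes promtool (shipped with Prometheus) is on PATH, that the rule file is /data/prometheus/rules/hostdown.yml, and that Prometheus is reachable at 192.168.81.210:9090 with the lifecycle endpoint enabled. The function name reload_prometheus is hypothetical.

```shell
# Hypothetical helper: check the rule file, then ask Prometheus to reload.
# $1 is the rule file, $2 the Prometheus base URL (both defaults are assumptions).
reload_prometheus() {
  rules="${1:-/data/prometheus/rules/hostdown.yml}"
  url="${2:-http://192.168.81.210:9090}"
  if ! command -v promtool >/dev/null 2>&1 || [ ! -f "$rules" ]; then
    echo "skipped: promtool or $rules not available"
    return 0
  fi
  # Refuse to reload when the rule file does not validate.
  if ! promtool check rules "$rules"; then
    echo "rule file invalid, not reloading"
    return 1
  fi
  # -f makes curl return a non-zero exit status on an HTTP error.
  curl -fsS -X POST "$url/-/reload"
}

reload_prometheus
```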

3.3. Trigger mysql alarm

3.3.1. Stop mysql_exporter on 192.168.81.220

[root@192_168_81_220 ~]# ps aux | grep mysql_exporter | grep -v grep | awk '{print $2}' |xargs kill -9 

3.3.2. The alert has fired


3.3.3. The error alert has been sent to the operations mailbox


3.3.4. The mysql alert has been sent to the dba mailbox



Origin blog.csdn.net/weixin_44953658/article/details/113777167