Implementing enterprise WeChat alerts with Alertmanager (13)

1. Integrating Alertmanager with enterprise WeChat

Alertmanager can deliver alerts by email, to DingTalk (through a webhook), and to WeChat. This article sets up alerts through enterprise WeChat.

2. Enterprise WeChat configuration

2.1. Register an enterprise WeChat


2.2. Create an alarm robot

In Application Management, click Create Application.

2.3. Create a department

After registration, the enterprise already has one default department.

2.4. Record the values needed for configuration

1. Record the enterprise ID

The enterprise ID is shown at the bottom of the My Enterprise page.

ww48f74fc8ed3a07ba


2. Record the department id number

The department ID is 1.


3. Record the robot application's AgentId and Secret

AgentId: 1000003
Secret: j3ocaGJJM7KejlqzBIJ38b6D6t9QhqlIAh7k4fA1cT0

Both values are shown on the robot application's page in Application Management.
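Before wiring these values into Alertmanager, it can help to see how they map onto the enterprise WeChat API. The following is a minimal sketch, not the code Alertmanager itself runs: it only builds the token URL and the message body, and sends nothing. The helper names token_url and text_message are hypothetical, and the credentials in the usage lines are placeholders.

```python
import json
from urllib.parse import urlencode

# Base URL of the enterprise WeChat API (also Alertmanager's default WeChat API URL).
API_BASE = "https://qyapi.weixin.qq.com/cgi-bin/"


def token_url(corp_id: str, secret: str) -> str:
    """URL that exchanges the enterprise ID and application secret for an access token."""
    return API_BASE + "gettoken?" + urlencode({"corpid": corp_id, "corpsecret": secret})


def text_message(agent_id: int, party_id: str, content: str) -> str:
    """JSON body for message/send: a plain-text message addressed to one department."""
    body = {
        "toparty": party_id,          # department ID
        "msgtype": "text",
        "agentid": agent_id,          # application AgentId
        "text": {"content": content},
    }
    return json.dumps(body, ensure_ascii=False)


if __name__ == "__main__":
    # Placeholder credentials; substitute the values recorded in this section.
    print(token_url("CORP_ID", "SECRET"))
    print(text_message(1000003, "1", "test message"))
```

The three inputs correspond to corp_id, api_secret, agent_id, and to_party in the Alertmanager receiver configured later.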

3. Configure Prometheus

3.1. Integrate Alertmanager

1. Edit the configuration file
[root@prometheus-server ~]# vim /data/prometheus/prometheus.yml
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - 192.168.81.210:9093

2. Reload the configuration (the /-/reload endpoint only works when Prometheus was started with --web.enable-lifecycle)
[root@prometheus-server ~]# curl -XPOST 192.168.81.210:9090/-/reload
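
One way to confirm that Prometheus picked up the Alertmanager target after the reload is to query its /api/v1/alertmanagers endpoint. The sketch below is an illustration under assumptions: the function names are hypothetical, the sample response in the usage lines is made up to match the address used in this article, and fetch_alertmanagers needs network access to a running Prometheus.

```python
import json
from urllib.request import urlopen

PROMETHEUS = "http://192.168.81.210:9090"  # address used in this article


def active_alertmanagers(payload: dict) -> list:
    """Return the URLs that Prometheus reports as active Alertmanager endpoints."""
    if payload.get("status") != "success":
        return []
    return [am["url"] for am in payload["data"].get("activeAlertmanagers", [])]


def fetch_alertmanagers(base_url: str = PROMETHEUS) -> list:
    """Query a live Prometheus server; requires network access to it."""
    with urlopen(base_url + "/api/v1/alertmanagers", timeout=5) as resp:
        return active_alertmanagers(json.load(resp))


if __name__ == "__main__":
    # Illustrative response shape, not captured from a real server.
    sample = {
        "status": "success",
        "data": {
            "activeAlertmanagers": [{"url": "http://192.168.81.210:9093/api/v2/alerts"}],
            "droppedAlertmanagers": [],
        },
    }
    print(active_alertmanagers(sample))
```

An empty list after the reload would suggest that the alerting block was not loaded or that the target is unreachable.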

3.2. Write an alert rule for disk usage

1. Add the rule (the file must be matched by rule_files in prometheus.yml)
[root@prometheus-server ~]# vim /data/prometheus/rules/node.yml
groups:
- name: node.rules
  rules:
  - alert: NodeFilessystemUsage
    expr: 100 - (node_filesystem_free_bytes{fstype=~"ext4|xfs",mountpoint="/"} / node_filesystem_size_bytes{fstype=~"ext4|xfs",mountpoint="/"} * 100) > 80
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Host {{ $labels.instance }} : {{ $labels.mountpoint }} disk usage is too high"
      description: "{{ $labels.instance }} : {{ $labels.mountpoint }} disk usage is above 80% (current value: {{ $value }})"

2. Reload the configuration
[root@prometheus-server ~]# curl -XPOST 192.168.81.210:9090/-/reload
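
The arithmetic in the rule expression can be checked by hand. The sketch below mirrors it in Python with made-up byte counts; the function names are hypothetical, and the for: 1m clause (the condition must hold for one minute before the alert fires) is not modeled.

```python
def root_usage_percent(free_bytes: float, size_bytes: float) -> float:
    """Mirror of the rule expression: 100 - (free / size * 100)."""
    return 100 - (free_bytes / size_bytes * 100)


def would_fire(free_bytes: float, size_bytes: float, threshold: float = 80) -> bool:
    """True when usage is strictly above the threshold, like `> 80` in the rule."""
    return root_usage_percent(free_bytes, size_bytes) > threshold


if __name__ == "__main__":
    gib = 1024 ** 3
    # 40 GiB root filesystem with 10 GiB free: 75% used, below the 80% threshold.
    print(root_usage_percent(10 * gib, 40 * gib), would_fire(10 * gib, 40 * gib))
    # 40 GiB root filesystem with 5 GiB free: 87.5% used, above the threshold.
    print(root_usage_percent(5 * gib, 40 * gib), would_fire(5 * gib, 40 * gib))
```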

4. Configure Alertmanager to send WeChat alerts

4.1. Add the WeChat receiver to the main configuration file

Configuration overview

receivers:
- name: 'wechat'                          # receiver name
  wechat_configs:                         # WeChat notification settings
    - corp_id: 'ww48f74fc8ed3a07ba'       # enterprise ID
      to_party: '1'                       # department ID
      agent_id: '1000003'                 # robot application AgentId
      api_secret: 'j3ocaGJJM7KejlqzBIJ38b6D6t9QhqlIAh7k4fA1cT0'   # robot application API secret
      send_resolved: true                 # also notify when the alert is resolved

1. Edit the configuration file
[root@prometheus-server ~]# vim /data/alertmanager/alertmanager.yml
global:
  resolve_timeout: 5m

templates:                                # WeChat alert message template
  - '/data/alertmanager/wechat.tmpl'
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 10m
  receiver: 'wechat'

receivers:
- name: 'wechat'
  wechat_configs:
    - corp_id: 'ww48f74fc8ed3a07ba'
      to_party: '1'
      agent_id: '1000003'
      api_secret: 'j3ocaGJJM7KejlqzBIJ38b6D6t9QhqlIAh7k4fA1cT0'
      send_resolved: true
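
To check the receiver without waiting for a real disk alert, a test alert can be posted straight to Alertmanager's /api/v2/alerts endpoint. This is a sketch under assumptions rather than part of the original setup: the alert name ManualTest, its labels, and both function names are hypothetical, and send needs network access to the Alertmanager address used in this article.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.request import Request, urlopen

ALERTMANAGER = "http://192.168.81.210:9093"  # address used in this article


def build_test_alert(instance: str, minutes: int = 5) -> list:
    """Build a one-alert payload for POST /api/v2/alerts that expires after `minutes`."""
    now = datetime.now(timezone.utc)
    return [{
        "labels": {"alertname": "ManualTest", "severity": "warning", "instance": instance},
        "annotations": {"summary": "manual test alert", "description": "sent by hand"},
        "startsAt": now.isoformat(),
        "endsAt": (now + timedelta(minutes=minutes)).isoformat(),
    }]


def send(alerts: list, base_url: str = ALERTMANAGER) -> int:
    """POST the alerts to a live Alertmanager and return the HTTP status code."""
    req = Request(
        base_url + "/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req, timeout=5) as resp:
        return resp.status


if __name__ == "__main__":
    # Only prints the payload; call send(...) manually against a running Alertmanager.
    print(json.dumps(build_test_alert("test-host:9100"), indent=2))
```

If the route and receiver are correct, the test alert should reach the department in enterprise WeChat after group_wait has elapsed.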

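4.2. Write the WeChat alert message template

[root@prometheus-server ~]# vim /data/alertmanager/wechat.tmpl
{{ define "wechat.default.message" }}
{{ range $i, $alert := .Alerts }}
========Monitoring alert==========
Alert status: {{ .Status }}
Alert severity: {{ $alert.Labels.severity }}
Alert type: {{ $alert.Labels.alertname }}
Alert application: {{ $alert.Annotations.summary }}
Alert host: {{ $alert.Labels.instance }}
Alert details: {{ $alert.Annotations.description }}
Trigger value: {{ $alert.Annotations.value }}
Alert time: {{ $alert.StartsAt.Format "2006-01-02 15:04:05" }}
========end=============
{{ end }}
{{ end }}

The Trigger value line reads a value annotation, which the rule in 3.2 does not define, so that field stays empty unless such an annotation is added to the rule. The current value still appears in the description.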

5. Trigger the disk alert

To trigger the alert, lower the threshold in the rule from 80 to 10, then reload Prometheus as in step 2 of 3.2.

[root@prometheus-server ~]# vim /data/prometheus/rules/node.yml
groups:
- name: node.rules
  rules:
  - alert: NodeFilessystemUsage
    expr: 100 - (node_filesystem_free_bytes{fstype=~"ext4|xfs",mountpoint="/"} / node_filesystem_size_bytes{fstype=~"ext4|xfs",mountpoint="/"} * 100) > 10
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Host {{ $labels.instance }} : {{ $labels.mountpoint }} disk usage is too high"
      description: "{{ $labels.instance }} : {{ $labels.mountpoint }} disk usage is above 80% (current value: {{ $value }})"

The alert has fired and a notification has been sent.

The FIRING state indicates that the problem has occurred and the alert has been sent.

6. View the WeChat alert messages

6.1. Alert when the problem occurs

Because there are three nodes and the route groups alerts by alertname, the alerts for all three nodes arrive in a single message.

An alert status of firing means that the problem has occurred and has not yet been resolved.
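
The reason three nodes produce one message follows from group_by: ['alertname'] in the route. The sketch below imitates that grouping step in plain Python. It is a simplified model, not Alertmanager's implementation; the node names are hypothetical and timing (group_wait, group_interval) is ignored.

```python
from collections import defaultdict


def group_alerts(alerts: list, group_by: list) -> dict:
    """Bucket alerts by the values of the group_by labels; one notification per bucket."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


if __name__ == "__main__":
    # Three hypothetical nodes firing the same alert.
    alerts = [
        {"labels": {"alertname": "NodeFilessystemUsage", "instance": f"node-{n}:9100"}}
        for n in (1, 2, 3)
    ]
    # Grouping by alertname alone: a single bucket holding all three alerts.
    for key, members in group_alerts(alerts, ["alertname"]).items():
        print(key, len(members))
    # Grouping by alertname and instance: one bucket per node.
    print(len(group_alerts(alerts, ["alertname", "instance"])))
```

Adding instance to group_by would therefore split the notification into one message per node.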


6.2. Alert when the problem is resolved

An alert status of resolved means that the problem has been fixed and the system is back to normal.

The message carries the full alert details: status, severity, type, host, description, and time.

Origin blog.csdn.net/weixin_44953658/article/details/113985409