Sending Alertmanager alerts to Enterprise WeChat
1. Integrating Alertmanager with Enterprise WeChat
Alertmanager can deliver alerts by email, to DingTalk (through a webhook), and to WeChat. This article sets up Enterprise WeChat alerts.
2. Enterprise WeChat configuration
2.1. Register an Enterprise WeChat account
2.2. Create an alert robot
In Application Management, click Create Application.
2.3. Create a department
A default department already exists once the Enterprise WeChat account is registered.
2.4. Record the information needed for configuration
1. Record the company ID
Under My Company, the company ID is shown at the bottom of the page.
ww48f74fc8ed3a07ba
2. Record the department ID
The department ID is 1.
3. Record the robot's AgentId and Secret
Open the robot under Application Management to find both values.
AgentId:1000003
Secret:j3ocaGJJM7KejlqzBIJ38b6D6t9QhqlIAh7k4fA1cT0
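3. Configure Prometheus
3.1. Integrate Alertmanager
1. Modify the configuration file
[root@prometheus-server ~]# vim /data/prometheus/prometheus.yml
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - 192.168.81.210:9093
2. Reload the configuration (the /-/reload endpoint is available only when Prometheus is started with --web.enable-lifecycle)
[root@prometheus-server ~]# curl -XPOST 192.168.81.210:9090/-/reload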
3.2. Write an alert rule for disk usage
1. Add the rule
[root@prometheus-server ~]# vim /data/prometheus/rules/node.yml
groups:
- name: node.rules
  rules:
  - alert: NodeFilessystemUsage
    expr: 100 - (node_filesystem_free_bytes{fstype=~"ext4|xfs",mountpoint="/"} / node_filesystem_size_bytes{fstype=~"ext4|xfs",mountpoint="/"} * 100) > 80
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Host {{ $labels.instance }} : {{ $labels.mountpoint }} disk usage is too high"
      description: "{{ $labels.instance }} : {{ $labels.mountpoint }} disk usage exceeds 80% (current value: {{ $value }})"
      value: "{{ $value }}"    # read by the WeChat template as .Annotations.value
2. Reload the configuration
[root@prometheus-server ~]# curl -XPOST 192.168.81.210:9090/-/reload
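The arithmetic in the rule's expression can be checked by hand. The following is a minimal Python sketch of the same calculation, assuming made-up byte counts in place of the node_filesystem_free_bytes and node_filesystem_size_bytes samples. The function names are illustrative and are not part of Prometheus.

```python
def filesystem_usage_percent(free_bytes: float, size_bytes: float) -> float:
    """Mirror the rule's expression: 100 - (free / size * 100)."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    return 100 - (free_bytes / size_bytes * 100)


def exceeds_threshold(free_bytes: float, size_bytes: float, threshold: float = 80) -> bool:
    """Return True when usage is strictly above the threshold, like `> 80` in the rule."""
    return filesystem_usage_percent(free_bytes, size_bytes) > threshold


if __name__ == "__main__":
    GIB = 1024 ** 3
    # Hypothetical root filesystem: 50 GiB in total, 5 GiB free.
    print(filesystem_usage_percent(5 * GIB, 50 * GIB))
    print(exceeds_threshold(5 * GIB, 50 * GIB))
```

Note that the rule also requires the condition to hold for 1m (the for: field) before the alert fires; this sketch only covers the comparison itself.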
4. Configure Alertmanager to send WeChat alerts
4.1. Modify the main configuration file to add the WeChat receiver
Overview of the configuration items
receivers:
- name: 'wechat'    # receiver name
  wechat_configs:    # use the WeChat integration
  - corp_id: 'ww48f74fc8ed3a07ba'    # company ID
    to_party: '1'    # department ID
    agent_id: '1000003'    # robot application ID
    api_secret: 'j3ocaGJJM7KejlqzBIJ38b6D6t9QhqlIAh7k4fA1cT0'    # robot API secret
    send_resolved: true    # also notify when the alert is resolved
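To show where these four values end up, here is a rough Python sketch that builds (but does not send) the two requests a client would make against the Enterprise WeChat HTTP API: one to fetch an access token, one to post a text message to a department. It assumes the public gettoken and message/send endpoints under https://qyapi.weixin.qq.com/cgi-bin/. Alertmanager makes equivalent calls itself, and its exact request format may differ. The helper names and the YOUR_API_SECRET placeholder are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumed base URL of the Enterprise WeChat API.
API_BASE = "https://qyapi.weixin.qq.com/cgi-bin/"


def token_url(corp_id: str, api_secret: str) -> str:
    """URL for fetching an access token from the company ID and robot secret."""
    query = urlencode({"corpid": corp_id, "corpsecret": api_secret})
    return f"{API_BASE}gettoken?{query}"


def text_message(to_party: str, agent_id: str, content: str) -> str:
    """JSON body of a text message addressed to a department."""
    payload = {
        "toparty": to_party,        # department ID
        "agentid": int(agent_id),   # robot application ID
        "msgtype": "text",
        "text": {"content": content},
    }
    return json.dumps(payload, ensure_ascii=False)


if __name__ == "__main__":
    print(token_url("ww48f74fc8ed3a07ba", "YOUR_API_SECRET"))
    print(text_message("1", "1000003", "disk usage is too high"))
```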
1. Modify the configuration file
[root@prometheus-server ~]# vim /data/alertmanager/alertmanager.yml
global:
  resolve_timeout: 5m
templates:    # WeChat alert message templates
- '/data/alertmanager/wechat.tmpl'
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 10m
  receiver: 'wechat'
receivers:
- name: 'wechat'
  wechat_configs:
  - corp_id: 'ww48f74fc8ed3a07ba'
    to_party: '1'
    agent_id: '1000003'
    api_secret: 'j3ocaGJJM7KejlqzBIJ38b6D6t9QhqlIAh7k4fA1cT0'
    send_resolved: true
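The three timing settings in the route interact, so a rough model may help. The Python sketch below assumes a deliberately simplified behaviour: the first notification for a group goes out after group_wait, the group is re-checked every group_interval, and an unchanged group is re-sent once repeat_interval has passed since the last notification. It ignores new or resolved alerts joining the group and any delivery delay, so treat the numbers as approximate. The function name is illustrative.

```python
def notification_times(group_wait, group_interval, repeat_interval, horizon):
    """Seconds at which an unchanged, still-firing group would be notified.

    Simplified model: send once after group_wait, then re-check every
    group_interval and re-send when repeat_interval has elapsed.
    """
    times = []
    t = group_wait
    last_sent = None
    while t <= horizon:
        if last_sent is None or t - last_sent >= repeat_interval:
            times.append(t)
            last_sent = t
        t += group_interval
    return times


if __name__ == "__main__":
    # 10s / 10s / 10m as in the route above; the 25-minute horizon is arbitrary.
    print(notification_times(10, 10, 600, 1500))
```

Under this model the settings above give a first message about 10 seconds after the alert reaches Alertmanager and a repeat roughly every 10 minutes while it keeps firing.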
4.2. Write the WeChat alert message template
[root@prometheus-server ~]# vim /data/alertmanager/wechat.tmpl
{{ define "wechat.default.message" }}
{{ range $i, $alert := .Alerts }}
========Monitoring alert==========
Alert status: {{ .Status }}
Alert level: {{ $alert.Labels.severity }}
Alert type: {{ $alert.Labels.alertname }}
Alert application: {{ $alert.Annotations.summary }}
Alert host: {{ $alert.Labels.instance }}
Alert details: {{ $alert.Annotations.description }}
Trigger threshold: {{ $alert.Annotations.value }}
Alert time: {{ $alert.StartsAt.Format "2006-01-02 15:04:05" }}
========end=============
{{ end }}
{{ end }}
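To preview what a rendered message looks like without firing a real alert, the following Python sketch fills the same fields from a sample alert. It is an approximation written with str.format, not Alertmanager's Go template engine. The field labels are an English rendering of the template's labels, and the sample alert is invented for illustration.

```python
from datetime import datetime

# One block per alert, mirroring the fields the template prints.
MESSAGE = (
    "========Monitoring alert==========\n"
    "Alert status: {status}\n"
    "Alert level: {severity}\n"
    "Alert type: {alertname}\n"
    "Alert application: {summary}\n"
    "Alert host: {instance}\n"
    "Alert details: {description}\n"
    "Trigger threshold: {value}\n"
    "Alert time: {starts_at}\n"
    "========end============="
)


def render_alert(alert: dict) -> str:
    """Render one alert; labels or annotations that are not set come out empty."""
    labels = alert.get("labels", {})
    annotations = alert.get("annotations", {})
    return MESSAGE.format(
        status=alert.get("status", ""),
        severity=labels.get("severity", ""),
        alertname=labels.get("alertname", ""),
        summary=annotations.get("summary", ""),
        instance=labels.get("instance", ""),
        description=annotations.get("description", ""),
        value=annotations.get("value", ""),
        # Go's "2006-01-02 15:04:05" layout corresponds to this strftime pattern.
        starts_at=alert["starts_at"].strftime("%Y-%m-%d %H:%M:%S"),
    )


if __name__ == "__main__":
    sample = {  # hypothetical alert, for illustration only
        "status": "firing",
        "labels": {"alertname": "NodeFilessystemUsage", "instance": "192.168.81.210:9100"},
        "annotations": {"summary": "disk usage is too high"},
        "starts_at": datetime(2021, 1, 1, 12, 0, 0),
    }
    print(render_alert(sample))
```

The sample deliberately omits the severity label and the value annotation: those lines then print with nothing after the colon, which is a quick way to spot a label name in the rule that does not match the name the template reads.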
5. Trigger the disk alert
To trigger an alert, lower the threshold in the rule written earlier from 80 to 10.
[root@prometheus-server ~]# vim /data/prometheus/rules/node.yml
groups:
- name: node.rules
  rules:
  - alert: NodeFilessystemUsage
    expr: 100 - (node_filesystem_free_bytes{fstype=~"ext4|xfs",mountpoint="/"} / node_filesystem_size_bytes{fstype=~"ext4|xfs",mountpoint="/"} * 100) > 10
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Host {{ $labels.instance }} : {{ $labels.mountpoint }} disk usage is too high"
      description: "{{ $labels.instance }} : {{ $labels.mountpoint }} disk usage exceeds 80% (current value: {{ $value }})"
      value: "{{ $value }}"    # read by the WeChat template as .Annotations.value
The rule has been triggered and an alert has been sent.
The FIRING state indicates that the problem has occurred and the alert has been sent.
6. View WeChat alert messages
6.1. Alert when a problem occurs
Because there are three nodes and alerts are grouped by alertname, the alerts from all three arrive in the same message.
An alert status of firing means that the problem has occurred and has not yet been dealt with.
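The reason the three nodes share one message is the group_by: ['alertname'] setting in the route. A minimal Python sketch of that grouping idea follows, assuming three invented instance names; it is a simplified model for illustration and not Alertmanager's actual implementation.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alerts by the values of the labels listed in group_by."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical alerts: the same rule firing on three nodes.
    alerts = [
        {"labels": {"alertname": "NodeFilessystemUsage", "instance": f"node-{n}:9100"}}
        for n in (1, 2, 3)
    ]
    for key, members in group_alerts(alerts, ["alertname"]).items():
        print(key, len(members))
```

Grouping by ['alertname', 'instance'] instead would put each node in its own group, and so in its own message.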
6.2. Alert when the problem is resolved
An alert status of resolved means that the problem has been fixed and the system has returned to normal.
The alert message is detailed and covers every field defined in the template.