Sending Prometheus Alerts to DingTalk

I. Create the DingTalk alert robot
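1. In the DingTalk desktop (PC) client, open the group that should receive alerts and click the group settings icon in the upper-right corner.

2. In the group settings panel, click Smart Group Assistant (智能群助手).

3. In the Smart Group Assistant panel, click Add Robot.

4. In the Group Robot dialog, click the + icon in the Add Robot area, then select Custom (添加自定义).

5. In the Robot Details dialog, click Add.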

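6. In the Add Robot dialog, set the robot's avatar and name, select at least one security setting, tick "I have read and agree to the Custom Robot Service and Disclaimer Terms", and click Finish. Copy the generated webhook URL (and the secret, if you chose the signature option); both are needed in the plugin configuration below.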
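If the signature (加签) option is selected in the security settings, a direct request to the robot webhook has to carry `timestamp` and `sign` query parameters. The plugin configured later in this article computes these itself once `secret` is set, so the following is only a sketch for testing the robot by hand. It assumes `openssl` and `base64` are installed; the function names, `$SECRET`, and `$TOKEN` are hypothetical placeholders.

```shell
# Compute the robot signature for a timestamp and secret.
# The string to sign is "<timestamp>\n<secret>"; it is HMAC-SHA256 signed
# with the secret as key and then Base64-encoded.
dingtalk_sign() {
  # $1 = timestamp in milliseconds, $2 = robot secret
  printf '%s\n%s' "$1" "$2" \
    | openssl dgst -sha256 -hmac "$2" -binary \
    | base64
}

# Percent-encode the Base64 characters that are unsafe in a query string.
urlencode_b64() {
  printf '%s' "$1" | sed -e 's/+/%2B/g' -e 's#/#%2F#g' -e 's/=/%3D/g'
}

# Usage (SECRET and TOKEN are placeholders for your own values):
#   ts=$(( $(date +%s) * 1000 ))
#   sign=$(urlencode_b64 "$(dingtalk_sign "$ts" "$SECRET")")
#   curl -H 'Content-Type: application/json' \
#     -d '{"msgtype":"text","text":{"content":"robot test"}}' \
#     "https://oapi.dingtalk.com/robot/send?access_token=$TOKEN&timestamp=$ts&sign=$sign"
```

If the robot uses the keyword or IP allowlist option instead, no signature is needed and the plain webhook URL can be called directly.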

II. Install the DingTalk alert plugin

1. Download the plugin
Releases page: https://github.com/timonwong/prometheus-webhook-dingtalk/releases/
wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v2.1.0/prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz
2. Install
tar -xvf prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz -C /usr/local
cd /usr/local
mv prometheus-webhook-dingtalk-2.1.0.linux-amd64 prometheus-webhook-dingtalk

3. Edit the plugin configuration file (config.yml, the file passed to --config.file when the plugin is started)

## Request timeout
# timeout: 5s

## Uncomment following line in order to write template from scratch (be careful!)
#no_builtin_template: true

## Customizable templates path
templates:
  - contrib/templates/*.tmpl # point this at the template file(s) you created

## You can also override default template using `default_message`
## The following example to use the 'legacy' template from v0.3.0
#default_message:
#  title: '{{ template "legacy.title" . }}'
#  text: '{{ template "legacy.content" . }}'

## Targets, previously was known as "profiles"
targets:
  webhook1:
    # Webhook URL of the DingTalk robot
    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    # secret for signature: the value shown when the signature (加签) security option is enabled
    secret: xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
#  webhook2:
#    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
#  webhook_legacy:
#    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
#    # Customize template content
#    message:
#      # Use legacy template
#      title: '{{ template "legacy.title" . }}'
#      text: '{{ template "legacy.content" . }}'
#  webhook_mention_all:
#    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
#    mention:
#      all: true
#  webhook_mention_users:
#    url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
#    mention:
#      mobiles: ['156xxxx8827', '189xxxx8325']

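DingTalk alert template (used by the DingTalk plugin; save it where the templates glob in the plugin configuration can find it)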

{{ define "__subject" }}
[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}]
{{ end }}


{{ define "__alert_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}

**Alert name**: {{ index .Annotations "title" }}

**Severity**: {{ .Labels.severity }}

**Host**: {{ .Labels.instance }}

**Details**: {{ index .Annotations "description" }}

**Started at**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}

{{ define "__resolved_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}

**Alert name**: {{ index .Annotations "title" }}

**Severity**: {{ .Labels.severity }}

**Host**: {{ .Labels.instance }}

**Details**: {{ index .Annotations "description" }}

**Started at**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}

**Resolved at**: {{ dateInZone "2006.01.02 15:04:05" (.EndsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}


{{ define "default.title" }}
{{ template "__subject" . }}
{{ end }}

{{ define "default.content" }}
{{ if gt (len .Alerts.Firing) 0 }}
**==== {{ .Alerts.Firing | len }} alert(s) firing ====**
{{ template "__alert_list" .Alerts.Firing }}
---
{{ end }}

{{ if gt (len .Alerts.Resolved) 0 }}
**==== {{ .Alerts.Resolved | len }} alert(s) resolved ====**
{{ template "__resolved_list" .Alerts.Resolved }}
{{ end }}
{{ end }}


{{ define "ding.link.title" }}{{ template "default.title" . }}{{ end }}
{{ define "ding.link.content" }}{{ template "default.content" . }}{{ end }}
{{ template "default.title" . }}
{{ template "default.content" . }}

4. Start the DingTalk plugin

cd /usr/local/prometheus-webhook-dingtalk
./prometheus-webhook-dingtalk --config.file=config.yml >dingtalk.log 2>&1 &

# The plugin listens on port 8060 by default
netstat -ntlp|grep 8060
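A process started in the background with `&` does not come back after a reboot. One possible alternative is to run the plugin under systemd. The sketch below only prints a unit file for a given install directory; the function name and the unit file name are made up for illustration, and the unit contents are a suggestion rather than something shipped with the plugin.

```shell
# Print a systemd unit for the plugin installed in directory $1.
render_dingtalk_unit() {
  dir="$1"
  cat <<EOF
[Unit]
Description=prometheus-webhook-dingtalk
After=network-online.target

[Service]
# The templates glob in config.yml is relative, so run from the install directory.
WorkingDirectory=$dir
ExecStart=$dir/prometheus-webhook-dingtalk --config.file=$dir/config.yml
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (as root):
#   render_dingtalk_unit /usr/local/prometheus-webhook-dingtalk \
#     > /etc/systemd/system/prometheus-webhook-dingtalk.service
#   systemctl daemon-reload
#   systemctl enable --now prometheus-webhook-dingtalk
```

Setting `WorkingDirectory` matters here because the configuration refers to the templates with a relative path.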

III. Edit the Alertmanager configuration file (add the DingTalk channel)

vi /usr/local/alertmanager/alertmanager.yml

1. Add the following to the routes section

# A regular expression routes alerts whose name starts with Mysql, or whose name is exactly Memory Usage, to dingding.webhook1

  routes:
  - receiver: 'dingding.webhook1'
    match_re:
      alertname: "Mysql.*|Memory Usage"
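Alertmanager anchors `match_re` expressions, so the pattern has to match the whole alert name, not just part of it. A rough way to preview which names this route would catch is `grep -x` with the same pattern. This is only an approximation (grep's extended regex dialect is not the same as Go's, although the two agree for a simple pattern like this one), and the function name is hypothetical.

```shell
# Return 0 if the alert name in $1 matches the route's regex as a whole string.
route_matches() {
  printf '%s\n' "$1" | grep -qxE 'Mysql.*|Memory Usage'
}

# Usage:
#   route_matches "MysqlDown" && echo "routed to dingding.webhook1"
#   route_matches "High Memory Usage" || echo "falls through to the default receiver"
```

Matching is case-sensitive, so an alert named `mysql_down` would not take this route.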

2. Add the following to receivers

receivers:
- name: 'dingding.webhook1'
  webhook_configs:
  - url: 'http://119.8.238.94:8060/dingtalk/webhook1/send' # webhook1 must match the target name defined under targets in the plugin configuration file
    send_resolved: true
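The target name in the receiver URL and the key under `targets` in the plugin configuration have to agree. As a small sanity check, the sketch below builds the URL only if the target appears as an uncommented key in the plugin's config file. It uses a naive `grep` rather than a YAML parser, and the function name is an assumption for illustration.

```shell
# Print the Alertmanager webhook URL for a plugin target, or fail if the
# target is not an uncommented key in the plugin config.
dingtalk_send_url() {
  # $1 = plugin config file, $2 = host:port of the plugin, $3 = target name
  if grep -qE "^[[:space:]]+$3:[[:space:]]*\$" "$1"; then
    printf 'http://%s/dingtalk/%s/send\n' "$2" "$3"
  else
    echo "target '$3' not found in $1" >&2
    return 1
  fi
}

# Usage:
#   dingtalk_send_url /usr/local/prometheus-webhook-dingtalk/config.yml 119.8.238.94:8060 webhook1
```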

3. Reload the Alertmanager configuration
curl -v -X POST http://localhost:9093/-/reload
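It can help to validate the edited file before reloading. The sketch below wraps the reload so that it first runs `amtool check-config` (amtool ships with Alertmanager) when the tool is on the PATH. The function name is hypothetical; the paths and URL in the usage line are the ones used in this article.

```shell
# Validate the config with amtool (if installed), then ask Alertmanager to reload.
reload_alertmanager() {
  # $1 = path to alertmanager.yml, $2 = Alertmanager base URL
  if command -v amtool >/dev/null 2>&1; then
    amtool check-config "$1" || {
      echo "config check failed, not reloading" >&2
      return 1
    }
  fi
  curl -fsS -X POST "$2/-/reload"
}

# Usage:
#   reload_alertmanager /usr/local/alertmanager/alertmanager.yml http://localhost:9093
```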

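4. Test the alerts
1) Check that the alert has fired
http://119.8.238.94:9090/alerts?search=

2) Check that the alert was sent through the intended channel

# The alert was delivered through DingTalk. (The default receiver in this setup is email, but alerts whose name starts with Mysql, or is exactly Memory Usage, are sent through dingding.webhook1.)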

[Screenshot: alert notification in DingTalk]

[Screenshot: recovery notification in DingTalk]
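If no real alert is firing, the routing can also be exercised by posting a hand-made alert to Alertmanager's v2 API. The sketch below only builds the JSON payload. The function name, the `test-host` instance, and the annotation texts are made up, and the labels are chosen to match the fields the template above prints. The alert name is inserted as-is, so it must not contain double quotes or backslashes.

```shell
# Print a minimal Alertmanager v2 alert payload with the alert name in $1.
test_alert_payload() {
  printf '[{"labels":{"alertname":"%s","severity":"warning","instance":"test-host"},' "$1"
  printf '"annotations":{"title":"Test alert","description":"Manual routing test"}}]\n'
}

# Usage:
#   curl -fsS -H 'Content-Type: application/json' \
#     -d "$(test_alert_payload 'Memory Usage')" \
#     http://localhost:9093/api/v2/alerts
```

Because the payload sets no end time, Alertmanager should treat the alert as resolved once its resolve timeout passes without the alert being sent again, which also exercises the recovery message.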

Reposted from blog.csdn.net/shaochenshuo/article/details/126700256