Prometheus: custom email alerts and custom WeChat robot alerts

Contents

————————————————————————————

I. Custom email alerts

II. Deploying WeChat robot alerts with Docker

1. Build the image

2. Start the container and point Alertmanager at the webhook


————————————————————————————

I. Custom email alerts

In the Alertmanager configuration file, reference the custom alert template file.

~]# vim alertmanager.yaml
global:
   resolve_timeout: 5m
   smtp_smarthost: 'smtp.qq.com:465'                                                      
   smtp_from: '[email protected]'                                                          
   smtp_auth_username: '[email protected]'                                                 
   smtp_auth_password: 'lxxxxxxxxtubdfd'                                                 
   smtp_require_tls: false                                                                
   smtp_hello: 'qq.com'

templates:
   - '/etc/alertmanager/template/*.tmpl'

route:
   group_by: ['alertname']
   group_wait: 10s
   group_interval: 10s
   repeat_interval: 1m
   receiver: 'email'

receivers:
   - name: 'email'
     email_configs:                                                                          
     - to: '[email protected]'                                                               
       html: '{{ template "email.to.html" . }}'
       headers: { Subject: "Alert" }
       send_resolved: true    

inhibit_rules:
   - source_match:
       severity: 'critical'
     target_match:
       severity: 'warning'
     equal: ['alertname', 'dev', 'instance']
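Before relying on the SMTP settings above, it can help to confirm that the account and authorization code actually work outside Alertmanager. The following is a minimal sketch, not part of the original setup: the function names and the example addresses are hypothetical, and it assumes the server accepts implicit TLS on port 465 as the configuration suggests.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender: str, recipient: str) -> EmailMessage:
    """Build a small plain-text message for an SMTP smoke test."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Alertmanager SMTP test"
    msg.set_content("If you can read this, the SMTP settings work.")
    return msg


def send_test_message(host: str, port: int, username: str, password: str,
                      recipient: str) -> None:
    """Send the test message over implicit TLS (what port 465 expects)."""
    msg = build_test_message(username, recipient)
    with smtplib.SMTP_SSL(host, port, timeout=10) as server:
        server.login(username, password)
        server.send_message(msg)


# Hypothetical usage; replace the placeholders with your own account details:
# send_test_message("smtp.qq.com", 465, "sender@example.com",
#                   "authorization-code", "ops@example.com")
```

If the call raises an authentication error, the same credentials will most likely fail in Alertmanager as well.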

Write the custom template file

{{ define "email.to.html" }}
{{/* 28800e9 ns = 8 h, shifting UTC to UTC+8; the layout string is Go's reference time */}}
{{ if gt (len .Alerts.Firing) 0 }}
{{ range .Alerts.Firing }}
======== Alert firing ========== <br>
Alert source: Alertmanager <br>
Alert host: {{ .Annotations.summary }} <br>
Alert type: {{ .Annotations.alarmPolicyType }} <br>
Severity: {{ .Labels.severity }} <br>
Status: {{ .Status }} <br>
Details: {{ .Annotations.description }} <br>
Started at: {{ (.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }} <br>
==========end============= <br>
{{ end }}
{{ end }}
{{ if gt (len .Alerts.Resolved) 0 }}
{{ range .Alerts.Resolved }}
======== <span style="color:#00FF00;font-size:11px;font-weight:bold;"> Alert resolved </span>==========<br>
Alert source: Alertmanager <br>
Alert host: {{ .Annotations.summary }} <br>
Alert type: {{ .Annotations.alarmPolicyType }} <br>
Severity: {{ .Labels.severity }} <br>
Status: {{ .Status }} <br>
Details: {{ .Annotations.description }} <br>
Started at: {{ (.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }} <br>
Resolved at: {{ (.EndsAt.Add 28800e9).Format "2006-01-02 15:04:05" }} <br>
===========end============ <br>
{{ end }}
{{ end }}
{{ end }}
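The template shifts each timestamp by 28800e9 before printing it. Alertmanager durations are counted in nanoseconds, so this value is 8 hours, which moves UTC to China Standard Time. The arithmetic can be checked with a small Python sketch; the function name is hypothetical and the sample timestamp is assumed to carry no fractional seconds.

```python
from datetime import datetime, timedelta

OFFSET_NS = 28800e9  # the value passed to .Add in the template


def shift_and_format(utc_timestamp: str) -> str:
    """Shift an RFC 3339 UTC timestamp by OFFSET_NS and format it."""
    # fromisoformat does not accept a trailing "Z" on older Pythons.
    parsed = datetime.fromisoformat(utc_timestamp.replace("Z", "+00:00"))
    shifted = parsed + timedelta(seconds=OFFSET_NS / 1e9)
    return shifted.strftime("%Y-%m-%d %H:%M:%S")


print(OFFSET_NS / 1e9 / 3600)                    # 8.0 (hours)
print(shift_and_format("2022-08-11T09:30:01Z"))  # 2022-08-11 17:30:01
```

Deployments in another time zone would need a different offset.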

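Note: the labels and annotations referenced in the template must match the built-in labels and the custom labels and annotations defined in your own alert rules.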
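One way to catch a mismatch early is to compare an alert against the fields the template reads. The sketch below is only an illustration: the field lists mirror the template above, while the function name and the sample alert (HighCPU, node-1) are made up.

```python
# Fields the email template reads from each alert.
REQUIRED_LABELS = ("severity",)
REQUIRED_ANNOTATIONS = ("summary", "alarmPolicyType", "description")


def missing_fields(alert: dict) -> list:
    """Return the template fields that the given alert does not define."""
    labels = alert.get("labels", {})
    annotations = alert.get("annotations", {})
    missing = [f"labels.{key}" for key in REQUIRED_LABELS if key not in labels]
    missing += [f"annotations.{key}" for key in REQUIRED_ANNOTATIONS
                if key not in annotations]
    return missing


# Hypothetical alert whose rule forgot the alarmPolicyType annotation.
sample = {
    "labels": {"alertname": "HighCPU", "severity": "warning"},
    "annotations": {"summary": "node-1", "description": "CPU above 90%"},
}
print(missing_fields(sample))  # ['annotations.alarmPolicyType']
```

A field reported as missing would render as an empty value in the email.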


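II. Deploying WeChat robot alerts with Docker

Note: the custom alerts here are forwarded through a webhook.

1. Build the image

1) Prepare the webhook server code

Note: the labels and annotations used in the code must match those in your own configuration.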

~]# vim app.py 
# -*- coding: utf-8 -*-
import os
import json
import requests
import arrow

from flask import Flask
from flask import request

app = Flask(__name__)

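def bytes2json(data_bytes):
    # Alertmanager posts valid JSON, so decode and parse it directly;
    # rewriting quotes would corrupt any value that contains an apostrophe.
    return json.loads(data_bytes.decode('utf8'))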

def makealertdata(data):
    for output in data['alerts'][:]:
        try:
            pod_name = output['labels']['pod']
        except KeyError:
            try:
                pod_name = output['labels']['pod_name']
            except KeyError:
                pod_name = 'null'

        try:
            namespace = output['labels']['namespace']
        except KeyError:
            namespace = 'null'

        try:
            message = output['annotations']['message']
        except KeyError:
            try:
                message = output['annotations']['description']
            except KeyError:
                message = 'null'
        if output['status'] == 'firing':
            status_zh = '<font color="warning">Firing</font>'
            title = '[%s] New %s alert' % (status_zh, output['annotations']['alarmPolicyType'])
            send_data = {
                "msgtype": "markdown",
                "markdown": {
                    "content": "## %s \n\n" % title +
                            ">**Severity**: %s \n\n" % output['labels']['severity'] +
                            ">**Alert type**: %s \n\n" % output['annotations']['metricDisplayName'] +
                            ">**Alert host**: %s \n\n" % output['annotations']['summary'] +
                            ">**Owner**: %s \n\n" % "<@v_keo>" +
                            ">**Details**: %s \n\n" % message +
                            ">**Status**: %s \n\n" % output['status'] +
                            ">**Started at**: %s \n\n" % arrow.get(output['startsAt']).to('Asia/Shanghai').format(
                        'YYYY-MM-DD HH:mm:ss ZZ')
                }
            }
        elif output['status'] == 'resolved':
            status_zh = '<font color="info">Resolved</font>'
            title = '[%s] %s alert resolved' % (status_zh, output['annotations']['alarmPolicyType'])
            send_data = {
                "msgtype": "markdown",
                "markdown": {
                    "content": "## %s \n\n" % title +
                            ">**Severity**: %s \n\n" % output['labels']['severity'] +
                            ">**Alert type**: %s \n\n" % output['annotations']['metricDisplayName'] +
                            ">**Alert host**: %s \n\n" % output['annotations']['summary'] +
                            ">**Owner**: %s \n\n" % "<@v_keo>" +
                            ">**Details**: %s \n\n" % message +
                            ">**Status**: %s \n\n" % output['status'] +
                            ">**Started at**: %s \n\n" % arrow.get(output['startsAt']).to('Asia/Shanghai').format(
                        'YYYY-MM-DD HH:mm:ss ZZ') +
                            ">**Ended at**: %s \n" % arrow.get(output['endsAt']).to('Asia/Shanghai').format(
                        'YYYY-MM-DD HH:mm:ss ZZ')
                }
            }
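        else:
            # Alertmanager only reports "firing" or "resolved"; skip anything else.
            continue
        # Yield one message per alert so a grouped notification
        # is not cut down to its first alert.
        yield send_data


def send_alert(data):
    token = os.getenv('ROBOT_TOKEN')
    if not token:
        print('you must set ROBOT_TOKEN env')
        return
    url = 'https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=%s' % token

    # makealertdata yields one WeChat message per alert in the payload.
    for send_data in makealertdata(data):
        req = requests.post(url, json=send_data)
        result = req.json()
        if result['errcode'] != 0:
            print('notify wechat error: %s' % result['errcode'])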


@app.route('/', methods=['POST', 'GET'])
def send():
    if request.method == 'POST':
        post_data = request.get_data()
        send_alert(bytes2json(post_data))
        return 'success'
    else:
        return 'welcome to use prometheus alertmanager wechat webhook server!'


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
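To try the server without waiting for a real alert, you can post a hand-made payload to it. The sketch below builds one in the shape Alertmanager's webhook receiver typically sends (version 4 format). Every value in it, such as HighCPU, node-1 and demo-pod, is hypothetical, and only the fields that app.py reads are meant to be realistic.

```python
import json


def sample_alert(status: str = "firing") -> dict:
    """Build one alert carrying the labels and annotations app.py reads."""
    return {
        "status": status,
        "labels": {
            "alertname": "HighCPU",        # hypothetical alert name
            "severity": "warning",
            "namespace": "default",
            "pod": "demo-pod",             # hypothetical pod name
        },
        "annotations": {
            "summary": "node-1",           # hypothetical host
            "description": "CPU usage above 90%",
            "alarmPolicyType": "host",     # hypothetical value
            "metricDisplayName": "CPU usage",
        },
        "startsAt": "2022-08-11T09:30:01Z",
        # A still-firing alert typically carries a zero end time.
        "endsAt": ("2022-08-11T09:35:01Z" if status == "resolved"
                   else "0001-01-01T00:00:00Z"),
    }


def sample_payload(status: str = "firing") -> dict:
    """Wrap a single alert in a webhook-style envelope."""
    alert = sample_alert(status)
    return {
        "version": "4",
        "status": status,
        "receiver": "web.hook",
        "groupLabels": {"alertname": alert["labels"]["alertname"]},
        "commonLabels": alert["labels"],
        "commonAnnotations": alert["annotations"],
        "alerts": [alert],
    }


print(json.dumps(sample_payload(), indent=2))
```

Saving the output as payload.json and running `curl -X POST -H 'Content-Type: application/json' --data @payload.json http://localhost:5000/` should make the robot post one message, assuming the server runs locally on port 5000.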

2) Prepare the Python requirements file

~]# vim requirements.txt 
certifi==2018.10.15
chardet==3.0.4
Click==7.0
Flask==1.0.2
idna==2.7
itsdangerous==1.1.0
Jinja2==2.10
MarkupSafe==1.1.0
requests==2.20.1
urllib3==1.24.1
Werkzeug==0.14.1
arrow==0.13.1

3) Write the Dockerfile

~]# vim Dockerfile 
FROM python:3.6.4
# set working directory
WORKDIR /src
# add app
ADD . /src
# install requirements
RUN pip install selectivesearch -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
RUN pip install -r requirements.txt
EXPOSE 5000
# run server
CMD python app.py

4) Build the image

~]# docker build -f /data/prometheus/Dockerfile2/webkook/Dockerfile -t  webhook:v1 . --network=host
# -f: path to the Dockerfile

2. Start the container and point Alertmanager at the webhook

1) Start the container

~]# docker run  --name=webhook  --net=host -v /etc/localtime:/etc/localtime -v /data/prometheus/Dockerfile2/webkook/app.py:/src/app.py  -d  -e ROBOT_TOKEN=eb25f0c4-69ac-458c-af07-a6999beb05cd webhook:v1

# -v /etc/localtime: keeps the container's time consistent with the host
# -e ROBOT_TOKEN=: the key of the WeChat robot webhook
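It may be worth checking that the robot key is valid before debugging the container itself. The sketch below only prepares the request: the endpoint is the one app.py uses, the plain-text message body follows the robot's text message format, and the function name and the placeholder key are hypothetical.

```python
# Same endpoint that app.py posts to.
WEBHOOK_URL = "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=%s"


def build_test_request(token: str, text: str = "webhook key test"):
    """Return the (url, json_body) pair for a plain-text test message."""
    if not token:
        raise ValueError("ROBOT_TOKEN must not be empty")
    url = WEBHOOK_URL % token
    body = {"msgtype": "text", "text": {"content": text}}
    return url, body


url, body = build_test_request("your-robot-key")  # placeholder key
print(url)
print(body)
```

Sending it with `requests.post(url, json=body)` and reading `errcode` from the JSON response, as app.py does, shows whether the key is accepted (0 means success).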

2) Point the alert configuration at the webhook

~]# cat alertmanager.yaml-wx 

global:
       resolve_timeout: 5m

route:
      group_by: ['alertname']
      group_wait: 10s
      group_interval: 10s
      repeat_interval: 1m
      receiver: 'web.hook'

receivers:
    - name: 'web.hook'
      webhook_configs:
      - url: 'http://<container-host-ip>:5000'
        send_resolved: true
inhibit_rules:
      - source_match:
           severity: 'critical'
        target_match:
           severity: 'warning'
        equal: ['alertname', 'dev', 'instance']

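Note: if alerts are not forwarded, make sure the container port is reachable. When the container is not started with host networking, publish the port, for example with -p 5000:5000.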
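A quick way to tell a networking problem from an application problem is to test whether the webhook port accepts TCP connections from the Alertmanager host. This is a minimal sketch using only the standard library; the function name and the example address are hypothetical.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Hypothetical usage, run from the machine where Alertmanager lives:
# print(port_open("192.0.2.10", 5000))
```

A False result points at port publishing, firewall rules or the wrong IP rather than at app.py.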


Reposted from blog.csdn.net/kali_yao/article/details/126412479