k8s Prometheus alert rules and Alertmanager alert configuration

For work reasons, our company deploys Prometheus with the kube-prometheus package instead of installing it from binaries. This saves a lot of trouble, but we still ran into quite a few pitfalls during testing, so I am recording them here.
1. Writing Prometheus rules
Most of these rules can be found online; here I only record how to write rules when using kube-prometheus.
Edit prometheus-rules.yaml (after cloning the kube-prometheus code, this file is in the /manifests directory) and append the rules to the bottom of the file. For now it is enough to detect the running status of Pods and the status of nodes; other rules are written the same way. Add the following content:
    # Detect pod status
    - alert: pod-status
      annotations:
        message: pod-{{ $labels.pod }} failure
      expr: |
        kube_pod_container_status_running != 1
      for: 1m
      labels:
        severity: warning

    # Detect node status
    - alert: node-status
      annotations:
        message: node-{{ $labels.hostname }} failure
      expr: |
        kube_node_status_condition{status="unknown",condition="Ready"} == 1
      for: 1m
      labels:
        severity: warning
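To make the effect of `for: 1m` in the rules above more concrete, here is a minimal Python sketch of how an alert moves from inactive to pending to firing. It is a simplified model written for this post, not Prometheus code; the function name `alert_states`, the 30-second evaluation spacing, and the sample values are all assumptions made for illustration.

```python
from datetime import timedelta


def alert_states(samples, hold=timedelta(minutes=1)):
    """Return the alert state after each rule evaluation.

    samples: list of (seconds_since_start, value) pairs for one container,
             where value is what kube_pod_container_status_running reports.
    hold:    the rule's `for` duration.
    """
    states = []
    active_since = None  # time at which the expression first became true
    for t, value in samples:
        if value != 1:  # mirrors: kube_pod_container_status_running != 1
            if active_since is None:
                active_since = t
            held = timedelta(seconds=t - active_since)
            states.append("firing" if held >= hold else "pending")
        else:
            active_since = None  # condition cleared, so the timer resets
            states.append("inactive")
    return states


# Hypothetical evaluations: running, not running at three consecutive
# evaluations, then running again.
samples = [(0, 1), (30, 0), (60, 0), (90, 0), (120, 1)]
print(alert_states(samples))
```

In this model the alert only fires once the condition has held for the full minute, which is why a container that restarts quickly does not produce a notification.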
Finally, save and exit.
Then edit the alertmanager-secret.yaml file, which mainly configures how alerts are sent: by email or by DingTalk. I use DingTalk alerts here. Email is configured as well but not used, because hardly anyone reads email these days, so alerts go straight to DingTalk. Replace the original content of the file with the following:
apiVersion: v1
data: {}
kind: Secret
metadata:
  name: alertmanager-main
  namespace: monitoring
stringData:
  alertmanager.yaml: |-
    global:
      resolve_timeout: 1m # Processing timeout
      smtp_smarthost: 'smtp.9icaishi.net:25' # SMTP server used to send email
      smtp_from: '[email protected]' # Sender address
      smtp_auth_username: '[email protected]' # Email account name
      smtp_auth_password: 'Zabbix9icaishi2015' # Authorization password
      smtp_require_tls: false # Disable TLS (it is enabled by default)
    receivers:
    - name: 'webhook'
      webhook_configs:
      # DingTalk alert endpoint. It is deployed separately further down, because
      # DingTalk cannot recognize the default alert content sent by Alertmanager,
      # so it has to be converted first.
      - url: 'http://webhook-dingtalk/dingtalk/send/'
        send_resolved: true
    route:
      group_interval: 1m # How long to wait before notifying about new alerts in a group
      group_wait: 10s # How long to wait before sending the first notification for a group
      receiver: webhook
      repeat_interval: 1m # How often a still-firing alert is sent again
type: Opaque
Finally, save and exit.
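The comment on the webhook URL above says that Alertmanager's default alert content has to be converted before DingTalk can use it. As a rough illustration of what that conversion involves, the Python sketch below takes a trimmed notification of the kind Alertmanager posts to a webhook receiver and turns its first alert into a DingTalk markdown message. All label and annotation values are invented, a real notification carries more fields than shown, and the helper name `to_dingtalk` is hypothetical.

```python
import json

# A trimmed example of the JSON body Alertmanager POSTs to a webhook receiver.
# Every value here is invented for illustration.
sample_notification = {
    "status": "firing",
    "receiver": "webhook",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "pod-status", "severity": "warning", "pod": "demo-pod"},
            "annotations": {"message": "pod-demo-pod failure"},
            "startsAt": "2020-10-21T08:00:00Z",
            "endsAt": "0001-01-01T00:00:00Z",
        }
    ],
}


def to_dingtalk(notification):
    """Convert the first alert of a notification into a DingTalk markdown body."""
    alert = notification["alerts"][0]
    lines = [
        "### Alert name: Prometheus-alert",
        "**Alert status: %s**" % alert["status"],
        "**Alert level: %s**" % alert["labels"]["severity"],
        "**Alert type: %s**" % alert["labels"]["alertname"],
        "**Alert details: %s**" % alert["annotations"]["message"],
    ]
    return {
        "msgtype": "markdown",
        "markdown": {
            "title": alert["labels"]["alertname"],
            "text": "\n\n > ".join(lines),
        },
    }


print(json.dumps(to_dingtalk(sample_notification), indent=2))
```

The webhook pod deployed later in this post does the same kind of mapping, with the addition of timestamps and a separate message for resolved alerts.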
Now deploy a pod for the DingTalk alerts.
Thanks go to the author of http://www.mamicode.com/info-detail-2845201.html. Building on that work, I customized the alert script so that the messages match our own alert content.
(Screenshots: the original alert message, and the alert message after the change.)
The modified script, app.py, is as follows:
        
#!/usr/bin/env python
import io
import json
import logging
import sys

import arrow
import requests
from flask import Flask, request

# Force UTF-8 on stdout/stderr so non-ASCII alert text is logged correctly
sys.stdout = io.TextIOWrapper(sys.stdout.detach(), encoding='utf-8')
sys.stderr = io.TextIOWrapper(sys.stderr.detach(), encoding='utf-8')

app = Flask(__name__)

console = logging.StreamHandler()
fmt = '%(asctime)s - %(filename)s:%(lineno)s - %(name)s - %(message)s'
formatter = logging.Formatter(fmt)
console.setFormatter(formatter)
log = logging.getLogger("flask_webhook_dingtalk")
log.addHandler(console)
log.setLevel(logging.DEBUG)

EXCLUDE_LIST = ['prometheus', 'endpoint']
TIME_FORMAT = 'YYYY-MM-DD HH:mm:ss ZZ'


def local_time(timestamp):
    # Convert an Alertmanager timestamp to Shanghai time
    return arrow.get(timestamp).to('Asia/Shanghai').format(TIME_FORMAT)


@app.route('/')
def index():
    return 'Webhook Dingtalk by Billy https://blog.51cto.com/billy98'


@app.route('/dingtalk/send/', methods=['POST'])
def hander_session():
    # The DingTalk URL to post to is passed as the first command-line argument
    profile_url = sys.argv[1]
    post_data = request.get_data()
    post_data = json.loads(post_data.decode("utf-8"))['alerts']
    # Only the first alert in the notification is forwarded
    post_data = post_data[0]
    messa_list = []
    if post_data['status'].upper() == "FIRING":
        messa_list.append('### Alert name: Prometheus-alert')
        messa_list.append('**Alert status: abnormal**')
        messa_list.append('**Alert time: %s**' % local_time(post_data['startsAt']))
        messa_list.append('**Alert level: %s**' % post_data['labels']['severity'])
        messa_list.append('**Alert type: %s**' % post_data['labels']['alertname'])
        messa_list.append('**Alert details: %s**' % post_data['annotations']['message'])
    else:
        messa_list.append('### Alert name: Prometheus-alert')
        messa_list.append('**Alert status: recovered**')
        messa_list.append('**Alert time: %s**' % local_time(post_data['startsAt']))
        messa_list.append('**Recovery time: %s**' % local_time(post_data['endsAt']))
        messa_list.append('**Alert level: %s**' % post_data['labels']['severity'])
        messa_list.append('**Alert type: %s**' % post_data['labels']['alertname'])
        messa_list.append('**Alert details: %s**' % post_data['annotations']['message'])
    messa = '\n\n > '.join(messa_list)
    status = alert_data(messa, post_data['labels']['alertname'], profile_url)
    log.info(status)
    return status


def alert_data(data, title, profile_url):
    # Let requests serialize the body so quotes in a message cannot break the JSON
    send_data = {"msgtype": "markdown", "markdown": {"title": title, "text": data}}
    reps = requests.post(url=profile_url, json=send_data)
    return reps.text


if __name__ == '__main__':
    app.debug = False
    app.run(host='0.0.0.0', port=8080)
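The script relies on arrow to turn the startsAt and endsAt timestamps into Shanghai time. For readers who want to check that conversion without installing arrow, here is a rough standard-library sketch. It is not part of app.py: the helper name `to_shanghai` is hypothetical, a fixed UTC+8 offset stands in for the 'Asia/Shanghai' zone, the sample timestamp is invented, and the code assumes timestamps that end in 'Z' with optional fractional seconds.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset standing in for the Asia/Shanghai zone (assumption: no DST).
SHANGHAI = timezone(timedelta(hours=8))


def to_shanghai(timestamp):
    """Format a UTC timestamp such as '2020-10-21T08:00:00.123Z' in UTC+8."""
    # Drop the trailing 'Z' and any fractional seconds, then parse as UTC.
    base = timestamp.rstrip("Z").split(".")[0]
    utc_time = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    local = utc_time.astimezone(SHANGHAI)
    offset = local.strftime("%z")  # for this zone: '+0800'
    # Insert a colon into the offset to get the '+08:00' style.
    return local.strftime("%Y-%m-%d %H:%M:%S ") + offset[:3] + ":" + offset[3:]


print(to_shanghai("2020-10-21T08:00:00.123456789Z"))  # 2020-10-21 16:00:00 +08:00
```

The output layout is meant to resemble what the 'YYYY-MM-DD HH:mm:ss ZZ' format string produces in the script, so the two can be compared by eye.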
Finally, rebuild the image. The content of the Dockerfile is as follows:
FROM centos:7 as build
MAINTAINER billy98 [email protected]
RUN mkdir /root/.pip
ADD pip.conf /root/.pip/pip.conf

RUN curl -o /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo && yum install -y python36 python36-pip && pip3.6 install flask requests werkzeug arrow
ADD app.py /usr/local/alert-dingtalk.py

FROM gcr.io/distroless/python3
COPY --from=build /usr/local/alert-dingtalk.py /usr/local/alert-dingtalk.py
COPY --from=build /usr/local/lib64/python3.6/site-packages /usr/local/lib64/python3.6/site-packages
COPY --from=build /usr/local/lib/python3.6/site-packages /usr/local/lib/python3.6/site-packages
ENV PYTHONPATH=/usr/local/lib/python3.6/site-packages:/usr/local/lib64/python3.6/site-packages
EXPOSE 8080
ENTRYPOINT ["python","/usr/local/alert-dingtalk.py"]
Finally, adjust your dingding.yaml (or whatever your manifest file is called) for your k8s cluster and deploy it.
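Once the pod is running, one way to check it end to end is to POST a hand-made notification to the /dingtalk/send/ route. The Python sketch below only builds such a request and prints where it would go. The base URL is the service name used in the Alertmanager configuration earlier in this post, while the payload values and the helper name `build_test_request` are invented for illustration.

```python
import json
import urllib.request


def build_test_request(base_url):
    """Build a POST request carrying one fake firing alert."""
    payload = {
        "alerts": [
            {
                "status": "firing",
                "labels": {"alertname": "pod-status", "severity": "warning"},
                "annotations": {"message": "smoke test from build_test_request"},
                "startsAt": "2020-10-21T08:00:00Z",
                "endsAt": "0001-01-01T00:00:00Z",
            }
        ]
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/dingtalk/send/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_test_request("http://webhook-dingtalk")
print(request.get_method(), request.full_url)

# To actually send it from inside the cluster, uncomment:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(response.read().decode("utf-8"))
```

If the test message shows up in the DingTalk group, the webhook pod is reachable and working; whether Alertmanager itself reaches it still depends on the webhook URL in the secret.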
For more on k8s and automated operations and maintenance, visit www.wangshuying.cn.

Origin blog.51cto.com/461884/2542434