prometheus + alertmanager + webhook + DingTalk + email notifications

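Preface:

  1. Alertmanager cannot send messages to DingTalk directly, so the prometheus-webhook-dingtalk plugin is deployed as a webhook receiver: Alertmanager posts its alerts to the plugin, which converts them into the message format DingTalk accepts.
  2. This article covers two notification methods: email and DingTalk.
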
prometheus-webhook-dingtalk
dp.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-webhook-dingtalk
  namespace: infra
  labels:
    app: prometheus-webhook-dingtalk
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-webhook-dingtalk
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: prometheus-webhook-dingtalk
    spec:
      containers:
      - args:
        # Kubernetes passes args to the process without a shell, so the value must not be wrapped in quotes.
        - --ding.profile=ops_dingding=https://oapi.dingtalk.com/robot/send?access_token=xxxx
        name: prometheus-webhook-dingtalk
        image: harbor.yutang.cn/infra/prometheus-webhook-dingtalk:v1.4.0
        ports:
        - containerPort: 8060
      imagePullSecrets:
        - name: harbor
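The --ding.profile flag is the form used by the 0.x releases of prometheus-webhook-dingtalk. Upstream 1.x releases read their targets from a configuration file instead, so if the image above is built from an upstream 1.x release, something along these lines may be needed. This is only a sketch under that assumption: the ConfigMap name prometheus-webhook-dingtalk-config, the volume name dingtalk-config, and the mount path are made up for illustration.

```yaml
# Hypothetical ConfigMap holding a 1.x-style configuration (the name is an assumption)
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-webhook-dingtalk-config
  namespace: infra
data:
  config.yml: |-
    targets:
      # The target name becomes part of the URL: /dingtalk/ops_dingding/send
      ops_dingding:
        url: https://oapi.dingtalk.com/robot/send?access_token=xxxx
```

```yaml
# Fragment of the Deployment's pod spec that would replace the --ding.profile argument
# (not a complete manifest)
spec:
  containers:
  - name: prometheus-webhook-dingtalk
    args:
    - --config.file=/etc/prometheus-webhook-dingtalk/config.yml
    volumeMounts:
    - name: dingtalk-config
      mountPath: /etc/prometheus-webhook-dingtalk
  volumes:
  - name: dingtalk-config
    configMap:
      name: prometheus-webhook-dingtalk-config
```

Keeping the target name ops_dingding means the webhook URL configured in Alertmanager does not have to change.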
svc.yaml

apiVersion: v1
kind: Service
metadata:
  namespace: infra
  name: prometheus-webhook-dingtalk
  labels:
    app: prometheus-webhook-dingtalk
spec:
  selector:
    app: prometheus-webhook-dingtalk
  ports:
  - name: dingtalk-port
    port: 8060
    targetPort: 8060
    protocol: TCP
alertmanager
cm-dingding.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: alertmanager-config
  namespace: infra
data:
  config.yml: |-
      global:
        resolve_timeout: 5m
      route:
        receiver: webhook
        group_wait: 3s
        group_interval: 5m
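        # Very short: firing alerts are re-notified on every group_interval flush; raise it (e.g. 4h) outside of testing
        repeat_interval: 5s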
        group_by: ['alertname', 'cluster']
        routes:
        - receiver: webhook
          group_wait: 10s
          match:
            team: node
      receivers:
      - name: webhook
        webhook_configs:
        - url: "http://prometheus-webhook-dingtalk:8060/dingtalk/ops_dingding/send"
          send_resolved: true
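The child route above only matches alerts that carry the label team: node, and that label is normally attached in the Prometheus alerting rule. A minimal sketch of a rule this route would match; the group name, the alert name, and the job value node-exporter are assumptions and not part of the original setup:

```yaml
groups:
- name: node-alerts            # hypothetical group name
  rules:
  - alert: NodeExporterDown    # hypothetical alert name
    # Fires when a scrape target of the assumed job "node-exporter" is unreachable
    expr: up{job="node-exporter"} == 0
    for: 1m
    labels:
      team: node               # matched by "match: team: node" in the Alertmanager route
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} is down"
```

Since the child route uses the same webhook receiver as the root route, matching it only changes group_wait from 3s to 10s for these alerts.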
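cm-mail.yaml (alternative to cm-dingding.yaml: both define the ConfigMap alertmanager-config, so apply only one)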

apiVersion: v1
kind: ConfigMap
metadata:
  name: alertmanager-config
  namespace: infra
data:
  config.yml: |-
    global:
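      # Time after which an alert is declared resolved if it has not been updated
      resolve_timeout: 5m
      # SMTP settings for sending email
      smtp_smarthost: 'smtp.163.com:25'
      smtp_from: '[email protected]'
      smtp_auth_username: '[email protected]'
      smtp_auth_password: 'xxxxx'
      smtp_require_tls: false
    # Root route that every incoming alert enters; it defines how alerts are dispatched
    route:
      # Labels used to regroup incoming alerts. For example, alerts carrying both cluster=A and alertname=LatencyHigh are batched into a single group
      group_by: ['alertname', 'cluster']
      # When a new alert group is created, wait at least group_wait before sending the initial notification, so that more alerts for the same group can be collected and sent together.
      group_wait: 30s

      # After the first notification for a group has been sent, wait group_interval before sending a notification about new alerts added to that group.
      group_interval: 5m

      # If a notification has already been sent successfully, wait repeat_interval before sending it again
      repeat_interval: 5m

      # Default receiver: alerts that match no child route are sent to it
      receiver: default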

    receivers:
    - name: 'default'
      email_configs:
      - to: '[email protected]'
        send_resolved: true
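Because cm-dingding.yaml and cm-mail.yaml both define the ConfigMap alertmanager-config, only one of them can be active at a time. If both channels are wanted at once, a single receiver can carry an email_configs and a webhook_configs entry together. A sketch that merges the two files, assuming every alert should go to both channels; the receiver name default-all and the example.com addresses are placeholders:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: alertmanager-config
  namespace: infra
data:
  config.yml: |-
    global:
      resolve_timeout: 5m
      smtp_smarthost: 'smtp.163.com:25'
      smtp_from: 'sender@example.com'           # placeholder address
      smtp_auth_username: 'sender@example.com'  # placeholder address
      smtp_auth_password: 'xxxxx'
      smtp_require_tls: false
    route:
      group_by: ['alertname', 'cluster']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 5m
      receiver: default-all
    receivers:
    # One receiver with two integrations: each notification goes to both
    - name: 'default-all'
      email_configs:
      - to: 'ops@example.com'                   # placeholder address
        send_resolved: true
      webhook_configs:
      - url: "http://prometheus-webhook-dingtalk:8060/dingtalk/ops_dingding/send"
        send_resolved: true
```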
dp.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: alertmanager
  namespace: infra
spec:
  replicas: 1
  selector:
    matchLabels:
      app: alertmanager
  template:
    metadata:
      labels:
        app: alertmanager
    spec:
      containers:
      - name: alertmanager
        image: harbor.yutang.cn/infra/alertmanager:v0.19.0
        args:
          - "--config.file=/etc/alertmanager/config.yml"
          - "--storage.path=/alertmanager"
          - "--cluster.advertise-address=0.0.0.0:9093"
        ports:
        - name: alertmanager
          containerPort: 9093
        volumeMounts:
        - name: alertmanager-cm
          mountPath: /etc/alertmanager
      volumes:
      - name: alertmanager-cm
        configMap:
          name: alertmanager-config
      imagePullSecrets:
      - name: harbor
svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: alertmanager
  namespace: infra
spec:
  selector: 
    app: alertmanager
  ports:
    - port: 80
      targetPort: 9093
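For alerts to reach Alertmanager at all, Prometheus has to be pointed at this Service. The Prometheus side is not shown in this article; a minimal sketch of the relevant prometheus.yml section, assuming Prometheus runs in the same cluster and that rule files live under /etc/prometheus/rules (a hypothetical path):

```yaml
# prometheus.yml (excerpt)
alerting:
  alertmanagers:
  - static_configs:
    # Service "alertmanager" in namespace "infra", exposed on port 80 (targetPort 9093)
    - targets: ['alertmanager.infra.svc:80']
rule_files:
  # Hypothetical location of the alerting rule files
  - /etc/prometheus/rules/*.yml
```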
ingress.yaml

apiVersion: extensions/v1beta1
kind: Ingress
metadata: 
  name: alertmanager-web
  namespace: infra
spec:
  rules:
  - host: alertmanager.dayutang.cn
    http:
      paths:
      - path: /
        backend: 
          serviceName: alertmanager
          servicePort: 80
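The extensions/v1beta1 Ingress API used above was removed in Kubernetes 1.22. On newer clusters the same rule would be written against networking.k8s.io/v1. A sketch of the equivalent manifest, where pathType: Prefix is an assumption about the intended matching, and no ingressClassName is set because the original does not name an ingress controller:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: alertmanager-web
  namespace: infra
spec:
  rules:
  - host: alertmanager.dayutang.cn
    http:
      paths:
      - path: /
        pathType: Prefix          # required in v1; Prefix is an assumed choice
        backend:
          service:
            name: alertmanager
            port:
              number: 80
```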

Reposted from www.cnblogs.com/ipyanthony/p/12619817.html