Prometheus + Altermanager staple alarm

Prometheus + Altermanager staple alarm

First, add nail robot

Reference nails official document: https://ding-doc.dingtalk.com/doc#/serverapi2/qf2nxq

 

 

 Second, the deployment of nails alarm deployment on k8s, here we refer to third-party plug-ins.

[root@cn-hongkong webhook-dingtalk]# cat webhook-dingtalk.yaml
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  labels:
    app: webhook-dingtalk
  name: webhook-dingtalk
  namespace: monitoring
  #需要和alertmanager在同一个namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: webhook-dingtalk
  template:
    metadata:
      labels:
        app: webhook-dingtalk
    spec:
      containers:
      - image: billy98/webhook-dingtalk:latest
        name: webhook-dingtalk
        args:
        - "https://oapi.dingtalk.com/robot/send?access_token=1fd59067ab85bea575122a5e4f05cefd6609d9d3e41a725e46a90c2fad9b3"
        #上面创建的钉钉机器人hook
        ports:
        - containerPort: 8080
          protocol: TCP
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 500m
            memory: 500Mi
        livenessProbe:
          failureThreshold: 3
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
          tcpSocket:
            port: 8080
        readinessProbe:
          failureThreshold: 3
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
          httpGet:
            port: 8080
            path: /
      imagePullSecrets:
        - name: IfNotPresent
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: webhook-dingtalk
  name: webhook-dingtalk
  namespace: monitoring
  #需要和alertmanager在同一个namespace
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: webhook-dingtalk
  type: ClusterIP 

 Three, alertmanager add nail alarm type

  config:
    global:
      resolve_timeout: 5m
    route:
      group_by: ['job','severity']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
      receiver: default
      receiver: webhook
      routes:
      - match:
          filesystem: node
        receiver: webhook
    receivers:
    - name: webhook
      webhook_configs:
      - url: http://webhook-dingtalk/dingtalk/send/
        send_resolved: true

 Fourth, view the alarm information

 

 

Guess you like

Origin www.cnblogs.com/Dev0ps/p/11916963.html
Recommended