Prometheus study notes (6): Alertmanager alerting

One, Alertmanager overview

Prometheus is a modular platform: collecting and storing metrics is separate from alerting. Alerting is handled by Alertmanager, a standalone component of the monitoring environment. Alert rules are defined on the Prometheus server; when a rule fires, the alert is forwarded to Alertmanager. Alertmanager then decides how to handle each alert, for example by deduplicating repeated alerts, and which mechanism to use for the notification: e-mail, instant messaging, or other tools such as DingTalk and WeChat.

Two, Alertmanager deployment

Alertmanager listens on port 9093 by default; port 9094 is used for cluster communication.

# Download
[root@prometheus ~]# wget https://github.com/prometheus/alertmanager/releases/download/v0.20.0-rc.0/alertmanager-0.20.0-rc.0.linux-amd64.tar.gz

# Extract
[root@prometheus ~]# tar -zxf alertmanager-0.20.0-rc.0.linux-amd64.tar.gz -C /usr/local/
[root@prometheus ~]# mv /usr/local/alertmanager-0.20.0-rc.0.linux-amd64 /usr/local/alertmanager-0.20.0
[root@prometheus ~]# ln -sv /usr/local/alertmanager-0.20.0 /usr/local/alertmanager

# Run
[root@prometheus ~]# ln -sv /usr/local/alertmanager/alertmanager /usr/local/bin/
[root@prometheus ~]# alertmanager --config.file=/usr/local/alertmanager/alertmanager.yml &
[root@prometheus ~]# netstat -tulnp |grep alert
tcp6       0      0 :::9093                 :::*                    LISTEN      41194/alertmanager  
tcp6       0      0 :::9094                 :::*                    LISTEN      41194/alertmanager  
udp6       0      0 :::9094                 :::*                                41194/alertmanager  
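
netstat confirms that the sockets are open. As a complementary check, here is a minimal sketch that queries Alertmanager's built-in /-/healthy endpoint over HTTP. It assumes curl is installed and that Alertmanager is reachable on localhost:9093; AM_URL and am_status are variable names made up for this example.

```shell
# Assumption: Alertmanager runs on this host with the default port 9093.
AM_URL="${AM_URL:-http://localhost:9093}"

# /-/healthy answers with HTTP 200 while the process is serving requests.
# -f makes curl return non-zero on HTTP errors; --max-time avoids hanging.
if curl -fsS --max-time 3 "$AM_URL/-/healthy" >/dev/null 2>&1; then
    am_status="up"
else
    am_status="down"
fi

echo "alertmanager at $AM_URL is $am_status"
```

Set AM_URL to another address, such as http://192.168.0.143:9093, to check a remote instance.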

Visit http://<server-IP>:9093 (http://192.168.0.143:9093 in this setup) to open the Alertmanager web interface.

Three, Alertmanager configuration

Alertmanager is configured in two places. The first is on the Prometheus server (prometheus.yml), which lists the Alertmanager nodes, gives the file path pattern for the alert rules, and adds a scrape job to monitor Alertmanager itself. The second is Alertmanager's own configuration file, alertmanager.yml.

[root@prometheus alertmanager]# cat /usr/local/prometheus/prometheus.yml 
...

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
        - 192.168.0.143:9093    # list of Alertmanager nodes

rule_files:
   - "rules/*_rules.yml"    # alert rule files to load
#   - "rules/*_alert.yml"

scrape_configs:
......

  - job_name: 'alertmanager'    # scrape job that monitors Alertmanager itself
    static_configs:
    - targets: ['192.168.0.143:9093']
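
The rule_files glob above only takes effect once a matching file exists. The following is a minimal sketch of such a file, not the author's actual rules: the file name node_rules.yml, the group name, and the InstanceDown alert are all hypothetical. It assumes the commands are run from the directory that holds prometheus.yml, because relative rule_files paths are resolved against it.

```shell
# Assumption: the current directory holds prometheus.yml
# (/usr/local/prometheus in this article).
mkdir -p rules

# Hypothetical rule file; the name only has to match "rules/*_rules.yml".
# The quoted 'EOF' keeps the shell from expanding $labels.
cat > rules/node_rules.yml <<'EOF'
groups:
- name: node_alerts            # hypothetical group name
  rules:
  - alert: InstanceDown        # becomes the alertname label
    expr: up == 0              # "up" is 0 when a scrape of the target fails
    for: 1m                    # condition must hold this long before firing
    labels:
      severity: critical
    annotations:
      summary: "Instance {{ $labels.instance }} is down"
EOF

# Validate the file if promtool (shipped in the Prometheus tarball) is on PATH.
if command -v promtool >/dev/null 2>&1; then
    promtool check rules rules/node_rules.yml
fi
```

Prometheus reads rule files at startup and again when it receives SIGHUP, so restart or signal the server after adding the file.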

Once this is added, the alertmanager job appears in the target list of the Prometheus server's web UI.

With prometheus.yml configured, look at the default example alertmanager.yml:

[root@prometheus alertmanager]# cat alertmanager.yml 
global:
  resolve_timeout: 5m    # time after which an alert that has not been updated is declared resolved; default 5m

route:
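  group_by: ['alertname']    # labels used to group alerts
  group_wait: 10s    # how long to wait before sending the first notification for a new group
  group_interval: 10s    # how long to wait before notifying about new alerts added to an already notified group
  repeat_interval: 1h    # how long to wait before re-sending a notification; for email, do not set this too low, or the SMTP server may reject the frequent mail
  receiver: 'web.hook'    # name of the receiver for alerts; must match a name under receivers below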

receivers:
- name: 'web.hook'    # receiver name
  webhook_configs:    # webhook configuration
  - url: 'http://192.168.0.143:5001/'

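inhibit_rules:    # an inhibition rule mutes alerts matching one set of matchers (target) while an alert matching another set (source) exists; both alerts must have the same values for the labels listed in equal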
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']

The top-level sections of this file are:

  • global: global settings, including the timeout after which an alert is treated as resolved, SMTP settings, and the API addresses of the various notification channels.

  • route: sets the alert routing policy. It is a tree structure that is matched depth-first, from left to right.

  • receivers: configures who receives alert notifications, through channels such as email, wechat, slack, or webhook.

  • inhibit_rules: inhibition rules. While an alert matching one set of matchers (source) exists, the rule suppresses alerts matching another set (target).
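
The default file has a single root route, so the tree matching described for route is not visible in it. The following sketch writes a small example with two child routes to a scratch file and validates it. The receiver oncall.hook, the port 5002 endpoint, the job regex, and the file name are hypothetical. The match and match_re keys are the syntax of the 0.20 release used in this article; newer releases prefer matchers.

```shell
# Written to a scratch file so the live alertmanager.yml is left untouched.
cat > alertmanager-routes-example.yml <<'EOF'
route:
  receiver: 'web.hook'              # fallback when no child route matches
  group_by: ['alertname']
  routes:                           # children are tried from top to bottom
  - match:
      severity: 'critical'
    receiver: 'oncall.hook'         # hypothetical receiver
    repeat_interval: 30m            # overrides the value inherited from the parent
  - match_re:
      job: 'node|alertmanager'      # hypothetical job names
    receiver: 'web.hook'

receivers:
- name: 'web.hook'
  webhook_configs:
  - url: 'http://192.168.0.143:5001/'
- name: 'oncall.hook'
  webhook_configs:
  - url: 'http://192.168.0.143:5002/'   # hypothetical endpoint
EOF

# amtool ships in the Alertmanager tarball; skip the checks if it is not on PATH.
if command -v amtool >/dev/null 2>&1; then
    amtool check-config alertmanager-routes-example.yml
    # Show which receiver an alert with these labels would be routed to.
    amtool config routes test --config.file=alertmanager-routes-example.yml severity=critical job=node
fi
```

An alert labelled severity=critical and job=node matches the first child and stops there, because a route only continues to its siblings when continue: true is set. An alert that matches neither child is handled by the root route's receiver.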

Origin www.cnblogs.com/linuxk/p/12036193.html