[Repost] [Alerting] [AlertManager] From Beginner to Expert

https://www.jianshu.com/p/b9dcdaa117c7

1. Introduction:

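  • Alertmanager is tightly coupled with Prometheus and is one of its components, but it has to be installed separately.
  • This article installs it from an rpm package, version 0.15.3-1.el7.centos.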

2. Links:

2.1. References

3. Architecture diagram:

 
[Figure: architecture diagram (image.png)]

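4. Deployment:

4.1. rpm installation (choose either 4.1 or 4.2):

  • Use the community-maintained package repository. Create /etc/yum.repos.d/prometheus.repo with the following content: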
[prometheus]
name=prometheus
baseurl=https://packagecloud.io/prometheus-rpm/release/el/7/$basearch
repo_gpgcheck=1
enabled=1
gpgkey=https://packagecloud.io/prometheus-rpm/release/gpgkey
       https://raw.githubusercontent.com/lest/prometheus-rpm/master/RPM-GPG-KEY-prometheus-rpm
gpgcheck=1
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300
  • Run yum install -y alertmanager
  • After installation, the files are laid out as follows:
/etc/default/alertmanager                     # environment variables read by systemd
/etc/prometheus/alertmanager.yml              # Alertmanager main configuration file
/usr/bin/alertmanager                         # Alertmanager binary
/usr/bin/amtool                               # command-line tool for viewing alerts
/usr/lib/systemd/system/alertmanager.service  # systemd unit file
/var/lib/prometheus                           # data directory
  • systemctl start alertmanager && systemctl enable alertmanager

4.2. Binary tarball installation (choose either 4.1 or 4.2):

  • Download the binary tarball: wget https://github.com/prometheus/alertmanager/releases/download/v0.15.3/alertmanager-0.15.3.linux-amd64.tar.gz
  • Extract it: tar xf alertmanager-0.15.3.linux-amd64.tar.gz -C /opt
~]# ll /opt/alertmanager-0.15.3.linux-amd64/
total 31200
-rwxr-xr-x 1 3434 3434 19998160 Nov  9 16:41 alertmanager      # Alertmanager binary
-rw-r--r-- 1 3434 3434      380 Nov  9 17:00 alertmanager.yml  # Alertmanager main configuration file
-rwxr-xr-x 1 3434 3434 11923635 Nov  9 16:41 amtool            # command-line tool for viewing alerts
-rw-r--r-- 1 3434 3434    11357 Nov  9 17:00 LICENSE
-rw-r--r-- 1 3434 3434      457 Nov  9 17:00 NOTICE
  • Run ./alertmanager from that directory to start it.
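Whichever installation path is used, a quick way to confirm that the process came up is to check whether its web port accepts TCP connections. The sketch below is only an illustration: the helper name port_open is hypothetical, and it assumes Alertmanager is listening on its default web port 9093 (the port used for the web UI in section 6.2).

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: Alertmanager runs locally on its default web port 9093.
    print("alertmanager reachable:", port_open("127.0.0.1", 9093))
```

This only shows that something is listening on the port; it does not verify that the configuration was loaded correctly.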

5. Configuration files

5.1. /usr/lib/systemd/system/alertmanager.service

# -*- mode: conf -*-
[Unit]
Description=Prometheus Alertmanager.
Documentation=https://github.com/prometheus/alertmanager
After=network.target
[Service]
EnvironmentFile=-/etc/default/alertmanager
User=prometheus
ExecStart=/usr/bin/alertmanager \
          --config.file=/etc/prometheus/alertmanager.yml \
          --storage.path=/var/lib/prometheus/alertmanager \
          $ALERTMANAGER_OPTS
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
[Install]
WantedBy=multi-user.target

5.2. /etc/default/alertmanager

Note: the --web.external-url option must be set. Alert notification emails link back to this address; if it is omitted, the address defaults to the machine's hostname.

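# --web.external-url: the address used for external access; 10.41.91.91 is this host, so remember to change it on the other servers
# --cluster.listen-address: the address this host listens on for cluster traffic
# --cluster.peer: the addresses of the cluster peers
ALERTMANAGER_OPTS='\
--web.external-url=http://10.41.91.91:9093 \
--cluster.listen-address=10.41.91.91:9094 \
--cluster.peer=10.41.91.91:9094 \
--cluster.peer=10.210.149.26:9094 \
--cluster.peer=10.210.149.27:9094'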
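Because --web.external-url and --cluster.listen-address differ on every node while the peer list stays the same, the per-host value can be generated rather than edited by hand. The following is a minimal sketch under stated assumptions: build_opts is a hypothetical helper, the ports 9093 and 9094 are taken from the example above, and the output is written on a single line.

```python
from typing import List

WEB_PORT = 9093      # web port used in the example above
CLUSTER_PORT = 9094  # cluster port used in the example above


def build_opts(local_ip: str, peer_ips: List[str]) -> str:
    """Build the ALERTMANAGER_OPTS line for one node of the cluster."""
    flags = [
        f"--web.external-url=http://{local_ip}:{WEB_PORT}",
        f"--cluster.listen-address={local_ip}:{CLUSTER_PORT}",
    ]
    # Every node gets the same peer list, including itself, as in the example.
    flags += [f"--cluster.peer={ip}:{CLUSTER_PORT}" for ip in peer_ips]
    return "ALERTMANAGER_OPTS='" + " ".join(flags) + "'"


if __name__ == "__main__":
    nodes = ["10.41.91.91", "10.210.149.26", "10.210.149.27"]
    for node in nodes:
        print(f"# /etc/default/alertmanager on {node}")
        print(build_opts(node, nodes))
```

Each printed line would go into /etc/default/alertmanager on the corresponding host.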

5.3. /etc/prometheus/alertmanager.yml

global: # global settings
  resolve_timeout: 5m # how long before an alert that is no longer updated is marked resolved
route: # routing rules
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'web.hook'
receivers: # receivers: email, WeChat, a web hook, and so on
- name: 'web.hook'
  webhook_configs:
  - url: 'http://127.0.0.1:5001/'
inhibit_rules: # inhibition rules
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
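The configuration above routes every alert to the 'web.hook' receiver, which POSTs a JSON payload to http://127.0.0.1:5001/. To see what arrives there, a small receiver can be sketched with the Python standard library. This is an illustrative sketch, not part of the original setup: summarize, HookHandler, and serve are hypothetical names, and the code reads only the alerts, status, and labels fields of the webhook payload.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Dict, List


def summarize(payload: Dict[str, Any]) -> List[str]:
    """Turn an Alertmanager webhook payload into one line per alert."""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        lines.append(
            "{status} {name} instance={instance}".format(
                status=alert.get("status", "unknown"),
                name=labels.get("alertname", "?"),
                instance=labels.get("instance", "?"),
            )
        )
    return lines


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read the JSON body sent by Alertmanager and print a summary.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        for line in summarize(payload):
            print(line)
        self.send_response(200)
        self.end_headers()


def serve(host: str = "127.0.0.1", port: int = 5001) -> None:
    """Listen on the address configured under webhook_configs (blocks forever)."""
    HTTPServer((host, port), HookHandler).serve_forever()
```

Calling serve() starts the listener on the address from the config; each incoming notification is then printed as one line per alert.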

6. Management tools

6.1. amtool

amtool alert --alertmanager.url=http://localhost:9093
Alertname    Starts At                Summary
RootfsUsage  2019-01-10 07:13:32 CET  Not enough space for root fs on 10.210.54.227:9100
RootfsUsage  2019-01-11 14:36:17 CET  Not enough space for root fs on 10.210.54.226:9100
MemoryUsage  2019-01-17 00:44:17 CET  Memory of instance 150.132.195.26:9100 is not enough
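If the amtool output needs further processing in a script, the table can be split into records. The sketch below assumes the columns are separated by two or more spaces, as in the sample output above; the exact layout may differ between amtool versions, and parse_amtool_table is a hypothetical helper name.

```python
import re
from typing import Dict, List


def parse_amtool_table(text: str) -> List[Dict[str, str]]:
    """Parse the simple amtool alert table into a list of dicts keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    # Assumption: columns are separated by runs of two or more spaces.
    header = re.split(r"\s{2,}", lines[0].strip())
    rows = []
    for ln in lines[1:]:
        # maxsplit keeps a summary that itself contains double spaces in one cell.
        cells = re.split(r"\s{2,}", ln.strip(), maxsplit=len(header) - 1)
        rows.append(dict(zip(header, cells)))
    return rows


if __name__ == "__main__":
    sample = (
        "Alertname    Starts At                Summary\n"
        "RootfsUsage  2019-01-10 07:13:32 CET  Not enough space for root fs on 10.210.54.227:9100\n"
    )
    for row in parse_amtool_table(sample):
        print(row["Alertname"], "|", row["Starts At"], "|", row["Summary"])
```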

6.2. web UI

http://<your-server-IP>:9093

 
[Figure: web UI (image.png)]

Reposted from www.cnblogs.com/jinanxiaolaohu/p/12216492.html