Prometheus + Grafana Deployment and Usage
Host list:
192.168.235.130 : Prometheus server
192.168.235.128 : node-1
192.168.235.129 : node-2
Manual binary deployment
- Extract the prometheus-2.14.0.linux-amd64.tar.gz archive and install it under /usr/local (renamed to prome, matching the paths used below)
tar zxf prometheus-2.14.0.linux-amd64.tar.gz
mv prometheus-2.14.0.linux-amd64 /usr/local/prome
- Edit the service startup script
vim /usr/lib/systemd/system/prometheus.service
[Unit]
Description=Prometheus server daemon
[Service]
ExecStart=/usr/local/prome/prometheus --config.file=/usr/local/prome/prometheus.yml
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
RestartSec=42s
[Install]
WantedBy=multi-user.target
systemctl daemon-reload # load the new unit file
systemctl start|stop|restart|status prometheus
Prometheus server configuration file (prometheus.yml) contents
global:                      # global configuration section
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: 'codelab-monitor'
rule_files:                  # alerting rule files
scrape_configs:              # scrape configuration
  - job_name: 'prometheus'   # scrape job name
    static_configs:          # statically configured targets
      - targets: ['192.168.235.130:9090']
        labels:
          node: prome_server
Adding hosts with file-based service discovery
global:                      # global configuration section
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: 'codelab-monitor'
rule_files:                  # alerting rule files
scrape_configs:              # scrape configuration
  - job_name: 'prometheus'   # scrape job name
    # static_configs:        # statically configured targets
    #   - targets: ['192.168.235.130:9090']
    #     labels:
    #       node: prome_server
    file_sd_configs:         # discover targets dynamically from files
      - files: ['/usr/local/prome/sd_config/*.yml']   # service discovery file paths
        refresh_interval: 5s # rescan the files every 5 seconds
- Create the service discovery configuration directory
mkdir /usr/local/prome/sd_config
- Edit a target file in the discovery directory (any *.yml file under /usr/local/prome/sd_config/)
- targets:               # target hosts discovered from this file
  - 192.168.235.130:9090
  labels:                # labels attached to these target hosts
    node: prome_server
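The two steps above can be sketched as a self-contained shell snippet. A temporary directory stands in for /usr/local/prome/sd_config so the sketch runs anywhere, and the file name node.yml is illustrative — any *.yml file in the directory is picked up:

```shell
# Create a service discovery directory (temp dir here; the deployment above
# uses /usr/local/prome/sd_config, as referenced by file_sd_configs).
sd_dir=$(mktemp -d)

# Write a target file; Prometheus rescans *.yml files in this directory
# every refresh_interval (5s above), so edits take effect without a reload.
cat > "$sd_dir/node.yml" <<'EOF'
- targets:
  - 192.168.235.130:9090
  labels:
    node: prome_server
EOF

ls "$sd_dir"   # prints: node.yml
```

Because the files are rescanned continuously, adding or removing monitored hosts only requires editing these target files, not restarting Prometheus.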
- Reload the service
systemctl reload prometheus
Installing node_exporter, the metrics collection agent, on the monitored nodes
- Extract the archive
[root@localhost ~]# tar zxf node_exporter-0.18.1.linux-amd64.tar.gz
[root@localhost ~]# mv node_exporter-0.18.1.linux-amd64 /usr/local/
- Manage node_exporter as a systemd service
vim /usr/lib/systemd/system/node_exporter.service
[Unit]
Description=Node_exporter server daemon
[Service]
ExecStart=/usr/local/node_exporter/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl start node_exporter
Adding remote host monitoring to Prometheus
Add jobs for the remote hosts to the original prometheus.yml (node_exporter listens on port 9100 by default):
# my global config
global:
  evaluation_interval: 15s
  scrape_interval: 15s
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['192.168.235.130:9090']
        labels:
          node: prome_server
    # file_sd_configs:
    #   - files: ['/usr/local/prome/sd_config/*.yml']
    #     refresh_interval: 5s
  - job_name: 'node-1'     # add host node-1
    static_configs:
      - targets: ['192.168.235.128:9100']
        labels:
          node: node-1
  - job_name: 'node-2'     # add host node-2
    static_configs:
      - targets: ['192.168.235.129:9100']
        labels:
          node: node-2
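After reloading, each scrape job exports an `up` series (1 = target reachable, 0 = down), and the custom `node` labels attached above can be used to select hosts. For example, in the Prometheus expression browser:

```promql
up{node="node-1"}   # health of node-1's node_exporter target
up == 0             # list all currently unreachable targets
```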
Monitoring systemd services with node_exporter
node_exporter can also report the status of systemd services; enable this by starting node_exporter with the flags --collector.systemd --collector.systemd.unit-whitelist=<service names>
[Unit]
Description=Node_exporter server daemon
[Service]
ExecStart=/usr/local/node_exporter/node_exporter --collector.systemd --collector.systemd.unit-whitelist=(sshd|docker|nginx).service
Restart=on-failure
[Install]
WantedBy=multi-user.target
- --collector.systemd: enable the systemd collector (service status metrics)
- --collector.systemd.unit-whitelist=(sshd|docker|nginx).service: a regular expression selecting which units to monitor; here it matches several services
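The whitelist value is treated as a regular expression over systemd unit names. A quick way to preview what a pattern will match is to test it against a list of unit names with grep -E (the unit list here is illustrative):

```shell
# Test the whitelist pattern from the ExecStart line against sample unit names.
# (The unescaped '.' matches any character, but the intent — and the effect
# here — is to select exactly the sshd, docker and nginx units.)
printf '%s\n' sshd.service docker.service nginx.service crond.service \
  | grep -E '(sshd|docker|nginx).service'
# matches sshd.service, docker.service and nginx.service
```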
Quick monitoring with Grafana dashboard templates
Grafana offers many ready-made dashboard templates for the various data sources and exporters; they can be imported directly into Grafana:
https://grafana.com/grafana/dashboards
- Import a template directly by its ID
Select a dashboard template and note the ID under "Get this dashboard". A well-established node-exporter template has ID 8919; importing it quickly sets up host resource monitoring.
Container monitoring works the same way: find a suitable template and import it.