Table of contents
3. Environment preparation
4. Install Prometheus and Grafana
5. Configure Prometheus and Grafana
6. Monitor machine hardware resources
7. Monitor basic services
8. Monitor applications
9. Monitor business interface data
① /data/prometheus/rules/node.yml
② /data/prometheus/rules/redis.yml
③ /data/prometheus/rules/mysql.yml
④ /data/prometheus/rules/nginx.yml
This guide uses Prometheus + Grafana to build a monitoring system covering machine hardware resources, basic services, applications, and business interface data.
Prometheus - Monitoring system & time series database
Prometheus is an open-source service monitoring system and time series database. Its ecosystem consists of multiple components: the Prometheus Server, which collects and stores data and provides the PromQL query language; client SDKs in multiple languages; the Push Gateway, an intermediate gateway that lets short-lived jobs actively push metrics; Exporters, data collection components that gather data from targets and convert it into a format Prometheus supports; and the Alertmanager, which provides alerting.
Unlike traditional data collection agents, a Prometheus Exporter does not push data to a central server; it waits for the Prometheus Server to pull (scrape) it. Prometheus provides exporters for many types of services to collect their running status.
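The pull model above can be sketched with a sample payload: an exporter just serves plain-text metrics over HTTP, and the Prometheus Server fetches them on each scrape. The metric below mimics one node_exporter gauge:

```shell
# Sample of the Prometheus text exposition format an exporter serves:
metrics='# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.52'
# A scrape is just an HTTP GET of this text; print the metric name:
echo "$metrics" | awk '!/^#/ {print $1}'
# → node_load1
```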
Ecosystem architecture diagram from the official Prometheus website:
Grafana: The open observability platform | Grafana Labs
Grafana is a cross-platform, open-source metrics analysis and visualization tool that can pull data from multiple data sources (such as prometheus) and display it visually.
3. Environment preparation
Server hardware to monitor: CPU 8 cores, 32 GB memory, 250 GB disk, network card.
Prometheus and other monitoring service deployment arrangements are as follows:
1. Check the operating system version
[root@node64 ~]# cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)
[root@node64 ~]# getconf LONG_BIT
64
2. Check the network card IP and configuration
[root@node64 ~]# ip a
eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether fa:16:3e:a3:89:25 brd ff:ff:ff:ff:ff:ff
inet 192.168.0.91/24 brd 192.168.0.255 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fea3:8925/64 scope link
valid_lft forever preferred_lft forever
[root@node64 ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth1
NAME=eth1
TYPE=Ethernet
BOOTPROTO=dhcp
DEVICE=eth1
ONBOOT=yes
IPV4_ROUTE_METRIC=100
4. Install Prometheus and Grafana
1. Create the installation directory and a dedicated user
mkdir /data/prometheus
groupadd prometheus
useradd -g prometheus -s /sbin/nologin prometheus
chown -R prometheus:prometheus /data/prometheus
2. Download the prometheus and grafana installation packages and the prometheus exporter packages,
including node_exporter, mysqld_exporter, nginx-vts-exporter, redis_exporter, and alertmanager
The Prometheus installation package reference is as follows:
The Grafana installation package reference is as follows:
wget https://dl.grafana.com/oss/release/grafana-7.1.1.linux-amd64.tar.gz
3. Unzip and install
5. Configure Prometheus and Grafana
1. Configure prometheus
a. Modify the configuration file prometheus.yml
vi /data/prometheus/prometheus/prometheus.yml
scrape_configs:
  - job_name: 'prometheus'
    metrics_path: /prometheus/metrics
    static_configs:
      - targets: ['192.168.0.91:9090']
b. Check the configuration file
cd /data/prometheus/prometheus
[root@node64 prometheus]# ./promtool check config prometheus.yml
Note: after changing the configuration, reload the running prometheus process to make it take effect (find it with pgrep -fl prometheus).
c. Register prometheus as a system service
[root@node64 prometheus]# cat /usr/lib/systemd/system/prometheus.service
[Unit]
Description=prometheus
After=network.target
[Service]
Type=simple
User=root
ExecStart=/data/prometheus/prometheus/prometheus --web.external-url=prometheus --web.enable-admin-api --config.file=/data/prometheus/prometheus/prometheus.yml --storage.tsdb.path=/data/prometheus/prometheus/data --storage.tsdb.retention=15d --log.level=info --web.enable-lifecycle
Restart=on-failure
[Install]
WantedBy=multi-user.target
d. Start and view the prometheus service
systemctl enable prometheus
systemctl start prometheus
systemctl status prometheus
[root@node64 prometheus]# netstat -anp | grep 9090
e. nginx forwards the prometheus service
Prometheus does not provide any authentication support out of the box. With Nginx as a reverse proxy server, however, we can easily add HTTP Basic Auth in front of it.
yum -y install httpd-tools   # provides the htpasswd command
[root@node23 conf]# htpasswd -c .htpasswd_prometheus prometheus
(enter the password, e.g. Iampwd, twice at the prompt)
location /prometheus/ {
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header Host $http_host;
proxy_set_header X-Nginx-Proxy true;
proxy_pass http://192.168.0.91:9090;
proxy_redirect off;
proxy_buffering off;
proxy_read_timeout 90;
proxy_send_timeout 90;
auth_basic "Prometheus";
auth_basic_user_file ".htpasswd_prometheus";
}
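What Basic Auth actually transmits can be seen directly: the browser base64-encodes "user:password" into the Authorization header, and nginx checks it against the htpasswd file created above (the credentials here are this guide's example values):

```shell
# The raw credential string sent in the "Authorization: Basic ..." header:
printf 'prometheus:Iampwd' | base64
# → cHJvbWV0aGV1czpJYW1wd2Q=
```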
f. Access prometheus from a browser
https://***.com:9091/prometheus/
prometheus/Iampwd
2. Configure grafana
a. Modify the configuration file defaults.ini
vi /data/prometheus/grafana/conf/defaults.ini
http_port = 3000
root_url = %(protocol)s://%(domain)s:%(http_port)s/grafana/
b. Register grafana as a system service
[root@node64 conf]# cat /usr/lib/systemd/system/grafana-server.service
[Unit]
Description=Grafana
After=network.target
[Service]
Type=notify
ExecStart=/data/prometheus/grafana/bin/grafana-server -homepath /data/prometheus/grafana
Restart=on-failure
[Install]
WantedBy=multi-user.target
c. Start and view the grafana service
systemctl enable grafana-server
systemctl start grafana-server
systemctl status grafana-server
d. nginx forwards the grafana service
location /grafana/ {
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header Host $http_host;
proxy_set_header X-Nginx-Proxy true;
proxy_pass http://192.168.0.91:3000/;
proxy_redirect off;
proxy_buffering off;
proxy_read_timeout 90;
proxy_send_timeout 90;
}
e. Access grafana from a browser and configure the data source
admin/Iampwd
Data source URL: http://192.168.0.91:9090/prometheus
6. Monitor machine hardware resources
Prometheus uses the node_exporter plug-in to monitor machine hardware resources; prometheus pulls the required data over the network from each machine where the node_exporter service is installed.
1. Install node_exporter on the machine that needs to be monitored
2. Register node_exporter as a system service
[root@node64 conf]# cat /usr/lib/systemd/system/node_exporter.service
[Unit]
Description=node_exporter
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=root
ExecStart=/data/prometheus/node_exporter/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target
3. Start and view the node_exporter service
systemctl enable node_exporter
systemctl start node_exporter
systemctl status node_exporter
4. Modify prometheus.yml and restart the prometheus service
scrape_configs:
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['192.168.0.91:9100', '192.168.0.92:9100', …]
5. Visit prometheus to view the monitoring status
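Besides the web UI, the targets can be confirmed from the CLI through the query HTTP API (the /prometheus path prefix matches the --web.external-url flag set earlier; a returned value of 1 for the `up` metric means the target is healthy). This just prints the URL to query; URL-encode the braces and quotes when passing it to curl:

```shell
# Build the query-API URL for the "up" metric of the node_exporter job:
query='up{job="node_exporter"}'
echo "http://192.168.0.91:9090/prometheus/api/v1/query?query=${query}"
# → http://192.168.0.91:9090/prometheus/api/v1/query?query=up{job="node_exporter"}
```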
6. Import grafana monitoring dashboards (grafana.com dashboard IDs)
node_exporter: 8919
mysql_exporter: 11323
mysql overview: 7362
nginx vts: 2949
redis: 11835
7. Monitor basic services
1. Monitor NGINX
The nginx-module-vts module exposes nginx metric data, and Prometheus collects it through the nginx-vts-exporter component.
a. Install the nginx-module-vts module on the nginx server
- https://github.com/vozlt/nginx-module-vts - (choose zip package to download)
- Compile and install nginx
./configure --prefix=/data/nginx --with-http_gzip_static_module --with-http_stub_status_module --with-http_ssl_module --with-pcre --with-file-aio --with-http_realip_module --add-module=/data/nginx-module-vts
make && make install
- Modify the nginx.conf file
http {
    vhost_traffic_status_zone;
    vhost_traffic_status_filter_by_host on;
    server {
        ...
        location /status {
            vhost_traffic_status_display;
            vhost_traffic_status_display_format html;
        }
    }
}
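The /status location above serves the same statistics in several formats; the exporter scrapes the JSON one. A quick sketch of the URLs involved (the host is this guide's nginx server, and the format list assumes the formats documented by nginx-module-vts):

```shell
# Print the vts status URLs for each display format:
for fmt in html json jsonp; do
  echo "http://192.168.0.71/status/format/${fmt}"
done
```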
b. Download and install the nginx-vts-exporter plug-in on both the nginx server and the prometheus server
wget
c. Register nginx-vts-exporter as a system service on the nginx server (192.168.0.71)
cat /etc/systemd/system/nginx-vts-exporter.service
[Unit]
Description=nginx_exporter
After=network.target
[Service]
Type=simple
User=root
ExecStart=/data/nginx-vts-exporter/nginx-vts-exporter -nginx.scrape_uri=https://<public IP>:9091/status/format/json
Restart=on-failure
[Install]
WantedBy=multi-user.target
d. Register nginx-vts-exporter as a system service on the prometheus server (192.168.0.91)
cat /etc/systemd/system/nginx-vts-exporter.service
[Unit]
Description=nginx_exporter
After=network.target
[Service]
Type=simple
User=root
ExecStart=/data/prometheus/nginx-vts-exporter/nginx-vts-exporter -nginx.scrape_uri=https://192.168.0.71:9091/status/format/json
Restart=on-failure
[Install]
WantedBy=multi-user.target
e. Start and view the nginx-vts-exporter service on the nginx server and prometheus server
systemctl enable nginx-vts-exporter
systemctl start nginx-vts-exporter
systemctl status nginx-vts-exporter
f. Modify the configuration file prometheus.yml on the prometheus server and restart the prometheus service
  - job_name: 'nginx'
    static_configs:
      - targets: ['192.168.0.91:9913']
g. View the nginx page monitoring: https://<public IP>:9091/status
Import the nginx monitoring dashboard (nginx-vts-exporter, ID 2949) in grafana and view it:
https://***.com:9091/grafana/d/5-RKCVxGk/nginx-vts-stats?orgId=1
2. Monitor MySQL
Prometheus collects data related to MySQL master and slave servers through the mysqld_exporter component.
1) After installing mysql with an automated script, register the mysql service as a system service and enable it at boot
[root@centos7-min4 nginx]# cp /opt/mysql57/support-files/mysql.server /etc/rc.d/init.d/mysqld
chmod +x /etc/init.d/mysqld
chkconfig --add mysqld
chkconfig --list
# systemctl start mysqld
# systemctl status mysqld
[mysql@centos7-min4 nginx]$ mysql -uroot -p   # password: 123456
mysql> select version();
+------------+
| version() |
+------------+
| 5.7.24-log |
+------------+
2) Install the mysqld_exporter component on the prometheus server
Prometheus monitors the mysql master and slave servers.
- Log in to mysql, create an account for the exporter, and grant privileges:
create user 'exporter'@'192.168.0.%' identified by 'Abc123';
grant process, replication client, select on *.* to 'exporter'@'192.168.0.%';
flush privileges;
- Install the mysqld_exporter service on the Prometheus server, monitoring both the mysql master and the slave:
ls -al /data/prometheus/mysqld_exporter/
.my-master.cnf
.my-slave.cnf
[root@node64 mysqld_exporter]# cat .my-master.cnf
[client]
user=exporter
password=Abc123
host=192.168.0.92
port=3306
[root@node64 mysqld_exporter]# cat .my-slave.cnf
[client]
user=exporter
password=Abc123
host=192.168.0.93
port=3306
- Start the mysqld_exporter service
Start one exporter process each for the mysql master and the slave:
For the mysql master:
/data/prometheus/mysqld_exporter/mysqld_exporter --web.listen-address=192.168.0.91:9104 --config.my-cnf=/data/prometheus/mysqld_exporter/.my-master.cnf --collect.auto_increment.columns --collect.binlog_size --collect.global_status --collect.engine_innodb_status --collect.global_variables --collect.info_schema.innodb_metrics --collect.info_schema.innodb_tablespaces --collect.info_schema.innodb_cmp --collect.info_schema.innodb_cmpmem --collect.info_schema.processlist --collect.info_schema.query_response_time --collect.info_schema.tables --collect.info_schema.tablestats --collect.info_schema.userstats --collect.perf_schema.eventswaits --collect.perf_schema.file_events --collect.perf_schema.indexiowaits --collect.perf_schema.tableiowaits --collect.perf_schema.tablelocks
For the mysql slave:
/data/prometheus/mysqld_exporter/mysqld_exporter --web.listen-address=192.168.0.91:9105 --config.my-cnf=/data/prometheus/mysqld_exporter/.my-slave.cnf
(Note: keep the other collector flags consistent with the master startup command above.)
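The commands above run the exporters in the foreground. To match the pattern used for the other exporters in this guide, a hypothetical systemd unit for the master-side exporter could look like the following (the slave unit would differ only in the listen port and .cnf file; the extra collector flags from the command above are omitted here for brevity):

```ini
# /usr/lib/systemd/system/mysqld_exporter-master.service (sketch)
[Unit]
Description=mysqld_exporter-master
After=network.target
[Service]
Type=simple
User=root
ExecStart=/data/prometheus/mysqld_exporter/mysqld_exporter --web.listen-address=192.168.0.91:9104 --config.my-cnf=/data/prometheus/mysqld_exporter/.my-master.cnf
Restart=on-failure
[Install]
WantedBy=multi-user.target
```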
3) Modify the prometheus configuration file information and restart prometheus
prometheus.yml
  - job_name: 'mysql_exporter'
    static_configs:
      # - targets: ['192.168.0.92:9104','192.168.0.93:9104']
      - targets:
          - 192.168.0.91:9104   # port exposed by mysqld_exporter (master)
        labels:
          instance: master:3306 # instance alias displayed by grafana
      - targets:
          - 192.168.0.91:9105   # port exposed by mysqld_exporter (slave)
        labels:
          instance: slave:3306  # instance alias displayed by grafana
4) View the mysql data of the prometheus and grafana panels, and import the mysql monitoring panel
mysql_exporter 11323
mysql overview 7362
PS: mysql synchronization fault handling: Slave_SQL_Running: No
Analysis: Causes of mysql data synchronization failure
- The program may have performed a write operation on the slave
- It may be caused by the transaction rollback after the slave machine is restarted
Solution: first stop the slave service, run show master status on the master server, re-point the slave using the master's File and Position values, then start the slave service and check the synchronization status.
On the master server:
mysql> show master status;
On the slave server:
mysql> stop slave;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql>
mysql> change master to master_host='192.168.0.92',
-> master_user='repl',
-> master_password='123456',
-> master_log_file='mysql-bin-T-prod-3306.000005',
-> master_log_pos=653020;
Query OK, 0 rows affected, 2 warnings (0.05 sec)
mysql> start slave;
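After "start slave", both replication threads should report Yes in "show slave status\G". The sketch below mimics that check against an inlined sample of the relevant output lines:

```shell
# Sample of the two lines to look for in "show slave status\G":
status='Slave_IO_Running: Yes
Slave_SQL_Running: Yes'
# Count how many of the two threads report Yes (healthy replication → 2):
echo "$status" | grep -cE 'Slave_(IO|SQL)_Running: Yes'
# → 2
```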
3. Monitor Redis
Use the redis_exporter component to monitor the redis cluster with three masters and three slaves.
1) Use automated scripts to install redis three-master and three-slave clusters
2) Download and install the redis_exporter service on the prometheus server
wget https://github.com/oliver006/redis_exporter/releases/download/v1.3.5/redis_exporter-v1.3.5.linux-amd64.tar.gz
3) Point redis_exporter at one of the servers in the redis cluster to monitor the entire cluster
cd /data/prometheus/redis_exporter
./redis_exporter -redis.addr 192.168.0.93:7000 -redis.password 'zxcvb123' &
4) Modify the prometheus configuration file information and restart prometheus
  - job_name: 'redis_exporter_targets'
    static_configs:
      - targets:
          - redis://192.168.0.3:7000
          - redis://192.168.0.2:7003
          - redis://192.168.0.72:7002
          - redis://192.168.0.35:7001
          - redis://192.168.0.14:7004
          - redis://192.168.0.13:7005
    metrics_path: /scrape
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.0.91:9121
  - job_name: 'redis_exporter'
    static_configs:
      - targets:
          - 192.168.0.91:9121
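The relabel rules above rewrite every redis:// address into the "target" parameter of a /scrape request against the single exporter instance, so one exporter can cover the whole cluster. For the first target, Prometheus effectively fetches:

```shell
# Reconstruct the scrape URL the relabel rules produce for one target:
target='redis://192.168.0.3:7000'
echo "http://192.168.0.91:9121/scrape?target=${target}"
# → http://192.168.0.91:9121/scrape?target=redis://192.168.0.3:7000
```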
5) View prometheus and grafana panel data, import panel redis 11835
8. Monitor applications
Prometheus monitors applications using the process-exporter component
1) Install the process-exporter component on the application server that needs to be monitored
wget
https://github.com/ncabatoff/process-exporter/releases/download/v0.5.0/process-exporter-0.5.0.linux-amd64.tar.gz
2) Configure application monitoring information
process-conf.yml
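The guide does not include the contents of process-conf.yml; a minimal sketch following process-exporter's documented configuration format, where the name template groups processes by command name, might look like this (the match pattern '.+' monitors every process and is only a placeholder; narrow it to the application's process names in practice):

```yaml
# process-conf.yml (sketch)
process_names:
  - name: "{{.Comm}}"   # group metrics by the process command name
    cmdline:
    - '.+'              # placeholder: match all processes
```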
3) Start the application monitoring service and specify the configuration file
./process-exporter -config.path process-conf.yml &
4) Modify prometheus configuration information and restart prometheus
[root@node64 prometheus]# vi prometheus.yml
  - job_name: process
    static_configs:
      - targets: ['192.168.0.35:9256', '192.168.0.13:9256', …]
5) View prometheus and grafana panel data
The dashboard corresponding to process-exporter is: Named processes | Grafana Labs
9. Monitor business interface data
Configure grafana data display according to the business monitoring interface data provided by the development.
https://***.com:9092/api
- Notes:
1. Grafana does not store Prometheus data; it only queries Prometheus and renders the UI, so disk usage has to be managed on the Prometheus side. Prometheus retains data for 15 days by default, which can be tuned with the --storage.tsdb.retention flag (set to 15d in the service unit above).
2. Prometheus sets expired series to NaN, and the sum() function does not handle NaN, so the metric query needs a filter:
sum(st_invoke_count{app_id=~'$appid',road_type='1'}>0)
3. Grafana global variables
4. Grafana menu cascade
1. Set the alarm mode
> Prometheus
Prometheus implements alerts through the alertmanager component. Alertmanager receives the alerts sent by prometheus, processes them (grouping, inhibition, silencing), and sends them to the specified receivers.
prometheus--->trigger threshold--->exceeding duration--->alertmanager--->group|suppress|silent--->media type--->mail, DingTalk, WeChat, etc.
1) Install alertmanager
2) Modify the alertmanager configuration file information alertmanager.yml
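The guide elides the alertmanager.yml contents; a minimal email-based sketch is shown below. Every host, address, and password is a placeholder, and the template path matches the file created in step 4:

```yaml
# alertmanager.yml (sketch; all credentials and addresses are placeholders)
global:
  smtp_smarthost: 'smtp.example.com:465'
  smtp_from: 'alert@example.com'
  smtp_auth_username: 'alert@example.com'
  smtp_auth_password: 'password'
  smtp_require_tls: false
templates:
  - '/opt/alertmanager-0.21.0/template/*.tmpl'
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 1m
  repeat_interval: 1h
  receiver: 'email'
receivers:
  - name: 'email'
    email_configs:
      - to: 'ops@example.com'
        html: '{{ template "test.html" . }}'
```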
3) Open the smtp service
4) Configure the alarm notification template
[root@centos7-min4 alertmanager-0.21.0]# cat template/test.tmpl
{{ define "test.html" }}
<table border="5">
<tr>
<td>Alarm item</td>
<td>Instance</td>
<td>Alarm threshold</td>
<td>Start time</td>
</tr>
{{ range $i, $alert := .Alerts }}
<tr>
<td>{{ index $alert.Labels "alertname" }}</td>
<td>{{ index $alert.Labels "instance" }}</td>
<td>{{ index $alert.Annotations "value" }}</td>
<td>{{ $alert.StartsAt }}</td>
</tr>
{{ end }}
</table>
{{ end }}
5) Start the alertmanager service
(1) Specify the configuration file to start
[root@centos7-min4 alertmanager-0.21.0]# ./alertmanager --config.file=alertmanager.yml &
(2) Configured as a system service startup
[root@centos7-min4 alertmanager-0.21.0]# cat /usr/lib/systemd/system/alertmanager.service
[Unit]
Description=https://prometheus.io
[Service]
Restart=on-failure
ExecStart=/opt/alertmanager-0.21.0/alertmanager --config.file=/opt/alertmanager-0.21.0/alertmanager.yml
[Install]
WantedBy=multi-user.target
systemctl enable alertmanager
systemctl start alertmanager
systemctl status alertmanager
6) Modify the prometheus configuration file and restart prometheus
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 192.168.0.91:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
- "/data/prometheus/rules/node.yml"
- "/data/prometheus/rules/redis.yml"
- "/data/prometheus/rules/mysql.yml"
- "/data/prometheus/rules/nginx.yml"
- "/data/prometheus/rules/service-api.yml"
> Grafana
2. Set alarm rules
①/data/prometheus/rules/node.yml
groups:
- name: NodeProcess
  rules:
  - alert: NodeStatus
    expr: up == 0
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: the server is down"
      description: "{{ $labels.instance }}: the target has been unreachable for more than 1 minute"
  - alert: NodeFilesystemUsage
    expr: 100 - (node_filesystem_free_bytes{fstype=~"ext4|xfs"} / node_filesystem_size_bytes{fstype=~"ext4|xfs"} * 100) > 80
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: {{ $labels.mountpoint }} partition usage is too high"
      description: "{{ $labels.instance }}: {{ $labels.mountpoint }} partition usage is greater than 80% (current value: {{ $value }})"
  - alert: NodeMemoryUsage
    expr: 100 - (node_memory_MemFree_bytes + node_memory_Cached_bytes + node_memory_Buffers_bytes) / node_memory_MemTotal_bytes * 100 > 80
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: memory usage is too high"
      description: "{{ $labels.instance }}: memory usage is greater than 80% (current value: {{ $value }})"
  - alert: NodeCPUUsage
    expr: 100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) * 100) > 80
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: CPU usage is too high"
      description: "{{ $labels.instance }}: CPU usage is greater than 80% (current value: {{ $value }})"
  - alert: LoadCPU
    expr: node_load5 > 5
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: load is too high"
      description: "{{ $labels.instance }}: the 5-minute load average exceeds 5 (current value: {{ $value }})"
  - alert: DiskIORead
    expr: irate(node_disk_read_bytes_total{device="sda"}[1m]) > 30000000
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: I/O read load is too high"
      description: "{{ $labels.instance }}: the I/O read rate has exceeded 30MB/s (current value: {{ $value }})"
  - alert: DiskIOWrite
    expr: irate(node_disk_written_bytes_total{device="sda"}[1m]) > 30000000
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: I/O write load is too high"
      description: "{{ $labels.instance }}: the I/O write rate has exceeded 30MB/s (current value: {{ $value }})"
  - alert: IncomingNetworkBandwidth
    expr: ((sum(rate(node_network_receive_bytes_total{device!~'tap.*|veth.*|br.*|docker.*|virbr.*|lo.*'}[5m])) by (instance)) / 1024) > 18432
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: incoming network bandwidth is too high!"
      description: "{{ $labels.instance }}: incoming network bandwidth has been above 18MB/s for 5 minutes. RX bandwidth usage {{ $value }}"
  - alert: OutgoingNetworkBandwidth
    expr: ((sum(rate(node_network_transmit_bytes_total{device!~'tap.*|veth.*|br.*|docker.*|virbr.*|lo.*'}[5m])) by (instance)) / 1024) > 18432
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: outgoing network bandwidth is too high!"
      description: "{{ $labels.instance }}: outgoing network bandwidth has been above 18MB/s for 5 minutes. TX bandwidth usage {{ $value }}"
  - alert: NetworkConnections
    expr: node_sockstat_TCP_inuse > 240
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: the number of TCP connections is too high!"
      description: "{{ $labels.instance }}: current connection count {{ $value }}"
② /data/prometheus/rules/redis.yml
groups:
- name: Redis
  rules:
  - alert: RedisDown
    expr: redis_up == 0
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Redis down (instance {{ $labels.instance }})"
      description: "Redis cluster node failure\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
  - alert: OutOfMemory
    expr: redis_memory_used_bytes / redis_total_system_memory_bytes * 100 > 90
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Out of memory (instance {{ $labels.instance }})"
      description: "Redis is running out of memory (> 90%)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
  - alert: ReplicationBroken
    expr: delta(redis_connected_slaves[1m]) < 0
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Replication broken (instance {{ $labels.instance }})"
      description: "Redis instance lost a slave\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
  - alert: TooManyConnections
    expr: redis_connected_clients > 1000
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Too many connections (instance {{ $labels.instance }})"
      description: "Redis instance has too many connections\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
  - alert: RejectedConnections
    expr: increase(redis_rejected_connections_total[1m]) > 0
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Rejected connections (instance {{ $labels.instance }})"
      description: "Some connections to Redis have been rejected\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
  - alert: AofSaveStatus
    expr: redis_aof_last_bgrewrite_status < 1
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Missing backup (instance {{ $labels.instance }})"
      description: "Redis AOF persistence failed\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
③/data/prometheus/rules/mysql.yml
groups:
- name: MySQL
  rules:
  - alert: MySQLStatus
    expr: mysql_up == 0
    for: 5s
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: MySQL has stopped!!!"
      description: "Detects the running status of the MySQL database"
  - alert: MySQLSlaveIOThreadStatus
    expr: mysql_slave_status_slave_io_running != 1
    for: 5s
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: MySQL slave IO thread has stopped!!!"
      description: "Detects the running status of the MySQL replication IO thread"
  - alert: MySQLSlaveSQLThreadStatus
    expr: mysql_slave_status_slave_sql_running != 1
    for: 5s
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: MySQL slave SQL thread has stopped!!!"
      description: "Detects the running status of the MySQL replication SQL thread"
  - alert: MySQLSlaveDelayStatus
    expr: mysql_slave_status_sql_delay > 30
    for: 5s
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: MySQL slave delay is more than 30s!!!"
      description: "Detects the MySQL replication delay status"
  - alert: Mysql_Too_Many_Connections
    expr: rate(mysql_global_status_threads_connected[5m]) > 200
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: too many connections"
      description: "{{ $labels.instance }}: too many connections, please deal with it (current value: {{ $value }})"
④/data/prometheus/rules/nginx.yml
groups:
- name: nginx
  rules:
  - alert: NginxStatus
    expr: up{instance="192.168.0.91:9913",job="nginx"} == 0
    for: 5s
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.instance }}: Nginx has stopped!!!"
      description: "Detects abnormal running status of Nginx"