Linux cloud computing architecture: building a Prometheus + Grafana cloud platform monitoring system

1. Introduction to prometheus

Prometheus is an open-source monitoring system that combines metrics collection, alerting, and a time series database. It is commonly used with container platforms such as Docker, Mesos, and Kubernetes.

Monitoring principle: Prometheus periodically pulls the status of each monitored component over HTTP. The monitored component only needs to expose a corresponding HTTP endpoint; no SDK or other integration work is required.

The HTTP interface through which a monitored component exposes its status information is called an exporter.
Exporters already exist for most of the components commonly used by Internet companies and can be used directly.
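
To make the exporter idea concrete, here is a minimal sketch of a hand-written exporter in Python, using only the standard library. It is an illustration, not how node_exporter or mysqld_exporter are implemented. The metric name `demo_load1` and port 8000 are assumptions made up for this example; `os.getloadavg()` is only available on Unix-like systems.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(load1: float) -> str:
    """Render a single gauge in the Prometheus text exposition format."""
    return (
        "# HELP demo_load1 1-minute load average (hypothetical demo metric).\n"
        "# TYPE demo_load1 gauge\n"
        f"demo_load1 {load1}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve the current metrics on /metrics, the path Prometheus scrapes by default."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(os.getloadavg()[0]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, answering scrape requests (port 8000 is an arbitrary choice)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint; a Prometheus job whose target is `<host>:8000` would then scrape it like any other exporter.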

Compared with the monitoring interfaces of Nagios and Zabbix, Prometheus dashboards look better once Grafana is used for visualization (see section 4).

2. Prometheus architecture principle

This section covers the overall architecture of Prometheus and some of its ecosystem components.

2.1 Prometheus architecture composition

[Figure: Prometheus architecture diagram]

prometheus server: the core component of Prometheus, responsible for collecting, storing, and querying monitoring data. Monitoring targets can be managed through static configuration or dynamically through service discovery, and the server scrapes data from these targets. The Prometheus server is also a time series database: it stores the collected samples as time series on local disk and provides its own query language, PromQL, for querying and analyzing the data. A Prometheus server can also scrape data from other Prometheus server instances (federation).

exporter: exposes the monitoring data of a monitored component to the Prometheus server as an HTTP service. The Prometheus server obtains the data by accessing the endpoint provided by the exporter.
Exporters generally fall into two categories:
① Direct collection: the component supports Prometheus natively and exposes its own metrics endpoint.
② Indirect collection: the component does not support Prometheus natively, so a separate collection program written for that component gathers its data and exposes it to Prometheus.

alertmanager: the Prometheus server supports alert rules written in PromQL. When the condition defined by a rule is met, the server generates an alert and hands it to alertmanager, which processes it and sends the notification. Common notification methods are email and webhook. [Similar to triggers in Zabbix]

Pushgateway: Prometheus collects data by having the Prometheus server pull it from exporters, which requires that the server can reach each exporter. If direct communication between the two is not possible, pushgateway can act as a relay: the monitored component actively pushes its data to pushgateway, and the Prometheus server then scrapes that data from pushgateway.
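
As a rough sketch of the push side, the following Python builds the HTTP request a monitored component could send to pushgateway. It assumes a pushgateway listening on its default port 9091; the gateway address, job name, and metric name used in the usage note are hypothetical.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(gateway: str, job: str, metrics: dict) -> urllib.request.Request:
    """Build (but do not send) a request that pushes simple metrics to pushgateway."""
    # One "name value" line per metric; the body must end with a newline.
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    url = f"{gateway.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    # PUT replaces everything previously pushed under this job grouping.
    return urllib.request.Request(url, data=body.encode("utf-8"), method="PUT")


def push(gateway: str, job: str, metrics: dict) -> int:
    """Send the request and return the HTTP status code."""
    request = build_push_request(gateway, job, metrics)
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

For example, `push("http://192.168.8.177:9091", "backup", {"backup_last_success_timestamp_seconds": 1597673845})` would publish one value. The Prometheus server still needs a scrape job that targets the pushgateway itself, usually with `honor_labels: true`.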

2.2 Prometheus architecture workflow

  1. The Prometheus server gets metrics
    ① pulls metrics from the configured jobs and exporters
    ② scrapes the metrics that components have pushed to pushgateway
    ③ pulls metrics from other Prometheus servers

  2. The Prometheus server processes metrics
    ① stores the metrics locally
    ② runs the defined alerts.rules
    ③ if a rule is satisfied, pushes an alert to alertmanager; if not, only records the new time series

  3. alertmanager handles alerts
    ① processes the received alerts according to its configuration file and sends out notifications

  4. The graphical interface visualizes the data.
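
Step 3 can be pictured from the receiving end. When alertmanager is configured with a webhook receiver, it POSTs a JSON document containing an `alerts` array to the configured URL. The sketch below, which assumes that payload shape and uses a hypothetical alert named `InstanceDown` in its example, turns such a payload into one readable line per alert.

```python
import json


def summarize_webhook(payload: str) -> list:
    """Return one human-readable line per alert in a webhook notification body."""
    data = json.loads(payload)
    lines = []
    for alert in data.get("alerts", []):
        labels = alert.get("labels", {})
        summary = alert.get("annotations", {}).get("summary", "")
        lines.append(
            f"[{alert.get('status', 'unknown')}] "
            f"{labels.get('alertname', '?')} on {labels.get('instance', '?')}: {summary}"
        )
    return lines
```

A small HTTP service that receives the POST could pass the request body to `summarize_webhook` and forward the lines to a chat tool or ticket system.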

3. Deploy the prometheus cloud platform monitoring system

[Figure: topology diagram of the Prometheus cloud platform monitoring system]

3.1 Download the installation package

# Download the following 6 packages; the downloads may fail from networks inside mainland China.
# Download page: https://prometheus.io/download/
# prometheus server and alertmanager
https://github.com/prometheus/prometheus/releases/download/v2.20.1/prometheus-2.20.1.linux-amd64.tar.gz

https://github.com/prometheus/alertmanager/releases/download/v0.21.0/alertmanager-0.21.0.linux-amd64.tar.gz

# mysqld_exporter, node_exporter, pushgateway
https://github.com/prometheus/mysqld_exporter/releases/download/v0.12.1/mysqld_exporter-0.12.1.linux-amd64.tar.gz

https://github.com/prometheus/node_exporter/releases/download/v1.0.1/node_exporter-1.0.1.linux-amd64.tar.gz

https://github.com/prometheus/pushgateway/releases/download/v1.2.0/pushgateway-1.2.0.linux-amd64.tar.gz

# grafana
https://dl.grafana.com/oss/release/grafana-7.1.3-1.x86_64.rpm

# This mirror is faster
https://mirrors.huaweicloud.com/grafana/7.1.3/grafana-7.1.3.linux-amd64.tar.gz

[root@master prometheus]# pwd
/root/prometheus
[root@master prometheus]# ls
alertmanager-0.21.0.linux-amd64.tar.gz     node_exporter-1.0.1.linux-amd64.tar.gz
grafana-7.1.3.linux-amd64.tar.gz           prometheus-2.20.1.linux-amd64.tar.gz
mysqld_exporter-0.12.1.linux-amd64.tar.gz  pushgateway-1.2.0.linux-amd64.tar.gz

3.2 deploy prometheus server

# Install the EPEL repository
[root@master ~]# yum install epel-release -y

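# Install the Go environment
# Go (golang) is a statically and strongly typed, compiled, concurrent, garbage-collected programming language developed by Google.
# Note: the release tarballs ship prebuilt binaries, so Go is only needed when building from source.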
[root@master ~]# yum install go -y
[root@master ~]# go version
go version go1.13.14 linux/amd64

# Extract into /usr/local/ and it is ready to use
[root@master ~]# tar xzf /root/prometheus/prometheus-2.20.1.linux-amd64.tar.gz -C /usr/local/
[root@master ~]# ll -d /usr/local/prometheus-2.20.1.linux-amd64/
drwxr-xr-x. 4 3434 3434 144 Aug   6 03:42 /usr/local/prometheus-2.20.1.linux-amd64/

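# Edit the prometheus configuration file; append the following at the end (under scrape_configs).
# YAML is indentation-sensitive
vim prometheus.yml
  - job_name: system-status              # job that monitors system status
    static_configs:
      - targets: ['192.168.8.178:9100']  # IP and port of the monitored host
        labels:
          instance: server-status
  - job_name: mysql-status               # job that monitors mysql status
    static_configs:
      - targets: ['192.168.8.178:9104']  # IP and port of the monitored host
        labels:
          instance: server-mysql         # instance name, used in grafana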
# Start prometheus and run it in the background
[root@master ~]# cd /usr/local/prometheus-2.20.1.linux-amd64/
[root@master prometheus-2.20.1.linux-amd64]# ./prometheus --config.file=prometheus.yml &

===========================================
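&              run a command in the background
jobs           list the current background jobs
fg <job no.>   bring a background job to the foreground
ctrl+z         suspend the foreground process and move it to the background (stopped)
bg             resume a stopped background job
ctrl+c         terminate the foreground process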
===========================================

# Without &, the process runs in the foreground.
# The following line means it started successfully
level=info ts=2020-08-17T14:17:25.259Z caller=main.go:652 msg="Server is ready to receive web requests."
# Pressing ctrl+c exits, i.e. shuts the server down.
level=info ts=2020-08-17T14:21:12.348Z caller=main.go:767 msg="See you next time!"

# The prometheus server listens on port 9090, so the firewall must allow port 9090
[root@master ~]# firewall-cmd --permanent --zone=public --add-port=9090/tcp
success
[root@master ~]# firewall-cmd --reload 
success

# Open the prometheus web interface
http://192.168.8.177:9090/

Because the node and mysql exporters have not been configured yet, no data can be obtained from those targets.

[root@master prometheus-2.20.1.linux-amd64]# cat prometheus.yml 
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['localhost:9090']
    
  - job_name: system-status              # job that monitors system status
    static_configs:
      - targets: ['192.168.8.178:9100']  # IP and port of the monitored host
        labels:
          instance: server-status
  - job_name: mysql-status               # job that monitors mysql status
    static_configs:
      - targets: ['192.168.8.178:9104']  # IP and port of the monitored host
        labels:
          instance: server-mysql         # instance name, used in grafana

# To monitor multiple hosts, list all of them in targets
['192.168.8.100:9100','192.168.8.101:9100']
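
When the host list grows, editing prometheus.yml by hand becomes tedious. One option is file-based service discovery: a job uses `file_sd_configs` with a `files` list instead of `static_configs`, and Prometheus re-reads the referenced JSON or YAML files when they change. The sketch below generates such a JSON file; the file name `node_targets.json` in the usage note and the helper names are assumptions for illustration.

```python
import json


def file_sd_groups(hosts, port, labels=None):
    """Build the list-of-target-groups structure used by file-based service discovery."""
    group = {"targets": [f"{host}:{port}" for host in hosts]}
    if labels:
        group["labels"] = dict(labels)
    return [group]


def write_targets(path, hosts, port, labels=None):
    """Write the target groups to a JSON file (the name should end in .json)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(file_sd_groups(hosts, port, labels), handle, indent=2)
```

For example, `write_targets("node_targets.json", ["192.168.8.100", "192.168.8.101"], 9100)` produces a file equivalent to the two-target list above.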

3.3 deploy exporters

Common exporters:
node_exporter: monitors system performance and operating status
mysqld_exporter: monitors mysql database services
snmp_exporter: monitors network devices

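# Configure the following 2 exporters on the monitored host. They are the endpoints that collect the corresponding status on that host.
# Copy the tar packages over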
[root@master prometheus]# scp /root/prometheus/node_exporter-1.0.1.linux-amd64.tar.gz 192.168.8.178:/opt/
[root@master prometheus]# scp /root/prometheus/mysqld_exporter-0.12.1.linux-amd64.tar.gz 192.168.8.178:/opt/

1. Configure node_exporter
[root@client ~]# tar xzf /opt/node_exporter-1.0.1.linux-amd64.tar.gz -C /usr/local/
[root@client ~]# /usr/local/node_exporter-1.0.1.linux-amd64/node_exporter &
[root@client ~]# netstat -antup | grep 9100
tcp6       0      0 :::9100                 :::*                    LISTEN      10098/node_exporter 
tcp6       0      0 192.168.8.178:9100      192.168.8.177:52894     ESTABLISHED 10098/node_exporter 

2. Configure mysqld_exporter
# If you see an error such as: Access denied for user 'zabbix'@'localhost' (using password: YES)
# you probably need to reset the password of that user.
[root@client ~]# mysql -uroot -p
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 5.5.60-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use mysql
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [mysql]> grant replication client,process on *.* to 'mysql_monitor'@'localhost' identified by '123456';
Query OK, 0 rows affected (0.05 sec)

MariaDB [mysql]> grant select on *.* to 'mysql_monitor'@'localhost';
Query OK, 0 rows affected (0.00 sec)

MariaDB [mysql]> exit
Bye

# Extract mysqld_exporter
[root@client ~]# tar xzf /opt/mysqld_exporter-0.12.1.linux-amd64.tar.gz -C /usr/local/
# Collect data with the dedicated database user
[root@client ~]# vim /usr/local/mysqld_exporter-0.12.1.linux-amd64/.my.cnf
[root@client ~]# cat /usr/local/mysqld_exporter-0.12.1.linux-amd64/.my.cnf 
[client]
user=mysql_monitor
password=123456
# Start mysqld_exporter; the following output means it started successfully.
[root@client ~]# /usr/local/mysqld_exporter-0.12.1.linux-amd64/mysqld_exporter --config.my-cnf="/usr/local/mysqld_exporter-0.12.1.linux-amd64/.my.cnf" &
[1] 20812
[root@client ~]# INFO[0000] Starting mysqld_exporter (version=0.12.1, branch=HEAD, revision=48667bf7c3b438b5e93b259f3d17b70a7c9aff96)  source="mysqld_exporter.go:257"
INFO[0000] Build context (go=go1.12.7, user=root@0b3e56a7bc0a, date=20190729-12:35:58)  source="mysqld_exporter.go:258"
INFO[0000] Enabled scrapers:                             source="mysqld_exporter.go:269"
INFO[0000]  --collect.global_status                      source="mysqld_exporter.go:273"
INFO[0000]  --collect.global_variables                   source="mysqld_exporter.go:273"
INFO[0000]  --collect.slave_status                       source="mysqld_exporter.go:273"
INFO[0000]  --collect.info_schema.innodb_cmp             source="mysqld_exporter.go:273"
INFO[0000]  --collect.info_schema.innodb_cmpmem          source="mysqld_exporter.go:273"
INFO[0000]  --collect.info_schema.query_response_time    source="mysqld_exporter.go:273"
INFO[0000] Listening on :9104                            source="mysqld_exporter.go:283"

The Prometheus server can now obtain system status information and mysql database information from the remote host, and all targets are reported as up.

# Check the ports; both are listening normally
[root@client ~]# netstat -antup | grep 9100
tcp6       0      0 :::9100                 :::*                    LISTEN      10098/node_exporter 
tcp6       0      0 192.168.8.178:9100      192.168.8.177:50488     ESTABLISHED 10098/node_exporter 
[root@client ~]# netstat -antup | grep 9104
tcp6       0      0 :::9104                 :::*                    LISTEN      20812/mysqld_export 
tcp6       0      0 192.168.8.178:9104      192.168.8.177:39890     ESTABLISHED 20812/mysqld_export 
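
Beyond checking that the ports are open, the exporter output itself can be inspected. The following sketch fetches a `/metrics` page and parses the unlabelled samples; it deliberately skips comment lines and labelled series, so it is a simplified reader rather than a full parser of the exposition format. The function names are made up for this example.

```python
import urllib.request


def parse_samples(text: str) -> dict:
    """Parse unlabelled 'name value' lines from a /metrics page into a dict."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "{" in line:
            continue  # skip blanks, HELP/TYPE comments, and labelled series
        parts = line.split()
        if len(parts) >= 2:
            try:
                samples[parts[0]] = float(parts[1])
            except ValueError:
                pass  # ignore lines whose value is not a number
    return samples


def fetch_samples(url: str) -> dict:
    """Download a metrics page and parse it."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_samples(response.read().decode("utf-8"))
```

With the setup above, `fetch_samples("http://192.168.8.178:9104/metrics").get("mysql_up")` should be `1.0` when mysqld_exporter can reach the database, and `node_load1` from port 9100 gives the 1-minute load average.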

At this point, the Prometheus cloud platform monitoring system has been deployed.

4. Use grafana to beautify prometheus

If monitoring meant staring at the interface above, it would be hard to claim that Prometheus looks better than Zabbix. This is where Grafana comes in: it provides the dashboards on top of the Prometheus data.

4.1 Install grafana

[root@master ~]# ll /root/prometheus/grafana-7.1.3.linux-amd64.tar.gz 
-rw-r--r--. 1 root root 52611015 Aug   6 12:55 /root/prometheus/grafana-7.1.3.linux-amd64.tar.gz
[root@master ~]# tar xzf /root/prometheus/grafana-7.1.3.linux-amd64.tar.gz -C /usr/local/
[root@master ~]# cd /usr/local/grafana-7.1.3/
# Run grafana in the background
[root@master grafana-7.1.3]# ./bin/grafana-server &
# Two programs are now running in the background.
[root@master grafana-7.1.3]# jobs
[1]-  Running                 ./prometheus --config.file=prometheus.yml &  (wd: /usr/local/prometheus-2.20.1.linux-amd64)
[2]+  Running                 ./bin/grafana-server &

# Open port 3000 in the firewall
[root@master grafana-7.1.3]# firewall-cmd --permanent --zone=public --add-port=3000/tcp
success
[root@master grafana-7.1.3]# firewall-cmd --reload 
success

# If anything is still unclear, refer to:
https://grafana.com/docs/grafana/latest/installation/rpm/

Visit http://192.168.8.177:3000/login to reach the login page.
Initial account and password: admin/admin
After the first login you are asked to change the initial password, and then the Grafana home page is shown.
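
Before any dashboard can show data, Grafana needs Prometheus registered as a data source. This is normally done in the web interface; as a hedged sketch, the same step can be scripted against Grafana's HTTP API with basic authentication. The data source name "Prometheus" and the helper name are choices made for this example, and the password must be whatever you set after the first login.

```python
import base64
import json
import urllib.request


def build_datasource_request(grafana_url, prometheus_url, user, password):
    """Build (but do not send) a POST /api/datasources request for Grafana."""
    payload = {
        "name": "Prometheus",      # display name, chosen for this example
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",         # the Grafana backend queries Prometheus
        "isDefault": True,
    }
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{grafana_url.rstrip('/')}/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen(...)` sends it, for example with `grafana_url="http://192.168.8.177:3000"` and `prometheus_url="http://192.168.8.177:9090"` for the hosts used in this article.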

4.2 Beautify prometheus

Opening the Grafana interface in IE causes compatibility problems, so Google Chrome is used here.
Add dashboards for system status information and mysql database information. Searching for suitable dashboards shows that the IDs of the two dashboards are 8919 and 7362.
First import the dashboard for system status, then import the dashboard for mysql database information.
With that, the monitoring panels for system status and mysql database status are available.

5. Effect picture

5.1 System status

[Screenshots: system status dashboard]

5.2 mysql database status

[Screenshots: mysql database status dashboard]

6. Adjust the dashboard monitoring panel

After importing a dashboard template, do not delete panels right away. First understand what each panel monitors, then remove panels selectively: adding a panel back means writing the query yourself, which is not easy.
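
To give a feel for what writing such a query involves, the sketch below runs a PromQL expression through the Prometheus HTTP API (`/api/v1/query`) and extracts one value per instance. The CPU usage expression is a commonly used pattern based on node_exporter's `node_cpu_seconds_total` metric, not a query taken from the imported dashboards, and the helper names are invented for this example.

```python
import json
import urllib.parse
import urllib.request

# CPU usage in percent per instance, averaged over the last 5 minutes.
CPU_USAGE = '100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100'


def query_url(base_url: str, promql: str) -> str:
    """Build the URL of an instant query against the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def extract_values(response_text: str) -> dict:
    """Map each series' instance label to its value in an instant-query response."""
    data = json.loads(response_text)
    if data.get("status") != "success":
        raise RuntimeError(data.get("error", "query failed"))
    return {
        series["metric"].get("instance", ""): float(series["value"][1])
        for series in data["data"]["result"]
    }


def run_query(base_url: str, promql: str) -> dict:
    """Execute the query and return {instance: value}."""
    with urllib.request.urlopen(query_url(base_url, promql), timeout=5) as response:
        return extract_values(response.read().decode("utf-8"))
```

`run_query("http://192.168.8.177:9090", CPU_USAGE)` would return the current CPU usage per monitored instance; the same expression can be pasted into a Grafana panel's query field.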
That covers the Prometheus monitoring system. For more hands-on monitoring topics, follow the author; further posts will follow.

Origin blog.csdn.net/weixin_36522099/article/details/108066934