How do you monitor a system that is running in production, and how do you know whether it is healthy? Prometheus is a monitoring solution built for exactly this. It lets operations staff and developers check the running state of a system at any time, locate problems quickly, and troubleshoot them. By deploying Prometheus step by step, we can monitor machine resources such as CPU and memory, the running state of applications, and real-time business metrics. Combined with Grafana, the data can be presented in a wide variety of charts. For small and medium-sized teams, this can reduce costs considerably and speed up development.
I. Overview
Getting to Know Prometheus
Prometheus is an open source monitoring and alerting system with a built-in time series database, originally developed at SoundCloud. In other words, it consists of two parts: the monitoring and alerting system, and the built-in time series database (TSDB).
In 2016, Prometheus joined the Cloud Native Computing Foundation (CNCF), a Linux Foundation project that Google helped launch, as its second hosted project after Kubernetes. Prometheus is also very active in the open source community: it has more than 40,000 stars and almost 8K forks on GitHub, and new releases ship frequently.
Prometheus Architecture
As the figure above shows, Prometheus as a whole can be divided into four parts:
Prometheus server
Prometheus Server is the core component. It is responsible for collecting, storing, and querying monitoring data.
Process-exporter (business data source)
Data sources expose their metrics over HTTP, and Prometheus Server pulls (scrapes) them at a regular interval. Jobs that cannot be scraped can instead push their metrics to a Pushgateway, which Prometheus then scrapes.
AlertManager
Alert rules are configured in Prometheus. When a rule's condition is met, Prometheus pushes the alert to AlertManager, which then handles it, for example by sending notifications.
Visual monitoring interface
The collected data can be viewed as graphs in Prometheus's built-in web UI. For richer dashboards, we can either query the data through the HTTP API with a custom client or use Grafana directly.
The architecture of Prometheus is not complicated: it collects data, processes it, visualizes it, and evaluates it to trigger alerts. Its value lies in providing a complete, workable solution with an entire ecosystem around it, which can greatly reduce our development costs.
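The custom API client mentioned above can be as small as a curl wrapper. As a minimal sketch, the helper below sends an instant PromQL query to the /api/v1/query endpoint of Prometheus's HTTP API. The function name prom_query and the PROM_URL variable are illustrative assumptions, not part of Prometheus, and the default address is the server used later in this article.

```shell
# Hypothetical helper: run an instant PromQL query against the HTTP API.
# PROM_URL is an assumed variable name; override it for your environment.
PROM_URL="${PROM_URL:-http://192.168.1.108:9090}"

prom_query() {
    # -G sends the data as URL query parameters (GET request);
    # --data-urlencode escapes braces, quotes, and spaces in the expression.
    curl -sS -G "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Usage (the response is JSON with "status" and "data" fields):
#   prom_query 'up'
```

Keeping the base URL in a variable means the same helper can point at a test or production server without editing the function.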
Building the process monitoring components
The components needed to implement process monitoring with Prometheus are Prometheus, Process-exporter, and Grafana. Because the setup is currently being verified on an intranet, all three components are installed on the same server (CentOS 7.8).
Install Prometheus
First, download the release that matches your system from the official website.
Then upload it to the target server and extract it:
tar -vxf prometheus-2.40.5.linux-amd64.tar.gz
Create the systemd service file. The unit below assumes the extracted directory has been moved to /home/prometheus/prometheus.
vi /usr/lib/systemd/system/prometheus.service
[Unit]
Description=prometheus
After=network.target
[Service]
Type=simple
ExecStart=/home/prometheus/prometheus/prometheus --config.file=/home/prometheus/prometheus/prometheus.yml --storage.tsdb.path=/home/prometheus/prometheus/data --storage.tsdb.retention.time=30d --log.level=info --web.external-url=http://192.168.1.108:9090
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
[Install]
WantedBy=multi-user.target
After saving the file, use systemctl to register, enable, and start the Prometheus service:
systemctl daemon-reload
systemctl enable prometheus
systemctl start prometheus
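Before opening the web interface, a short check can confirm that the server is actually accepting requests. The sketch below polls Prometheus's /-/ready management endpoint until it answers successfully. The function name wait_for_prometheus and its retry defaults are assumptions chosen for illustration.

```shell
# Sketch: poll the readiness endpoint until it answers or we give up.
# wait_for_prometheus is a hypothetical helper name.
wait_for_prometheus() {
    url="$1"             # base URL, e.g. http://192.168.1.108:9090
    attempts="${2:-10}"  # how many times to try
    delay="${3:-2}"      # seconds to wait between tries
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f makes curl exit non-zero on HTTP errors such as 503 (not ready yet)
        if curl -fsS -o /dev/null "${url}/-/ready"; then
            echo "Prometheus is ready after ${i} attempt(s)"
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    echo "Prometheus did not become ready" >&2
    return 1
}

# Usage:
#   wait_for_prometheus http://192.168.1.108:9090
```

If the check keeps failing, systemctl status prometheus and journalctl -u prometheus are the usual places to look for the reason.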
After startup, the Prometheus web interface looks as shown in the figure below.
1. Install process-exporter
Download process-exporter
wget https://github.com/ncabatoff/process-exporter/releases/download/v0.7.10/process-exporter-0.7.10.linux-amd64.tar.gz
Install and deploy process-exporter
tar -xvf process-exporter-0.7.10.linux-amd64.tar.gz
mv process-exporter-0.7.10.linux-amd64 process-exporter
Write a configuration file
cd process-exporter
vi config.yaml
process_names:
  - name: "{{.Matches}}"
    cmdline:
    - 'qtalk_api'
  - name: "{{.Matches}}"
    cmdline:
    - 'qtalk_user'
  - name: "{{.Matches}}"
    cmdline:
    - 'qtalk_auth'
Create the systemd service file
vi /usr/lib/systemd/system/process-exporter.service
[Unit]
Description=process_exporter
After=network.target
[Service]
User=root
Type=simple
ExecStart=/home/prometheus/process-exporter/process-exporter -config.path /home/prometheus/process-exporter/config.yaml
Restart=on-failure
[Install]
WantedBy=multi-user.target
Start the service
systemctl daemon-reload
systemctl enable process-exporter
systemctl start process-exporter
Verify
curl 192.168.1.108:9256/metrics
Note: If the output includes namedprocess_namegroup_num_procs{groupname="map[:qtalk_api]"}, process-exporter started correctly. Otherwise, check config.yaml for errors.
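Scanning the full metrics output by eye is tedious, so the check can be narrowed to a single process group. The sketch below reads the metrics text on standard input and prints the process count for one group, or nothing if the group is absent. The function name group_proc_count is hypothetical, and the label format map[:<name>] is the one shown in the note above.

```shell
# Sketch: read /metrics text on stdin and print the process count for one group.
# group_proc_count is a hypothetical helper name.
group_proc_count() {
    # Select the exact sample line for the group (fixed-string match),
    # then print its value, which is the last field on the line.
    grep -F "namedprocess_namegroup_num_procs{groupname=\"map[:$1]\"}" \
        | awk '{ print $NF }'
}

# Usage:
#   curl -s 192.168.1.108:9256/metrics | group_proc_count qtalk_api
```

An empty result usually means the cmdline pattern in config.yaml did not match any running process.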
Configure Prometheus
Add the following job to the scrape_configs section at the end of the existing prometheus.yml:
  - job_name: 'process'
    static_configs:
      - targets: ['192.168.1.108:9256']
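Hot-reload Prometheus
Check the configuration first, then reload. Note that systemctl reload only works if the unit file defines ExecReload; otherwise, send SIGHUP to the Prometheus process directly.
./promtool check config prometheus.yml
systemctl reload prometheus.service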
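The two commands above can be chained so that a broken configuration never reaches the running server. The following is a minimal sketch under the assumptions of this article's directory layout and a unit file that defines ExecReload. The function name reload_if_valid and the PROMTOOL variable are hypothetical.

```shell
# Sketch: reload Prometheus only when promtool accepts the configuration.
# PROMTOOL is an assumed variable name pointing at the promtool binary.
PROMTOOL="${PROMTOOL:-/home/prometheus/prometheus/promtool}"

reload_if_valid() {
    config="${1:-/home/prometheus/prometheus/prometheus.yml}"
    if "$PROMTOOL" check config "$config"; then
        # Assumes the systemd unit defines ExecReload (SIGHUP to the server).
        systemctl reload prometheus.service
    else
        echo "Configuration rejected; Prometheus keeps running with the old one" >&2
        return 1
    fi
}

# Usage:
#   reload_if_valid
```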
2. Install Grafana
wget https://dl.grafana.com/enterprise/release/grafana-enterprise-9.3.1-1.x86_64.rpm
sudo yum install grafana-enterprise-9.3.1-1.x86_64.rpm
systemctl start grafana-server.service
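Grafana needs a Prometheus data source before any dashboard can show data. It can be added in the web interface, or provisioned from a file, which is easier to repeat on another server. The sketch below writes such a provisioning file. The function name write_prometheus_datasource is hypothetical, and the target directory in the usage lines is assumed to be the default provisioning path of the RPM package.

```shell
# Sketch: write a Grafana data source provisioning file for Prometheus.
# write_prometheus_datasource is a hypothetical helper name.
write_prometheus_datasource() {
    dir="$1"   # e.g. /etc/grafana/provisioning/datasources
    url="$2"   # e.g. http://192.168.1.108:9090
    mkdir -p "$dir"
    # The heredoc body is the YAML that Grafana reads at startup.
    cat > "${dir}/prometheus.yaml" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: ${url}
    isDefault: true
EOF
}

# Usage (as root), then restart Grafana so that it reads the file:
#   write_prometheus_datasource /etc/grafana/provisioning/datasources http://192.168.1.108:9090
#   systemctl restart grafana-server.service
```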
After Grafana starts, log in to its web interface (port 3000 by default), add Prometheus as a data source, and import the Grafana process monitoring dashboard (https://grafana.com/grafana/dashboards/8378-system-processes-metrics/). You can then check the running state of the processes configured in process-exporter, as shown in the following figure:
By combining Prometheus, Process-exporter, and Grafana, the state of each process running on the server can be displayed graphically. Together with Prometheus's alerting, this helps operations staff and developers stay informed about the system's state and adjust resources in time to keep the system highly available.
The next article will introduce how to use the alerting component to match alert rules and send notifications by email, DingTalk, and WeChat.