Technology Sharing | How to Use Prometheus for System Process Monitoring

How do you monitor a system that is running in production, and how do you know whether it is healthy? Prometheus is a monitoring solution built for exactly this. It lets operations staff and developers check the running state of a system at any time, locate problems quickly, and troubleshoot them. By deploying Prometheus step by step, we can monitor CPU, memory, and other machine resources, the running state of applications, and real-time data for business metrics. Combined with Grafana, it can present these metrics in a wide variety of charts. For small and medium-sized teams, this can significantly reduce costs and speed up development.

I. Overview

Getting to Know Prometheus

Prometheus is an open source monitoring and alerting system and time series database originally developed at SoundCloud. As that description suggests, it consists of two parts: the monitoring and alerting system, and a built-in time series database (TSDB).

In 2016, Prometheus joined the Cloud Native Computing Foundation (CNCF), which operates under the Linux Foundation and was founded with Google's involvement, as its second hosted project after Kubernetes. Prometheus is also very active in the open source community: it has more than 40,000 stars and almost 8K forks on GitHub, and a small version update is released every one or two weeks.

Prometheus Architecture

[Figure: Prometheus architecture diagram]

As the figure above shows, Prometheus as a whole can be divided into four parts:

Prometheus server

Prometheus Server is the core component. It is responsible for collecting, storing, and querying monitoring data.

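Process-exporter (business data source)

Business data sources expose their metrics to Prometheus Server. Prometheus normally pulls (scrapes) them over HTTP; short-lived jobs can instead push their metrics through the Pushgateway.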

Alertmanager

Alerting rules are configured in Prometheus. When a rule's condition is met, Prometheus pushes the alert to Alertmanager, which handles it.

Visual monitoring interface

After Prometheus collects the data, it can be displayed as charts in a web UI. We can either build a custom display on top of an API client or use Grafana directly.

The architecture of Prometheus is not complicated: it collects data, processes it, visualizes it, and then evaluates it to trigger alerts. Its value lies in providing a complete, workable solution with a whole ecosystem around it, which can greatly reduce our development costs.

Construction of process monitoring service components

Next, let's introduce the components needed to implement process monitoring with Prometheus: Prometheus + Process-exporter + Grafana. Since this is still at the intranet verification stage, all three components are installed on the same server (OS: CentOS 7.8).

Install Prometheus

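First, download the release that matches your system from the official website.
After downloading, upload it to the target server, unpack it, and rename the directory to match the paths used in the service file:

tar -xvf prometheus-2.40.5.linux-amd64.tar.gz
mv prometheus-2.40.5.linux-amd64 prometheus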

Edit the systemd service file

vi /usr/lib/systemd/system/prometheus.service
[Unit]
Description=prometheus
After=network.target

[Service]
Type=simple
ExecStart=/home/prometheus/prometheus/prometheus --config.file=/home/prometheus/prometheus/prometheus.yml --storage.tsdb.path=/home/prometheus/prometheus/data --storage.tsdb.retention.time=30d --log.level=info --web.external-url=http://192.168.1.108:9090
# Prometheus reloads its configuration on SIGHUP; "systemctl reload" needs this line.
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure

[Install]
WantedBy=multi-user.target

After saving, use systemctl to enable and start the Prometheus service

systemctl daemon-reload
systemctl enable prometheus
systemctl start prometheus

After startup, the web interface of Prometheus is shown in the figure below

[Figure: Prometheus web interface]
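Before opening the web interface, it can help to confirm from the command line that the server is actually answering. The following is a minimal sketch, not part of the original setup: wait_for_prometheus is a hypothetical helper name, and it assumes curl is installed and that Prometheus is reachable at the address used in the service file. It polls the /-/ready endpoint that Prometheus exposes.

```shell
# Poll Prometheus's readiness endpoint until it answers or we give up.
# wait_for_prometheus is an illustrative helper, not part of Prometheus.
wait_for_prometheus() {
  base_url="$1"        # e.g. http://192.168.1.108:9090
  attempts="${2:-10}"  # maximum number of tries
  delay="${3:-2}"      # seconds to wait after a failed try
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors such as 503.
    if curl -fsS -o /dev/null "${base_url}/-/ready"; then
      echo "ready after ${i} attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not ready after ${attempts} attempt(s)" >&2
  return 1
}
```

For example, wait_for_prometheus http://192.168.1.108:9090 15 2 would try up to 15 times, pausing two seconds after each failed try.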

Install process-exporter

Download process-exporter

wget https://github.com/ncabatoff/process-exporter/releases/download/v0.7.10/process-exporter-0.7.10.linux-amd64.tar.gz

Install and deploy process-exporter

tar -xvf process-exporter-0.7.10.linux-amd64.tar.gz
mv process-exporter-0.7.10.linux-amd64 process-exporter

Write a configuration file

cd process-exporter
vi config.yaml
process_names:
  - name: "{{.Matches}}"
    cmdline:
    - 'qtalk_api'
  - name: "{{.Matches}}"
    cmdline:
    - 'qtalk_user'
  - name: "{{.Matches}}"
    cmdline:
    - 'qtalk_auth'
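Each cmdline entry in this file is a regular expression that is matched against the command line of running processes, and {{.Matches}} names the resulting group after what was matched. To preview which processes a pattern would pick up before starting the exporter, something like the sketch below may help. count_matching_procs is a hypothetical helper, and it is only an approximation: pgrep -f uses POSIX extended regular expressions, while process-exporter uses Go regular expressions, so the two agree for plain names like the ones above but may differ for more complex patterns.

```shell
# Print "<pattern> <count>" for each pattern given as an argument, where
# count is the number of running processes whose full command line matches.
# count_matching_procs is an illustrative helper name.
count_matching_procs() {
  for pattern in "$@"; do
    # pgrep -c prints the match count and exits non-zero when it is 0.
    count=$(pgrep -c -f -- "$pattern") || count=0
    printf '%s %s\n' "$pattern" "$count"
  done
}
```

Running count_matching_procs qtalk_api qtalk_user qtalk_auth prints one line per pattern; a count of 0 suggests that the pattern is wrong or that the service is not running.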

Write the systemd service file

vi /usr/lib/systemd/system/process-exporter.service
[Unit]
Description=process_exporter
After=network.target

[Service]
User=root
Type=simple
ExecStart=/home/prometheus/process-exporter/process-exporter -config.path /home/prometheus/process-exporter/config.yaml
Restart=on-failure

[Install]
WantedBy=multi-user.target

Start the service

systemctl daemon-reload
systemctl enable process-exporter
systemctl start process-exporter

Verify

curl 192.168.1.108:9256/metrics

Note: if the output contains a metric such as namedprocess_namegroup_num_procs{groupname="map[:qtalk_api]"}, process-exporter started correctly. Otherwise, check whether config.yaml is correct.
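The /metrics output is long, so it can be easier to reduce it to the one metric mentioned in the note. The sketch below is one possible way to do that with awk; summarize_num_procs is a hypothetical helper name, and it assumes the metrics text arrives on standard input in the usual Prometheus text format.

```shell
# Print "<groupname> <process count>" for every process group found in
# process-exporter metrics text read from standard input.
# summarize_num_procs is an illustrative helper name.
summarize_num_procs() {
  awk '
    /^namedprocess_namegroup_num_procs[{]/ {
      # Pull the groupname label value out of the sample line.
      if (match($0, /groupname="[^"]*"/)) {
        # Skip the 11 characters of groupname=" and drop the closing quote.
        name = substr($0, RSTART + 11, RLENGTH - 12)
        # The sample value is the last field on the line.
        print name, $NF
      }
    }
  '
}
```

It can be combined with the verification command above, for example curl -s 192.168.1.108:9256/metrics | summarize_num_procs, which prints one line per configured process group.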

Configure Prometheus

Add the following job at the end of the original prometheus.yml, under scrape_configs:

  - job_name: 'process'
    static_configs:
      - targets: ['192.168.1.108:9256']

Hot-reload Prometheus

cd /home/prometheus/prometheus
./promtool check config prometheus.yml
systemctl reload prometheus.service
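Once the new job has been loaded, Prometheus records an up series for each target, with the value 1 while scrapes succeed. One way to check this without opening the web interface is to call the HTTP query API. The sketch below is an illustration under assumptions: prom_query and job_is_up are hypothetical helper names, curl must be installed, and the grep check is deliberately rough (a JSON tool would be more robust).

```shell
# Run an instant PromQL query against the Prometheus HTTP API.
# $1: Prometheus base URL, $2: PromQL expression.
# prom_query and job_is_up are illustrative helper names.
prom_query() {
  # -G sends the URL-encoded expression as a GET query parameter.
  curl -fsS -G "$1/api/v1/query" --data-urlencode "query=$2"
}

# Succeed if at least one target of the given job reports up == 1.
# $1: Prometheus base URL, $2: job name.
job_is_up() {
  # The "== 1" filter leaves an empty result when no target is up,
  # so any "value" entry in the response means a healthy target exists.
  prom_query "$1" "up{job=\"$2\"} == 1" | grep -q '"value":\['
}
```

For example, job_is_up http://192.168.1.108:9090 process && echo "process job is being scraped" prints the message only when Prometheus is scraping process-exporter successfully.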

Install Grafana

wget https://dl.grafana.com/enterprise/release/grafana-enterprise-9.3.1-1.x86_64.rpm
sudo yum install grafana-enterprise-9.3.1-1.x86_64.rpm
systemctl start grafana-server.service

After starting Grafana, log in to its web interface (port 3000 by default), add Prometheus as a data source, and then import the Grafana process monitoring dashboard (https://grafana.com/grafana/dashboards/8378-system-processes-metrics/). You can then check the running status of the processes configured in process-exporter, as shown in the following figure:

[Figure: Grafana process monitoring dashboard]
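Adding the data source can also be scripted through Grafana's HTTP API instead of the web interface. The sketch below is one possible approach, not the method used in this article: datasource_payload and add_prometheus_datasource are hypothetical helper names, curl is assumed to be installed, and the credentials must be a Grafana account allowed to manage data sources (on a fresh install that is typically admin:admin until the password is changed).

```shell
# Build the JSON body for Grafana's "create data source" API call.
# $1: data source name, $2: Prometheus URL.
# Both helper names here are illustrative.
datasource_payload() {
  printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' "$1" "$2"
}

# POST the payload to Grafana.
# $1: Grafana base URL, $2: user:password, $3: name, $4: Prometheus URL.
add_prometheus_datasource() {
  datasource_payload "$3" "$4" |
    curl -fsS -u "$2" -H 'Content-Type: application/json' \
      -X POST --data-binary @- "$1/api/datasources"
}
```

A call such as add_prometheus_datasource http://192.168.1.108:3000 admin:admin Prometheus http://192.168.1.108:9090 would register the Prometheus server installed earlier; the dashboard import itself is still done in the web interface as described above.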

With Prometheus, Process-exporter, and Grafana working together, the status of each process running on the server can be displayed graphically. Combined with Prometheus's alerting, this makes it easy for operations staff and developers to keep track of the system's running state and to adjust resources in time to keep the system highly available.

The next article will introduce how to use the alarm component to match alarm rules and notify by email, DingTalk, and WeChat.



Origin blog.csdn.net/anyRTC/article/details/128529740