Cloud Native: An In-Depth Look at How Prometheus Pushgateway Works, with Hands-On Practice

1. Introduction to Pushgateway

  • Pushgateway is a component of the Prometheus ecosystem. By default, the Prometheus server pulls metrics from exporters. With Pushgateway, clients (for example custom monitoring scripts) push their metrics to Pushgateway, and Prometheus then scrapes Pushgateway like any other target. From the Prometheus server's point of view, it is still the server that actively pulls data from each data source (exporters and Pushgateway alike).
  • Advantages of Pushgateway:
    • Prometheus pulls data from its targets on a fixed interval by default. When a target sits in a different subnet or behind a firewall, Prometheus cannot reach it; in that case each target can push its data to Pushgateway, and Prometheus scrapes Pushgateway periodically;
    • When monitoring business data, metrics from different sources often need to be aggregated. They can be collected in Pushgateway first and then pulled by Prometheus, which reduces the load on Prometheus;
    • Custom metrics are simple to collect.
  • Disadvantages of Pushgateway:
    • The scrape status seen by Prometheus only reflects Pushgateway itself, not the individual nodes that push to it;
    • Pushgateway is a single point of failure: if it has a problem, all data collected through it is affected;
    • Pushgateway keeps every metric pushed to it, so even after a monitored client goes offline Prometheus keeps pulling its last pushed values; stale data has to be cleaned out of Pushgateway manually.
  • For an overall analysis of Prometheus, see my previous blog post: Cloud native in-depth analysis of Prometheus installation and deployment and principle analysis.

2. Pushgateway architecture

  • Pushgateway acts as a relay station for metrics: it provides an HTTP API so that data producers can push data to it at any time.
  • Pushgateway also behaves like an exporter: when the Prometheus server scrapes it, it returns the data it has stored.

3. Prometheus server installation

① Download

  • Prometheus is written in Go, and the release binaries have no third-party dependencies. Download the binary package for your platform, unpack it, and add a basic configuration to start the Prometheus server (download address: Prometheus server):
wget https://github.com/prometheus/prometheus/releases/download/v2.40.6/prometheus-2.40.6.linux-amd64.tar.gz

tar -xf prometheus-2.40.6.linux-amd64.tar.gz

② Configuration

  • After unpacking, the directory contains the default Prometheus configuration file prometheus.yml. A brief walkthrough of that file:
# Global configuration
global:
  scrape_interval:     15s # Scrape interval; the default is 1 minute
  evaluation_interval: 15s # How often rules are evaluated (every 15 seconds here); the default is 1 minute
  # scrape_timeout  # Scrape timeout; the default is 10s

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# List of rule files, evaluated every 'evaluation_interval'
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# List of scrape configurations
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets: ['localhost:9090']

③ Start the service

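# Show help
./prometheus -h

# Start the server directly. This is not recommended, because the server exits when the console is closed; nohup works but is not ideal either, so a prometheus.service unit is configured below.
# The default port is 9090. To change it, use --web.listen-address=:9099; the configuration file can be set with --config.file=prometheus.yml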
./prometheus
  • Configure the prometheus.service startup script:
cat >/usr/lib/systemd/system/prometheus.service<<EOF
[Unit]
Description=Prometheus
After=network.target
[Service]
Type=simple
ExecStart=/opt/prometheus/prometheus_server/prometheus-2.40.6.linux-amd64/prometheus --config.file=/opt/prometheus/prometheus_server/prometheus-2.40.6.linux-amd64/prometheus.yml
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
  • Start the service:
# Reload systemd so that it picks up the new unit file
systemctl daemon-reload
# Start
systemctl start prometheus
# Check
systemctl status prometheus
netstat -tnlp|grep :9090
ps -ef|grep prometheus

  • Open http://ip:9090 in a browser to reach the Prometheus web UI.

4. Pushgateway installation

① Download

wget https://github.com/prometheus/pushgateway/releases/download/v1.5.1/pushgateway-1.5.1.linux-amd64.tar.gz

tar -xf pushgateway-1.5.1.linux-amd64.tar.gz

② Start the service

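# Show help
./pushgateway  -h

# Start the service. As with Prometheus, it is not started directly here; a pushgateway.service unit is configured below
./pushgateway
  • The default listening port is 9091; it can be changed with --web.listen-address. The main flags are listed below (the exact set varies by version, so check ./pushgateway -h):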
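usage: pushgateway [<flags>]
Flags:
      --web.listen-address=":9091"      Address to listen on for the web interface, API, and telemetry
      --web.telemetry-path="/metrics"   Path under which to expose metrics
      --web.external-url=               URL under which Pushgateway is externally reachable
      --web.route-prefix=""             Prefix for the internal routes of web endpoints. Defaults to the path of --web.external-url
      --persistence.file=""             File to persist metrics. If empty, metrics are only kept in memory
      --persistence.interval=5m         Minimum interval at which to write out the persistence file
      --log.level="info"                Only log messages with the given severity or above. Valid levels: [debug, info, warn, error, fatal]
      --log.format="logger:stderr"      Set the log target and format. Example: "logger:syslog?appname=bob&local=7" or "logger:stdout?json=true"
      --version                         Show application version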
  • Configure the pushgateway.service startup script:
cat >/usr/lib/systemd/system/pushgateway.service<<EOF
[Unit]
Description=Pushgateway
After=network.target
[Service]
Type=simple
ExecStart=/opt/prometheus/pushgateway/pushgateway-1.5.1.linux-amd64/pushgateway
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
  • Start the service:
# Reload systemd so that it picks up the new unit file
systemctl daemon-reload
# Start
systemctl start pushgateway
# Check
systemctl status pushgateway
netstat -tnlp|grep :9091
ps -ef|grep pushgateway

  • Open http://ip:9091/metrics in a browser to see the metrics exposed by Pushgateway.

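③ Connect Pushgateway to Prometheus

  • Edit the Prometheus configuration file and add the following job under scrape_configs:
  - job_name: 'pushgateway_name'
    scrape_interval: 30s
    honor_labels: true  # With this setting, labels in the pushed data (such as job and instance) are kept instead of being overwritten by the labels Prometheus attaches to the Pushgateway target
    static_configs:
      - targets: ["192.168.182.110:9091"]
        labels:
          instance: pushgateway_instance
# Data in Pushgateway is usually grouped by job and instance, so these two labels should not be omitted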
  • Restart the Prometheus service, or hot-reload the configuration:
# Hot reload only works if Prometheus was started with --web.enable-lifecycle
# curl -X POST http://192.168.182.110:9090/-/reload
systemctl restart prometheus
  • Check the Prometheus web UI at http://ip:9090/targets again.

5. Practical operation demonstration

① Push data

  • The push URL path is defined as follows (job is required, the label pairs are optional; the combination of job and labels in the URL uniquely identifies a group in Pushgateway):
/metrics/job/<JOB_NAME>{/<LABEL_NAME>/<LABEL_VALUE>}
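  • As a rough sketch of how this path is assembled, the hypothetical helper push_url below (not part of Pushgateway) appends the job and any optional label pairs to a base URL. It assumes that the job name and the label values contain no "/" characters:

```shell
#!/bin/sh
# Hypothetical helper (not part of Pushgateway): build the push URL for a group.
# Usage: push_url <base_url> <job> [<label_name> <label_value>]...
# Assumption: job and label values contain no "/" characters.
push_url() {
  base=$1
  job=$2
  shift 2
  url="$base/metrics/job/$job"
  # Append each optional label pair as /<LABEL_NAME>/<LABEL_VALUE>.
  while [ "$#" -ge 2 ]; do
    url="$url/$1/$2"
    shift 2
  done
  printf '%s\n' "$url"
}

# Group {job="some_job"}
push_url http://192.168.182.110:9091 some_job
# Group {job="some_job", instance="some_instance"}
push_url http://192.168.182.110:9091 some_job instance some_instance
```

  • The two calls print different paths, which is why Pushgateway treats them as two separate groups.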
  • The pushed data (the request body) uses the following format:
# TYPE metric_name type
metric_name{label_name="label_value",...}  value
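  • To make the format concrete, here is a minimal sketch of a hypothetical helper, metric_line, that prints one sample line. It assumes label values contain no double quotes, backslashes, or newlines (these would need escaping in the real format):

```shell
#!/bin/sh
# Hypothetical helper: print one sample line in the text format shown above.
# Usage: metric_line <metric_name> <value> [<label_name>=<label_value>]...
# Assumption: label values need no escaping (no quotes, backslashes, newlines).
metric_line() {
  name=$1
  value=$2
  shift 2
  labels=""
  for pair in "$@"; do
    key=${pair%%=*}   # text before the first "="
    val=${pair#*=}    # text after the first "="
    # Join pairs with commas and wrap each value in double quotes.
    labels="${labels:+$labels,}$key=\"$val\""
  done
  if [ -n "$labels" ]; then
    printf '%s{%s} %s\n' "$name" "$labels" "$value"
  else
    printf '%s %s\n' "$name" "$value"
  fi
}

printf '# TYPE disk_usage gauge\n'
metric_line disk_usage 11 disk_name=/run/user/0
```

  • The output of such a helper can be piped straight into curl --data-binary @- with the push URL of the target group.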
  • Push a single metric into the group {job="some_job"}:
echo "some_metric 3.14" | curl --data-binary @- http://192.168.182.110:9091/metrics/job/some_job
  • Push several metrics into the group {job="some_job", instance="some_instance"}:
# --data-binary sends the data as-is (binary). Note: it is sent with POST!
cat <<EOF | curl --data-binary @- http://192.168.182.110:9091/metrics/job/some_job/instance/some_instance
# TYPE some_metric counter
some_metric{label="val1"} 42
# TYPE another_metric gauge
# HELP another_metric Just an example.
another_metric 2398.283
EOF

② Delete data

  • Delete all data in the group {job="some_job", instance="some_instance"}:
curl -X DELETE http://192.168.182.110:9091/metrics/job/some_job/instance/some_instance
  • Delete all metrics in all groups (Pushgateway must be started with the command-line flag --web.enable-admin-api):
curl -X PUT http://192.168.182.110:9091/api/v1/admin/wipe
  • Notes:
    • Data is deleted one group at a time; a group is uniquely identified by the job name and the labels in the URL;
    • A DELETE request for the {job="some_job"} group (path /metrics/job/some_job) does not delete the data in {job="some_job", instance="some_instance"}, because the two are different groups. To delete the latter, the DELETE request must use the full path /metrics/job/some_job/instance/some_instance;
    • Deleting data here means deleting it from Pushgateway; it does not touch what Prometheus has already stored.
  • The examples above come from the official Pushgateway documentation.
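  • The point about separate groups can be illustrated with a small sketch. The helper delete_group and the DRY_RUN variable are hypothetical names introduced here; with DRY_RUN left at its default of 1 the curl command is only printed, not executed:

```shell
#!/bin/sh
# Hypothetical helper: delete one Pushgateway group.
# Usage: delete_group <base_url> <job> [<label_name> <label_value>]...
# With DRY_RUN=1 (the default in this sketch) the command is only printed.
delete_group() {
  url="$1/metrics/job/$2"
  shift 2
  while [ "$#" -ge 2 ]; do
    url="$url/$1/$2"
    shift 2
  done
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'curl -X DELETE %s\n' "$url"
  else
    curl -X DELETE "$url"
  fi
}

# Two different groups, so two separate DELETE requests are needed:
delete_group http://192.168.182.110:9091 some_job
delete_group http://192.168.182.110:9091 some_job instance some_instance
```

  • Running the sketch with DRY_RUN=0 would send the requests for real, so it is worth checking the printed URLs first.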

③ Push data to Pushgateway from a custom script

  • The template is as follows (note that any label in the body that also appears in the URL path, here job and instance, is overwritten with the value from the URL):
cat <<EOF | curl --data-binary @- http://192.168.182.110:9091/metrics/job/some_job/instance/some_instance
# HELP disk_usage Disk usage percentage per mount point.
# TYPE disk_usage gauge
disk_usage{instance="local-168-182-110",job="disk",disk_name="/run/user/0"} 11
disk_usage{instance="local-168-182-110",job="disk",disk_name="/run/user/1"} 22
disk_usage{instance="local-168-182-110",job="disk",disk_name="/run/user/2"} 33
disk_usage{instance="local-168-182-110",job="disk",disk_name="/run/user/3"} 44
disk_usage{instance="local-168-182-110",job="disk",disk_name="/run/user/4"} 55
EOF
  • Write a collection script that pushes data to Pushgateway:
# The quoted 'EOF' keeps the shell from expanding $variables and command substitutions while the file is written
cat >disk_usage_metrics.sh<<'EOF'
#!/bin/bash
# Collect the usage percentage of every mounted filesystem and push it to Pushgateway.

hostname=$(hostname -f | cut -d '.' -f1)

metrics=""
# awk prints "<mount point>=<use%>" for each filesystem; NR>1 skips the df header line
for line in $(df | awk 'NR>1{print $NF "=" int($(NF-1))}')
do
  disk_name=$(echo "$line" | awk -F'=' '{print $1}')
  disk_usage=$(echo "$line" | awk -F'=' '{print $2}')
  # Note: Pushgateway overwrites job and instance in the body with the values from the URL path
  metrics="$metrics\ndisk_usage{instance=\"$hostname\",job=\"disk\",disk_name=\"$disk_name\"} $disk_usage"
done

echo -e "# HELP disk_usage Disk usage percentage per mount point.\n# TYPE disk_usage gauge$metrics" | curl --data-binary @- http://192.168.182.110:9091/metrics/job/pushgateway/instance/disk_usage
EOF
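  • The parsing step of the collection script can also be done in a single awk pass. The sketch below is one possible variant, not the script above: the function name df_to_metrics is hypothetical, it reads df -P style output on stdin, and it assumes mount points contain no spaces. Fixed sample input is used so that it runs without a live df call:

```shell
#!/bin/sh
# Hypothetical variant of the parsing step: read `df -P` output on stdin and
# print one disk_usage sample per filesystem.
# Assumption: mount points contain no spaces.
df_to_metrics() {
  awk 'NR > 1 {
    # $NF is the mount point, $(NF-1) the use% column; int("11%") yields 11.
    printf "disk_usage{disk_name=\"%s\"} %d\n", $NF, int($(NF-1))
  }'
}

# Example with fixed input instead of a live `df -P` call:
printf '%s\n' \
  'Filesystem 1024-blocks Used Available Capacity Mounted on' \
  'tmpfs 100 11 89 11% /run/user/0' | df_to_metrics
```

  • In a real script the input would come from df -P, and the output would be piped into curl --data-binary @- with the push URL of the target group.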
  • Check the result in the Pushgateway web UI.

  • Check the result in the Prometheus web UI.

Origin blog.csdn.net/Forever_wj/article/details/131865621