Prometheus: (1) Basic concepts

Table of contents

One: Prometheus overview

1.1 What is Prometheus?

1.2 Features of Prometheus

1.2.1 Powerful multi-dimensional data model:

1.2.2 Flexible and Powerful Query Statement (PromQL)

1.2.3 Easy to manage 

1.2.4 Efficient Storage

1.2.5 Use pull mode to collect time series data

1.2.6 Time series data can be pushed to the Prometheus server via the Pushgateway

1.2.7 You can obtain monitoring targets through service discovery or static configuration

1.2.8 There are a variety of visual graphical interfaces, generally used in conjunction with grafana

1.2.9 Easy to scale

1.3 What is a sample

1.4 Prometheus limitations

1.5 Basic Principles

1.6 Prometheus usage scenarios

Two: Introduction to Prometheus components

Three: Prometheus architecture diagram

Four: Prometheus workflow

Five: Comparative analysis of Prometheus and zabbix

Six: Several deployment modes of Prometheus

6.1 Basic High Availability Mode

6.2 Basic high availability + remote storage

6.3 Basic HA + remote storage + federated cluster solution

Seven: Four data types of Prometheus 

7.1 Counter

7.2 Gauge

7.3 Histogram

7.4 Summary

Eight: Prometheus storage

8.1 Storage principle

8.2 Local storage 

8.3 Remote storage 

One: Prometheus overview

1.1 What is Prometheus?

Prometheus is an open-source monitoring and alerting system and time-series database (TSDB) originally developed at SoundCloud. It is written in Go and is often described as an open-source counterpart of Google's internal Borgmon monitoring system.

Prometheus is an open-source system monitoring and alerting toolkit. In 2016 it joined the Cloud Native Computing Foundation (CNCF, established under the Linux Foundation), becoming the second hosted project after Kubernetes. In Kubernetes container-management systems, Prometheus is the usual choice for monitoring. It supports a wide range of exporters for collecting data and a Pushgateway for data reporting, and its performance is sufficient to support clusters with tens of thousands of nodes.

1.2 Features of Prometheus

As a new generation of monitoring framework, Prometheus has the following characteristics:

1.2.1 Powerful multi-dimensional data model:

  • Time series are distinguished by a metric (metric name) and a set of label key-value pairs.
  • Any metric can carry an arbitrary number of label dimensions.
  • The data model is free-form; there is no need to encode dimensions into dot-separated strings.
  • Aggregation, slicing, and dicing operations can be performed on the data model.
  • Sample values are double-precision floats, and labels may contain full Unicode.

Each time series data is uniquely identified by the metric name and its label labels key-value pair collection:

The metric name specifies the measured characteristic of the monitored system (e.g. http_requests_total, the total count of received HTTP requests).

Labels open up Prometheus's multidimensional data model: for the same metric name, each distinct combination of labels identifies a specific instance of that metric. (For example, all HTTP POST requests to /api/tracks form one specific series of http_requests_total with the labels method="POST" and handler="/api/tracks".)

The query language filters and aggregates over these metrics and label sets. Changing any label value on any metric creates a new time series.

1.2.2  Flexible and Powerful Query Statement (PromQL)

1.2.3 Easy to manage 

  •  Prometheus server is a single binary file that can work directly locally without relying on distributed storage.

1.2.4  Efficient storage

  • On average, each sampling point only occupies 3.5 bytes, and a Prometheus server can handle millions of metrics.

1.2.5 Use pull mode to collect time series data

  • This is not only good for local testing but also prevents problematic servers from pushing bad metrics.

1.2.6 Time series data can be pushed to the Prometheus server via the Pushgateway

1.2.7  You can obtain monitoring targets through service discovery or static configuration

1.2.8  There are a variety of visual graphical interfaces, generally used in conjunction with grafana

1.2.9  Easy to scale

1.3 What is a sample

Sample: each point in a time series is called a sample, and a sample consists of three parts:

  • Metric: the metric name plus the labelset describing the sample's characteristics;
  • Timestamp: a millisecond-precision timestamp;
  • Value: a float64 value holding the sample's measurement.

Representation method: Represent the time series of the specified metric name and the specified label set by the following expression: <metric name>{<label name>=<label value>, ...}

For example, a time series with the metric name api_http_requests_total and the label method="POST" and handler="/messages" can be expressed as: api_http_requests_total{method="POST", handler="/messages"}
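This notation can be produced mechanically; a minimal sketch (the metric and label names come from the example above, the sorted-label ordering is just this sketch's convention):

```python
def format_series(metric, labels):
    """Render <metric name>{<label>=<value>, ...} notation for a time series."""
    if not labels:
        return metric
    inner = ", ".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{metric}{{{inner}}}"

series = format_series("api_http_requests_total",
                       {"method": "POST", "handler": "/messages"})
# → api_http_requests_total{handler="/messages", method="POST"}
```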

1.4 Prometheus limitations

  • Metric state held in memory by an instrumented process is lost on restart, so its counters start again from zero;
  • Each process only counts its own requests; with multiple replicas deployed, a single scrape reflects only one replica, so without aggregation the figures are incomplete;
  • Prometheus is metrics-based monitoring; it is not suited to logs, events, or tracing, and it shows trends more than exact data;
  • Prometheus assumes that mostly recent data is queried. Its local storage was designed to hold short-term data (for example, one month), so it does not support storing large amounts of history. For long-term retention, it is recommended to ship data via the remote-storage mechanism to a system such as InfluxDB or OpenTSDB;
  • Prometheus's clustering mechanisms are not very mature.

1.5 Basic Principles

The basic principle of Prometheus is to periodically scrape the state of monitored components over HTTP; any component can be monitored as long as it exposes a suitable HTTP endpoint, with no SDK or other integration work required. This suits monitoring in virtualized environments such as VMs, Docker, and Kubernetes. The HTTP endpoint that exposes a component's metrics is called an exporter. Most components in common use at Internet companies already have ready-made exporters, e.g. for Varnish, HAProxy, Nginx, MySQL, and Linux system information (disk, memory, CPU, network, and so on).
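As an illustration of this principle, a monitored component only needs to serve a plain-text metrics page over HTTP. A minimal hand-rolled exporter sketch using only the Python standard library (the demo_requests_total metric and port 8000 are invented for illustration; real applications would use an official client library):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUESTS = {"count": 0}  # toy in-process state

def render_metrics():
    """Render the Prometheus text exposition format."""
    return (
        "# HELP demo_requests_total Total scrapes handled.\n"
        "# TYPE demo_requests_total counter\n"
        f"demo_requests_total {REQUESTS['count']}\n"
    )

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/metrics":
            body = render_metrics().encode()
            REQUESTS["count"] += 1  # count each scrape
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

# To expose the endpoint for Prometheus to scrape:
# HTTPServer(("", 8000), MetricsHandler).serve_forever()
```

Prometheus would then scrape http://<host>:8000/metrics on its configured interval.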

1.6 Prometheus usage scenarios

  1. Container monitoring, where services are deployed on cloud servers;
  2. Monitoring systems under a Kubernetes (k8s) architecture;
  3. An urgent need for a monitoring system, without the spare effort and budget to develop one.

Prometheus is great for recording any purely numeric time series. It is suitable both for machine-centric monitoring and for monitoring highly dynamic service-oriented architectures. Its support for multidimensional data collection and querying is a particular advantage in the microservices world.

Prometheus is designed for reliability, making it the system to use during outages to allow you to quickly diagnose problems.

Each Prometheus server is independent and does not depend on network storage or other remote services. You can rely on it when other parts of your infrastructure break, and you can use it without setting up extensive infrastructure.


Two: Introduction to Prometheus components

The Prometheus ecosystem consists of several components, many of which are optional:

(1) Prometheus Server: used to collect and store time series data.

(2) Client Library: client libraries instrument the application code; when Prometheus scrapes the instance's HTTP endpoint, the client library reports the current state of all tracked metrics to the Prometheus server.

(3) Exporters: Prometheus supports a variety of exporters, which collect metrics data and expose it for the Prometheus server to scrape. Any program that provides monitoring data to the Prometheus server can be called an exporter.

(4) Alertmanager:   After receiving alerts from the Prometheus server, it will deduplicate, group, and route to the corresponding receiver, and send an alarm. The common receiving methods are: email, WeChat, DingTalk, slack, etc.

(5) Grafana: a monitoring dashboard that visualizes the monitoring data.

(6) Push Gateway: Each target host can report data to pushgateway, and then prometheus server pulls data from pushgateway uniformly.

(7) Service Discovery: a component that dynamically discovers the targets to be monitored, completing an important part of the monitoring configuration; it is especially useful in containerized environments. Prometheus Server currently supports several mechanisms: user-supplied static target lists; file-based discovery, e.g. resource lists generated by configuration-management tools and picked up automatically by Prometheus; and automatic discovery.
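As a sketch of the Pushgateway flow in item (6): a short-lived job can push text-format metrics with a plain HTTP PUT to the Pushgateway's /metrics/job/<job> endpoint (that path follows the Pushgateway's documented push API, but the gateway address, job name, and metric below are hypothetical):

```python
import urllib.request

def build_push_request(gateway, job, body):
    """Build a PUT request that pushes text-format metrics for one job."""
    url = f"http://{gateway}/metrics/job/{job}"
    return urllib.request.Request(
        url,
        data=body.encode(),
        method="PUT",
        headers={"Content-Type": "text/plain"},
    )

req = build_push_request(
    "pushgateway.example:9091",
    "backup_job",
    "backup_last_success_timestamp_seconds 1.7e9\n",
)
# urllib.request.urlopen(req) would send it; the Prometheus server then
# scrapes the Pushgateway on its normal pull schedule.
```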


Three: prometheus architecture diagram

As the architecture diagram shows, the Prometheus ecosystem mainly includes the Prometheus server, exporters, the Pushgateway, Alertmanager, Grafana, and the built-in web UI. The Prometheus server itself consists of three parts: Retrieval, Storage, and PromQL.

  • Retrieval is responsible for scraping metric data from active target hosts.
  • Storage persists the collected data to disk.
  • PromQL is the query-language module provided by Prometheus.

Four: Prometheus workflow

  1. The Prometheus server periodically pulls metric data from active (up) targets. Targets can be given to the server through statically configured jobs or through service discovery; this default pull model is how metrics are collected. Data can also be reported to the Prometheus server via the Pushgateway, or collected through the exporters that ship with some components;
  2. The Prometheus server saves the collected metric data to local disk or to a database;
  3. The collected metrics are stored as time series; alerts triggered by the configured alerting rules are sent to Alertmanager;
  4. Alertmanager delivers alerts to email, WeChat, or DingTalk according to the configured receivers;
  5. Prometheus's built-in web UI exposes the PromQL query language for querying the monitoring data;
  6. Grafana can use Prometheus as a data source and display the monitoring data graphically.

Note: put simply, the flow is: collect data, process it, visualize and display it, then analyze it to drive alerting.

A brief introduction:

Step 1: Prometheus server periodically pulls metrics from configured jobs or exporters, or from Pushgateway, or from other Prometheus servers.

  • The default collection method is pull; the push model provided by the Pushgateway can also be used to obtain data from each monitored node.

Step 2: The Prometheus server stores the scraped metrics locally and runs the defined alert rules; it can also clean and aggregate the data according to configured rules, recording the results as new time series, and push triggered alerts to Alertmanager.

  • The acquired data is stored in TSDB, a time-series database.

Step 3: Prometheus exposes the collected data through PromQL and other APIs for visualization. Many graphing options are supported:

  • For example, Grafana, the built-in Promdash, and the template engine provided by itself, etc.
  • Prometheus also provides an HTTP API query method to customize the required output.
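The HTTP API mentioned in Step 3 can be exercised as in this sketch: /api/v1/query is Prometheus's real instant-query path, but the server address and the canned JSON response below are illustrative only (no live server is contacted):

```python
import json
import urllib.parse

def query_url(base, promql):
    """Build the instant-query URL for the Prometheus HTTP API."""
    return f"{base}/api/v1/query?" + urllib.parse.urlencode({"query": promql})

url = query_url("http://localhost:9090", 'up{job="node"}')

# A canned response in the shape the API returns:
raw = '''{"status":"success","data":{"resultType":"vector","result":
  [{"metric":{"__name__":"up","job":"node","instance":"host1:9100"},
    "value":[1700000000.0,"1"]}]}}'''
payload = json.loads(raw)
samples = {r["metric"]["instance"]: float(r["value"][1])
           for r in payload["data"]["result"]}
# samples → {"host1:9100": 1.0}, i.e. the node exporter on host1 is up
```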

Five: Comparative analysis of Prometheus and zabbix

Zabbix vs Prometheus:

  • Implementation: Zabbix's backend is written in C and its interface in PHP, which makes customization difficult; Prometheus's backend is written in Go and its frontend is Grafana, where much customization comes down to editing JSON, so it is less difficult to customize.
  • Scale: Zabbix clusters max out around 10,000 nodes; Prometheus supports larger cluster sizes and is faster.
  • Environment: Zabbix is better suited to monitoring physical-machine environments; Prometheus is better suited to cloud environments and integrates better with OpenStack and Kubernetes.
  • Storage: Zabbix stores monitoring data in a relational database such as MySQL, making it hard to derive new dimensions from existing data; Prometheus stores data in a time-series database, which makes new aggregations over existing data straightforward.
  • Installation: Zabbix is simple to install, with all server-side functionality in the single zabbix-server package; Prometheus installation is more involved, with monitoring, alerting, and the interface split across separate components.
  • Interface: Zabbix's graphical interface is mature and almost all configuration can be done through it; Prometheus's interface is comparatively weak and much configuration requires editing configuration files.
  • Maturity: Zabbix has been in development longer and has ready-made solutions for many monitoring scenarios; Prometheus took off after 2015, so its development history is shorter and its maturity lags Zabbix's.

Summary:

If you are monitoring physical machines, Zabbix is a perfectly good choice. It has a clear advantage in traditional monitoring, especially server-related monitoring, and when the environment changes infrequently Zabbix can serve better than Prometheus.

In a cloud environment, however, unless your team has deep Zabbix expertise and can build extensive customizations, Prometheus is the better choice; that is, after all, what it was built for. Prometheus has become the dominant, de facto standard for container monitoring, fits naturally with Kubernetes, and will be widely used for the foreseeable future. If you simply need a monitoring system, pick Prometheus without hesitation.


Six: Several deployment modes of Prometheus

6.1 Basic High Availability Mode

The basic HA mode only ensures the availability of the Prometheus service; it solves neither data consistency nor persistence between Prometheus servers (lost data cannot be recovered), nor does it allow dynamic scaling. This deployment therefore suits scenarios where the monitoring scale is modest, the Prometheus servers are rarely migrated, and only short-term monitoring data needs to be kept.

6.2 Basic high availability + remote storage

On top of ensuring service availability, this mode also guarantees data persistence: if a Prometheus server crashes or loses data, it can be restored quickly, and Prometheus servers can be migrated easily. This solution suits deployments of modest scale where monitoring data must be persisted and server portability matters.

6.3 Basic HA + remote storage + federated cluster solution

Prometheus's performance bottleneck is mainly the volume of collection work, so the federation feature can be used to split different kinds of collection tasks across separate Prometheus sub-servers, partitioning by function. For example, one Prometheus server collects infrastructure-related metrics while another collects application metrics, and an upper-layer Prometheus server aggregates the data.


Seven: Four data types of Prometheus 

7.1 Counter

Counter is a counter type:

  • Counter accumulates values, e.g. the number of requests, completed tasks, or errors.
  • It only ever increases, never decreases.
  • It resets to zero when the process restarts.
# Example Counter-type samples
http_response_total{method="GET",endpoint="/api/tracks"}  100
http_response_total{method="GET",endpoint="/api/tracks"}  160

Counter-type data makes it easy to reason about the rate at which events occur, and PromQL's built-in functions provide the corresponding analysis. Taking HTTP request volume as an example:

1) Obtain the growth rate of HTTP requests through the rate() function: rate(http_requests_total[5m])

2) Query the top 10 HTTP addresses in the current system: topk(10, http_requests_total)
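To make the rate() idea concrete, here is a toy re-implementation of the calculation over (timestamp, value) counter samples. It assumes the two http_response_total samples above were taken 60 seconds apart (an invented interval), and unlike the real rate() it performs no extrapolation:

```python
def simple_rate(samples):
    """Per-second increase of a counter from (timestamp, value) samples,
    compensating for resets (value drops) the way PromQL's rate() does."""
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # on a reset the counter restarted at 0, so the whole new value counts
        increase += cur - prev if cur >= prev else cur
    duration = samples[-1][0] - samples[0][0]
    return increase / duration

# the two http_response_total samples above, 60 seconds apart:
print(simple_rate([(0, 100), (60, 160)]))  # 1.0 request/second
```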

7.2 Gauge

Gauge is a gauge type:

  • Gauge represents an ordinary instantaneous value, such as temperature or memory usage.
  • It can go up or down.
  • It resets when the process restarts.
# Example Gauge-type samples
memory_usage_bytes{host="master-01"}   100
memory_usage_bytes{host="master-01"}   30
memory_usage_bytes{host="master-01"}   50
memory_usage_bytes{host="master-01"}   80

For Gauge-type metrics, the change in the samples over a period of time can be obtained with the PromQL built-in function delta(). For example, to compute the change in CPU temperature over two hours:
delta(cpu_temp_celsius{host="zeus"}[2h])

You can also use the PromQL built-in function predict_linear() to predict the changing trend of the sample data based on simple linear regression. For example, based on 2 hours of sample data, to predict the remaining free disk space of the host after 4 hours: predict_linear(node_filesystem_free{job="node"}[2h], 4 * 3600) < 0
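predict_linear()'s behaviour can be imitated with an ordinary least-squares fit; a stdlib sketch over invented free-disk-space samples (the real function operates on a range vector, but the math is the same):

```python
def predict_linear(samples, seconds_ahead):
    """Least-squares fit over (timestamp, value) samples, extrapolated
    seconds_ahead past the last sample: a toy version of PromQL's predict_linear()."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    slope = cov / var
    intercept = mean_v - slope * mean_t
    return slope * (samples[-1][0] + seconds_ahead) + intercept

# free space shrinking 10 GiB/hour over two hours of samples;
# where will we be 4 hours after the last sample?
disk = [(0, 100.0), (3600, 90.0), (7200, 80.0)]
print(predict_linear(disk, 4 * 3600))  # 40.0
```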

7.3 Histogram

Histogram function and characteristics

Histogram is a histogram type. In Prometheus's query language it serves three purposes:

1) It samples observations over a period of time (usually request durations or response sizes) and counts them into configurable buckets; samples can later be filtered by bucket upper bound (le), the total number of samples counted, and the data displayed as a histogram.

2) It accumulates the sum of all sample values (sum).

3) It accumulates the count of samples (count).

Metric names: for a base name [basename], the three series above are exposed as:

1) [basename]_bucket{le="<upper bound>"}, the number of samples less than or equal to the upper bound
2) [basename]_sum
3) [basename]_count

Note: If you define a metric type as Histogram, Prometheus will automatically generate three corresponding metrics

Why use a histogram?

In most cases people reach for the average of a quantitative metric, such as average CPU usage or average page response time. The problem with averages is obvious. Take the average response time of system API calls as an example: if most API requests complete within 100ms but a few individual requests take 5s, the average is pulled toward the slow requests and masks the fact that most pages were fast. This phenomenon is known as the long-tail problem.

To distinguish average slowness from long-tail slowness, the simplest approach is to group requests by latency range: for example, count the requests with latency between 0 and 10ms, then those between 10 and 20ms, and so on. This makes the cause of slowness quick to analyze. Both Histogram and Summary exist to solve exactly this kind of problem: through Histogram- and Summary-type metrics, we can quickly see how the monitored samples are distributed.
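The grouping-by-latency-range idea is exactly what a Histogram's cumulative buckets implement; a small sketch with invented latencies and bucket bounds:

```python
import math

def to_cumulative_buckets(latencies, bounds):
    """Count observations into cumulative le-buckets, plus _sum and _count:
    the three kinds of series a Histogram metric exposes."""
    buckets = {le: sum(1 for v in latencies if v <= le) for le in bounds}
    buckets[math.inf] = len(latencies)  # le="+Inf" always equals _count
    return buckets, sum(latencies), len(latencies)

lat = [0.004, 0.008, 0.012, 0.047, 5.0]   # one long-tail request at 5s
buckets, total, count = to_cumulative_buckets(lat, [0.005, 0.01, 0.05, 1.0])
# buckets: {0.005: 1, 0.01: 2, 0.05: 4, 1.0: 4, inf: 5}
```

Note how each bucket count includes every smaller bucket, which is why the 5s outlier shows up only in the +Inf bucket.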

A sample of the Histogram type will provide three indicators (assuming the indicator name is  <basename>):

1) The number of samples falling in each bucket, named <basename>_bucket{le="<upper bound>"}. Put simply, this value is the count of all samples whose value is less than or equal to the upper bound.

# 1. Of the 2 total requests, 0 had a response time <= 0.005s
io_namespace_http_requests_latency_seconds_histogram_bucket{path="/",method="GET",code="200",le="0.005",} 0.0
 
# 2. Of the 2 total requests, 0 had a response time <= 0.01s
io_namespace_http_requests_latency_seconds_histogram_bucket{path="/",method="GET",code="200",le="0.01",} 0.0
 
# 3. Of the 2 total requests, 0 had a response time <= 0.025s
io_namespace_http_requests_latency_seconds_histogram_bucket{path="/",method="GET",code="200",le="0.025",} 0.0
io_namespace_http_requests_latency_seconds_histogram_bucket{path="/",method="GET",code="200",le="0.05",} 0.0
io_namespace_http_requests_latency_seconds_histogram_bucket{path="/",method="GET",code="200",le="0.075",} 0.0
io_namespace_http_requests_latency_seconds_histogram_bucket{path="/",method="GET",code="200",le="0.1",} 0.0
io_namespace_http_requests_latency_seconds_histogram_bucket{path="/",method="GET",code="200",le="0.25",} 0.0
io_namespace_http_requests_latency_seconds_histogram_bucket{path="/",method="GET",code="200",le="0.5",} 0.0
io_namespace_http_requests_latency_seconds_histogram_bucket{path="/",method="GET",code="200",le="0.75",} 0.0
io_namespace_http_requests_latency_seconds_histogram_bucket{path="/",method="GET",code="200",le="1.0",} 0.0
io_namespace_http_requests_latency_seconds_histogram_bucket{path="/",method="GET",code="200",le="2.5",} 0.0
io_namespace_http_requests_latency_seconds_histogram_bucket{path="/",method="GET",code="200",le="5.0",} 0.0
io_namespace_http_requests_latency_seconds_histogram_bucket{path="/",method="GET",code="200",le="7.5",} 2.0
 
# 4. Of the 2 total requests, 2 had a response time <= 10s
io_namespace_http_requests_latency_seconds_histogram_bucket{path="/",method="GET",code="200",le="10.0",} 2.0
io_namespace_http_requests_latency_seconds_histogram_bucket{path="/",method="GET",code="200",le="+Inf",} 2.0

2) The sum of the sizes of all sample values, named <basename>_sum

# Meaning: the total response time of the 2 http requests was 13.107670803000001 seconds
io_namespace_http_requests_latency_seconds_histogram_sum{path="/",method="GET",code="200",} 13.107670803000001

3) The total number of samples, named <basename>_count; its value is the same as <basename>_bucket{le="+Inf"}.

# Meaning: a total of 2 http requests have occurred so far
io_namespace_http_requests_latency_seconds_histogram_count{path="/",method="GET",code="200",} 2.0

Notes:

1) Buckets can be understood as partitions of the metric's value range, chosen according to how the values are distributed. Note that each bucket includes the samples of all smaller buckets: if xxx_bucket{...,le="0.01"} is 10 and xxx_bucket{...,le="0.05"} is 30, then of those 30 samples, 10 took less than 0.01s and the remaining 20 had response times between 0.01s and 0.05s.

2) Quantiles of Histogram-type samples can be computed with the histogram_quantile() function. If quantiles are unfamiliar, think of them as points that split the data: if the 0.9 quantile (quantile=0.9) of a sample set is x, then 90% of the sampled values are smaller than x. Histograms can also be used to compute the application performance index (Apdex) score.
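The interpolation histogram_quantile() performs inside a bucket can be sketched like this (the bucket bounds and counts are made up, and the real function handles edge cases this toy ignores):

```python
def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets,
    interpolating linearly within the bucket: a toy histogram_quantile()."""
    total = buckets[-1][1]          # the +Inf bucket holds the total count
    rank = q * total
    prev_le, prev_count = 0.0, 0
    for le, count in buckets:
        if count >= rank:
            if count == prev_count:
                return le
            return prev_le + (le - prev_le) * (rank - prev_count) / (count - prev_count)
        prev_le, prev_count = le, count

# cumulative counts: 10 observations <= 0.1s, 30 <= 0.5s, 40 in total
q50 = histogram_quantile(0.5, [(0.1, 10), (0.5, 30), (1.0, 40), (float("inf"), 40)])
# q50 ≈ 0.3: the median lands halfway through the (0.1, 0.5] bucket
```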

7.4 Summary

Similar to the Histogram type, Summary represents sampled observations over a period of time (usually request durations or response sizes), but it stores the quantiles directly (computed on the client and then exposed) instead of deriving them from bucketed intervals. It likewise serves three purposes:

1) It computes a quantile distribution over the sampling points (by analogy with exam results: the share of students scoring below 60, below 80, and below 95).

2) It records the sum of all observations (sum): the class's total score.

3) It records the count of observations (count): the number of students who took the exam.

A Summary metric with base name [basename] exposes the following series when scraped:

1) The observed φ-quantiles (0 ≤ φ ≤ 1), displayed as [basename]{quantile="<φ>"}

2) [basename]_sum, the sum of all observed values

3) [basename]_count, the count of observed events

The quantile distribution of sample values, named <basename>{quantile="<φ>"}:

# 1. Meaning: of these 12 http requests, 50% had a response time of at most 3.052404983s
io_namespace_http_requests_latency_seconds_summary{path="/",method="GET",code="200",quantile="0.5",} 3.052404983
 
# 2. Meaning: of these 12 http requests, 90% had a response time of at most 8.003261666s
io_namespace_http_requests_latency_seconds_summary{path="/",method="GET",code="200",quantile="0.9",} 8.003261666

The sum of sizes of all sample values, named  <basename>_sum.

# Meaning: the total response time of these 12 http requests was 51.029495508s
io_namespace_http_requests_latency_seconds_summary_sum{path="/",method="GET",code="200",} 51.029495508

The total number of samples, named  <basename>_count.

# Meaning: a total of 12 http requests have occurred so far
io_namespace_http_requests_latency_seconds_summary_count{path="/",method="GET",code="200",} 12.0

Similarities and differences between Histogram and Summary:

Both expose <basename>_sum and <basename>_count. A Histogram needs its <basename>_bucket series to compute quantiles (via histogram_quantile()), whereas a Summary stores the quantile values directly.

prometheus_tsdb_wal_fsync_duration_seconds{quantile="0.5"} 0.012352463
prometheus_tsdb_wal_fsync_duration_seconds{quantile="0.9"} 0.014458005
prometheus_tsdb_wal_fsync_duration_seconds{quantile="0.99"} 0.017316173
prometheus_tsdb_wal_fsync_duration_seconds_sum 2.888716127000002
prometheus_tsdb_wal_fsync_duration_seconds_count 216
 
# From the samples above: the Prometheus server has performed 216 wal_fsync operations so far, taking 2.888716127000002s in total; the median (quantile=0.5) duration is 0.012352463s and the 0.9 quantile is 0.014458005s.

Eight: Prometheus storage

8.1 Storage principle

1. Prometheus ships with a local time-series database (TSDB). Since version 2.0 its data compression has improved greatly, and in single-node setups it meets most users' needs; however, local storage stands in the way of clustering Prometheus, so in clustered setups another time-series store, such as InfluxDB, should be used instead.

2. Prometheus stores data in blocks, each covering a two-hour window. Samples are held in memory first and automatically written to disk once the two-hour window is complete.

3. To avoid losing data if the program exits abnormally, Prometheus uses a write-ahead log (WAL): while the most recent two hours of data sit in memory, every write is also appended to a log stored in the wal directory under the block. When the program starts again, the data in the wal directory is replayed into the corresponding block, recovering the lost state.
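The WAL idea in point 3 can be sketched as an append-only log that is replayed on restart (the JSON-lines format and file name here are invented for illustration; the real TSDB WAL uses binary segment files):

```python
import json
import os
import tempfile

def wal_append(path, sample):
    """Append one sample to the write-ahead log before updating memory."""
    with open(path, "a") as f:
        f.write(json.dumps(sample) + "\n")

def wal_replay(path):
    """On restart, rebuild the in-memory head block from the log."""
    if not os.path.exists(path):
        return []
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

wal = os.path.join(tempfile.mkdtemp(), "wal.log")
for s in [{"t": 1, "v": 0.5}, {"t": 2, "v": 0.7}]:
    wal_append(wal, s)

memory = wal_replay(wal)   # simulate a crash, then recover on startup
```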

8.2 Local storage 

Data is written straight to local disk. For performance, SSDs are recommended, and data should not be retained for more than about a month. Remember: no version of Prometheus supports NFS; real production incidents have shown that storing Prometheus data on NFS can corrupt or lose historical data.

8.3 Remote storage 

Remote storage suits large volumes of monitoring data. The remote backends supported by Prometheus include OpenTSDB, InfluxDB, Elasticsearch, Kafka, PostgreSQL, and others. Remote storage requires an adapter layer in the middle to translate through Prometheus's remote_write and remote_read interfaces. In production, remote storage raises all kinds of issues that demand continuous optimization, load testing, architectural changes, and sometimes rewriting the data-upload logic.


Origin blog.csdn.net/ver_mouth__/article/details/126269699