[Cloud Native] PromQL of Prometheus

Foreword

After Prometheus collects sample data for the monitored metrics through an Exporter, we can query that data with PromQL in order to analyze the samples and define alerting rules.

1. Introduction to PromQL

PromQL (Prometheus Query Language) is the built-in query language of Prometheus. It lets users query and aggregate time series data in real time.

Prometheus uniquely identifies a time series by its metric name and its attached label set. The metric name identifies one kind of measurable attribute on the monitored target, and the labels subdivide that attribute into multiple measurable dimensions.

With PromQL expressions, users can filter, aggregate, and run statistical operations over a given metric and its dimensions to produce the desired result. Depending on the metric, the labels, and the time range used, an expression can cover samples within a range on one or more time series, or just a single sample of a single time series.

 

 2. Understanding the meaning of PromQL data sample information

2.1  Prometheus  data model 

In Prometheus, each time series is uniquely identified by a metric name and a set of labels. The format is: <metric_name>{<label_name>=<label_value>, ...}

●Metric name: usually describes a characteristic of the system to be measured
For example, prometheus_http_requests_total indicates the total number of HTTP requests received

●Label: key-value data attached to the metric name so that the metric can carry multi-dimensional information; optional
For example, prometheus_http_requests_total{code="200"} and prometheus_http_requests_total{code="302"} represent two different time series

●Labels that begin with a double underscore (such as __address__) are labels reserved by Prometheus itself and are not exposed on the /metrics page;

●These system labels are not shown directly on the Targets page; they appear only when the mouse hovers over the Labels field.
 

Common system default labels:

__address__: the socket address of the current target instance, <host>:<port>
__scheme__: the protocol used when scraping metrics from the current target (http or https)
__metrics_path__: the URI path used when scraping metrics from the current target; the default is /metrics
__param_<name>: the value of the first URL parameter named <name> that is passed with the scrape request
__name__: reserved for the metric name, so a label selector can also filter by metric name
 

Precautions for the use of indicator names and labels:


A specific combination of metric name and labels represents one time series; the same metric name with different labels represents different time series; different metric names naturally represent different time series

PromQL supports filtering and aggregation along the label dimensions defined on a metric. Changing any label value, including adding or deleting a label, creates a new time series. Labels should therefore be kept as stable as possible; otherwise new time series keep being created, the data set churns, the monitored data source becomes hard to track, and graphs, alerts, and recording rules built on the metric can stop working
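
The identity rule above can be sketched in Python. This is only an illustration: the `series_id` helper is hypothetical and models a series as a metric name plus a frozen set of label pairs, which is an assumption for the example and not how Prometheus stores data internally.

```python
def series_id(metric_name, labels):
    """Model a time series identity as (metric name, frozen label set)."""
    return (metric_name, frozenset(labels.items()))

a = series_id("prometheus_http_requests_total", {"code": "200"})
b = series_id("prometheus_http_requests_total", {"code": "302"})
c = series_id("prometheus_http_requests_total", {"code": "200"})

# Same name and labels -> same series; any label change -> a different series.
same_series = (a == c)
different_series = (a != b)
```

Label order does not matter in this model, only the set of key-value pairs does.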
 

2.2 Sample data format 

Each data sample in Prometheus consists of two parts:
●A timestamp with millisecond precision
●A value in float64 format

2.3 PromQL data types 


 PromQL expressions support 4 data types:
●Instant vector: a set of samples sharing the same timestamp, taken from a specific set of time series or from all time series
●Range vector: all samples within a specified time range, taken from a specific set of time series or from all time series
●Scalar: a single floating-point value
●String: a string value, which can be quoted with single or double quotes

 

2.4 Time series Selectors 


A PromQL query may need to run over samples from several time series, so selecting the target time series is the most critical step when building an expression;

With a vector selector expression, users can select either the instant sample values of all or some of the time series under a given metric name, or their sample values within a past time range. The former is called an instant vector selector and the latter a range vector selector.
 

(1) Instant Vector Selectors
An instant vector selector returns 0, 1, or more time series, each with a single sample at the given timestamp (instant).

The instantaneous vector selector consists of two parts:
◆Indicator name: used to limit the time series under a specific indicator, that is, responsible for filtering indicators; optional
◆Label selector: used to filter labels on the time series; defined in {}; optional

When defining an instant vector selector, at least one of the two parts must be given, so the following three combinations exist:

◆Metric name only, or the metric name with an empty label selector: returns the instant samples of all time series under the given metric

For example, prometheus_http_requests_total and prometheus_http_requests_total{} are equivalent; both return the instant samples of every time series under this metric

◆Label selector only: returns the instant samples of all time series that match the given label selector

For example, {code="200", job="prometheus"}; the matching time series may have different metric names

◆Metric name combined with a label selector: returns the instant samples of all time series that are under the given metric and satisfy the given label filter

For example, prometheus_http_requests_total{code="200", job="prometheus"} returns the instant samples of the time series under this metric whose code label is 200 and whose job label is prometheus

Label matchers define the label filter conditions. The following four matching operators are currently supported:
= : exactly equal
!= : not equal
=~ : regular expression matches
!~ : regular expression does not match

Precautions:

◆A matcher with an empty label value also selects all time series that do not have the label at all

◆Regular expressions are fully anchored, so the pattern must match the entire value of the label

◆A vector selector must contain either a metric name or at least one label matcher that does not match the empty string

For example, {job=""} is an illegal vector selector

◆The metric name can also be filtered by using __name__ as the label name
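
The matcher rules above can be sketched in Python. This is a rough model under stated assumptions: the `matches` and `select` helpers are hypothetical, Python's `re` module stands in for the RE2 engine that Prometheus uses, and a missing label is treated as an empty string.

```python
import re

def matches(op, pattern, value):
    """Apply one PromQL-style label matcher to a label value.

    A missing label is represented by the empty string, which mirrors the
    rule that an empty-value matcher also selects series without the label.
    """
    if op == "=":
        return value == pattern
    if op == "!=":
        return value != pattern
    if op == "=~":
        return re.fullmatch(pattern, value) is not None  # fully anchored
    if op == "!~":
        return re.fullmatch(pattern, value) is None
    raise ValueError(f"unknown matcher operator: {op}")

def select(series, matchers):
    """Return the series (dicts of labels) that satisfy every matcher."""
    return [
        labels for labels in series
        if all(matches(op, pat, labels.get(name, "")) for name, op, pat in matchers)
    ]

series = [
    {"__name__": "prometheus_http_requests_total", "code": "200", "handler": "/api/v1/query"},
    {"__name__": "prometheus_http_requests_total", "code": "302", "handler": "/"},
    {"__name__": "prometheus_http_requests_total", "handler": "/metrics"},  # no code label
]

ok = select(series, [("code", "=", "200")])
api = select(series, [("handler", "=~", "/api.*")])
partial = select(series, [("handler", "=~", "api")])   # no match: regex is anchored
no_code = select(series, [("code", "=", "")])          # matches the series without code
```

The `partial` case shows why a pattern such as api does not select /api/v1/query: the pattern has to cover the whole label value.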

(2) Range Vector Selectors 

A range vector selector returns 0, 1, or more time series, each with the set of samples that fall within a given time range.
It differs from an instant vector selector in that the time range of the samples to return is expressed by appending a duration in [] to the instant vector selector.

Time range: the current time is the reference point and the range reaches back a given length of time; for example, [5m] means the past 5 minutes.
◆The available time units are ms (milliseconds), s (seconds), m (minutes), h (hours), d (days), w (weeks) and y (years)
◆Durations must use integer values; several units can be chained from the largest unit to the smallest, such as 1h30m, but 1.5h cannot be used
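
The duration rules above can be checked with a small Python sketch. The `parse_duration` helper is hypothetical and written only for illustration; it assumes the PromQL conventions that a day is 24h, a week is 7d, and a year is 365d.

```python
import re

# Unit sizes in milliseconds; PromQL treats d as 24h, w as 7d and y as 365d.
UNIT_MS = {"ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000,
           "d": 86_400_000, "w": 604_800_000, "y": 31_536_000_000}
ORDER = ["y", "w", "d", "h", "m", "s", "ms"]  # largest unit first

def parse_duration(text):
    """Parse a duration such as '1h30m' into milliseconds (hypothetical helper)."""
    parts = re.findall(r"(\d+)(ms|[smhdwy])", text)
    # Reject anything that is not purely integer+unit pairs, e.g. '1.5h'.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"invalid duration: {text!r}")
    positions = [ORDER.index(u) for _, u in parts]
    # Units must appear from largest to smallest, each at most once.
    if positions != sorted(set(positions)):
        raise ValueError(f"units out of order: {text!r}")
    return sum(int(n) * UNIT_MS[u] for n, u in parts)

ninety_minutes = parse_duration("1h30m")   # 5_400_000 ms
five_minutes = parse_duration("5m")        # 300_000 ms
```

Calling `parse_duration("1.5h")` or `parse_duration("30m1h")` raises `ValueError` in this sketch, matching the two restrictions listed above.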

 

2.5 Offset Vector Selector 

The selectors introduced above all use the current time as the reference time by default. The offset modifier shifts that reference time back into the past. It follows the selector and uses the keyword offset to specify how far to shift.

For example, prometheus_http_requests_total offset 5m returns the instant samples, as of 5 minutes ago, of all time series with the metric name prometheus_http_requests_total;
prometheus_http_requests_total[5m] offset 1d returns all samples in the 5-minute window that ended 1 day ago
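
How a range and an offset combine can be sketched as simple arithmetic on timestamps. The `sample_window` helper and the evaluation time below are assumptions made for illustration; the sketch ignores details such as whether the window boundaries are inclusive.

```python
def sample_window(now_ms, range_ms, offset_ms=0):
    """Return (start, end) of the window a range selector reads, in ms."""
    end = now_ms - offset_ms      # offset moves the reference time into the past
    start = end - range_ms        # the range then reaches back from that point
    return start, end

MINUTE = 60_000
DAY = 24 * 60 * MINUTE
now = 10 * DAY  # an arbitrary evaluation time chosen for illustration

# prometheus_http_requests_total[5m]
plain = sample_window(now, 5 * MINUTE)
# prometheus_http_requests_total[5m] offset 1d
shifted = sample_window(now, 5 * MINUTE, DAY)
```

Both windows are 5 minutes long; the offset only moves where the window ends.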

#Key points for using vector expressions:
The return value of an expression is also one of the four data types: instant vector, range vector, scalar, or string. Some usage scenarios require the return value to meet specific conditions, for example:
(1) When the return value is to be drawn as a graph, only instant vector data is supported;
(2) Rate functions such as rate and irate require range vector data

●Because a range vector selector returns range vector data, it cannot be graphed directly in the expression browser

●A range vector selector is usually used together with rate functions such as rate and irate
 

2.6 Metric Types of PromQL 

Prometheus has four metric types:


●Counter: stores monotonically increasing data, such as the number of site visits. The value only increases, cannot be decreased or negative, and is reset to 0 when the process restarts;


●Gauge: stores data that fluctuates, such as free memory size. The value can go up or down, and it is reset when the process restarts;


●Histogram: a cumulative histogram that sorts the observed values into buckets by value range and records the number of samples in each bucket as well as the sum of all sample values, so quantiles can be calculated;
◆It can be used to analyze cases where outliers inflate the average;
◆Quantiles should be calculated with the dedicated histogram_quantile function;


●Summary: similar to Histogram, but the quantiles are calculated on the client side and reported directly;
 

(1)  Counter type

The raw total of a Counter is rarely useful on its own. Functions such as rate, topk, increase and irate are used to derive how the sample data changes (growth rate/change rate):

#Get the 3 time series with the highest total number of http requests under this metric
topk(3, prometheus_http_requests_total)

 

#Get the growth rate of the total number of http requests on each time series under this metric within 1 hour
rate(prometheus_http_requests_total[1h])

 

#irate is a high-sensitivity function that calculates the instantaneous rate of a metric from the last two samples in the range. Compared with rate, irate is better suited to analyzing the rate of change over a short time range.
 
irate(prometheus_http_requests_total[1h])
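
The difference between rate and irate can be seen in a simplified Python model. This is a sketch under assumptions: the sample data is invented, and the helpers leave out the extrapolation to the window boundaries that the real rate and increase functions perform.

```python
def increase(samples):
    """Total counter growth over (timestamp_s, value) samples, reset-aware."""
    total = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter restarted from 0, so count cur as growth.
        total += cur - prev if cur >= prev else cur
    return total

def rate(samples):
    """Average per-second growth across the whole window (no extrapolation)."""
    return increase(samples) / (samples[-1][0] - samples[0][0])

def irate(samples):
    """Per-second growth between the last two samples only."""
    return rate(samples[-2:])

# Hypothetical counter scraped every 15s: slow growth, then a burst at the end.
window = [(0, 100.0), (15, 103.0), (30, 106.0), (45, 109.0), (60, 169.0)]
avg_rate = rate(window)       # 69 / 60 requests per second
instant_rate = irate(window)  # 60 / 15 requests per second
```

In this data the burst in the last interval dominates irate, while rate spreads it over the full minute.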

 

(2) Gauge type 


 Gauge stores sample data for metrics whose values can go up or down. It is often used in aggregations such as sum, average, minimum, and maximum, and it is also often combined with PromQL's delta and predict_linear functions:

 The delta function calculates the difference between the first and last value of each time series element in a range vector, showing how much the sample value changed between the two points in time
 

eg: Return the difference between the current CPU temperature on this server and the temperature 2 hours ago
delta(cpu_temp_celsius{host="node01"}[2h])

 The predict_linear function predicts the value of time series v after t seconds, using linear regression to estimate the trend of the sample data
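
Both functions can be approximated in a few lines of Python. This is a simplified sketch: the temperature readings are invented, `delta` skips the extrapolation Prometheus applies, and `predict_linear` treats the last sample time as the evaluation time, which is an assumption for the example.

```python
def delta(samples):
    """Last value minus first value of a gauge window (no extrapolation)."""
    return samples[-1][1] - samples[0][1]

def predict_linear(samples, t_seconds):
    """Least-squares line through the samples, evaluated t_seconds after the
    last sample (assumption: the last sample time stands in for 'now')."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    slope = (sum((t - mean_t) * (v - mean_v) for t, v in samples)
             / sum((t - mean_t) ** 2 for t, _ in samples))
    intercept = mean_v - slope * mean_t
    return intercept + slope * (samples[-1][0] + t_seconds)

# Hypothetical CPU temperature readings, one per hour: (seconds, celsius).
temps = [(0, 50.0), (3600, 52.0), (7200, 54.0)]
change = delta(temps)                     # rise over the 2-hour window
forecast = predict_linear(temps, 3600)    # value one hour later if the trend holds
```

With a steady rise of 2 degrees per hour, the window shows a 4 degree change and the forecast one hour ahead is about 56 degrees.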

(3) Histogram type


In Prometheus, a Histogram samples observations over a period of time (usually request durations, response sizes, and the like) and counts them into configurable buckets. Samples can then be filtered by interval or counted in total, and the data is typically displayed as a histogram.

Prometheus buckets are cumulative, that is, the count of each bucket includes the samples of all previous buckets, which is why it is called a cumulative histogram.

Each Histogram metric has a base metric name <basename>, under which it provides multiple time series:
●<basename>_sum: the sum of all sample values

●<basename>_count: the total number of samples, which is itself essentially a Counter

●<basename>_bucket{le="<upper boundary>"}: one series per observation bucket, giving the number of samples whose value is less than or equal to the upper boundary

●<basename>_bucket{le="+Inf"}: the number of samples in the largest bucket (which includes all samples)

#Using histogram 
In most cases, people tend to look at the average of a quantitative metric, such as average CPU usage or average page response time. The weakness of this approach is obvious. Take the average response time of system API calls as an example: if most API requests respond within 100ms but a few requests take 5s, the average is pulled away from what most requests actually experience and hides the slow ones. This phenomenon is called the long-tail problem.
To tell overall slowness from long-tail slowness, the easiest way is to group requests by latency range, for example counting the requests with a delay between 0 and 10 ms and those with a delay between 10 and 20 ms. The cause of the slowness can then be analyzed quickly. Both Histogram and Summary are designed to solve this kind of problem: with metrics of these types, we can quickly see how the monitored samples are distributed.

The number of http requests with a response time <= 0.005 seconds is 10
prometheus_http_request_duration_seconds_bucket{handler="/metrics",le="0.005"} 10

The sum of all sample values, named <basename>_sum

prometheus_http_request_duration_seconds_sum{handler="/metrics"} 10.107670803000001

The total number of samples, named <basename>_count; it has the same value as <basename>_bucket{le="+Inf"}

prometheus_http_request_duration_seconds_count{handler="/metrics"} 20

Note:
A bucket can be understood as one division of the value range of the metric, and the boundaries should be chosen according to how the values are distributed. Each bucket also includes the samples of the buckets before it. Assuming that prometheus_http_request_duration_seconds_bucket{...,le="0.01"} is 10 and prometheus_http_request_duration_seconds_bucket{...,le="0.05"} is 30, then 10 of the 30 samples are less than or equal to 0.01s, and the response time of the remaining 20 samples is between 0.01s and 0.05s.
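
The arithmetic in the note can be written out as a short Python sketch. The `per_bucket_counts` helper is hypothetical; it simply subtracts each cumulative count from the next one.

```python
def per_bucket_counts(cumulative):
    """Convert cumulative (upper_bound, count) buckets to per-interval counts."""
    result = []
    previous = 0
    for upper_bound, count in cumulative:
        result.append((upper_bound, count - previous))
        previous = count
    return result

# Values taken from the note above: le="0.01" -> 10, le="0.05" -> 30.
buckets = [(0.01, 10), (0.05, 30)]
counts = per_bucket_counts(buckets)  # 10 samples up to 0.01s, 20 between 0.01s and 0.05s
```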

 

Sample data produced by the cumulative bucket mechanism needs the built-in histogram_quantile function to calculate a quantile from a Histogram metric, that is, the value below which a given proportion of all samples falls.
When calculating a quantile, histogram_quantile assumes that the samples inside each bucket are distributed linearly, so the result is only an estimate and is not completely accurate. The accuracy of the estimate depends on the granularity of the buckets: the coarser the buckets, the lower the accuracy

For example, if the 0.9 quantile (quantile=0.9) of the http request response time is 0.01, then about 90% of the samples have a value less than or equal to 0.01
 

histogram_quantile(0.9, prometheus_http_request_duration_seconds_bucket{handler="/metrics"})
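
The linear-distribution assumption can be made concrete with a simplified Python version of the estimate. This is a sketch, not the Prometheus implementation: the bucket counts are invented, all bucket bounds are assumed to be positive, and q is assumed to lie in (0, 1].

```python
import math

def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    Buckets must be sorted by upper bound and end with the +Inf bucket.
    Samples are assumed to be spread linearly inside each bucket.
    """
    total = buckets[-1][1]
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[index]
    if math.isinf(upper):
        # The quantile falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    lower, below = (0.0, 0) if index == 0 else buckets[index - 1]
    # Linear interpolation inside the selected bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)

# Hypothetical bucket counts chosen for illustration.
buckets = [(0.005, 10), (0.01, 18), (0.05, 20), (math.inf, 20)]
p50 = histogram_quantile(0.5, buckets)
p90 = histogram_quantile(0.9, buckets)
p95 = histogram_quantile(0.95, buckets)
```

Because the 0.95 quantile lands in the wide 0.01 to 0.05 bucket, its estimate is an interpolated point inside that bucket, which is where coarse buckets cost accuracy.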

(4) Summary type

A Histogram only does bucket division and bucket counting on the client side; the quantile is estimated by Prometheus Server from the sample data, so the result may be inaccurate, and poorly chosen buckets can lead to large errors.

Summary is a metric type similar to Histogram, but the client itself gathers statistics over the observations within a period of time (10 minutes by default), calculates and stores the quantile values, and the server simply scrapes them.

For each metric, Summary uses the metric name <basename> as a prefix and generates the following series:
●<basename>_sum: the sum of all sample values

●<basename>_count: the total number of samples

●<basename>{quantile="x"}: the value of the sample distribution at quantile x, where 0 ≤ x ≤ 1
 

 For example, a Summary sample for the wal_fsync operation of Prometheus Server might report that 216 operations have been performed, taking 2.888716127000002s in total, that the median (quantile=0.5) duration is 0.012352463s, and that the 0.9 quantile (quantile=0.9) duration is 0.014458005s.

 (5) Similarities and differences between the Histogram type and the Summary type.
 Both of them contain the <basename>_sum and <basename>_count indicators. Histogram needs to calculate the quantile through the <basename>_bucket, while the Summary directly stores the quantile value.
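
Because both types expose <basename>_sum and <basename>_count, the mean observation can be derived from either of them. A minimal Python sketch, using the wal_fsync figures quoted in the Summary section; the variable names are made up for the example.

```python
# Figures quoted in the Summary section for the wal_fsync sample.
fsync_sum_seconds = 2.888716127000002   # <basename>_sum
fsync_count = 216                       # <basename>_count

# Mean duration per operation: total time divided by number of operations.
mean_seconds = fsync_sum_seconds / fsync_count

# Compare the mean with the reported median and 0.9 quantile.
median_seconds = 0.012352463
p90_seconds = 0.014458005
mean_between = median_seconds < mean_seconds < p90_seconds
```

Here the mean is roughly 13.4 ms, which lies between the median and the 0.9 quantile.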

 

3. Syntax application of PromQL

 3.1 Basic query

The general form of a basic Prometheus query is <metric name>{label=value}.

(1) Single metric query with empty labels 

eg: Query the total number of HTTP requests received 

prometheus_http_requests_total

prometheus_http_requests_total{}

 Note: if the label sets in {} after the metric differ in even one label, they are independent time series. Only when the metric name and all labels are identical is it the same time series

 The queries above use just the metric name, or the metric name with an empty label selector, to match broadly. A label selector on its own can match broadly in the same way (a label must be written as a key-value pair; a key alone or a value alone is not valid PromQL)

(2) Single label query 

eg: View all time series containing the label instance="localhost:9090"

{instance="localhost:9090"}

 

(3) Combine metric name and labels to narrow the query

eg: View the total number of http requests where the instance is the local machine and the response code is 302 

prometheus_http_requests_total{instance="localhost:9090",code="302"}

 

3.2 Time Range Query 


(1) Time range query 
 In the basic query cases above, a query of the form <metric name>{label=value} returns only the latest value of each time series. This result type is called an instant vector. PromQL can also return the set of samples of a time series within a certain time range, which is called a range vector.

 

 eg: View the total number of http requests (instance is the local machine, response code is 302) within the last 1 minute
 

prometheus_http_requests_total{instance="localhost:9090",code="302"}[1m]

 

The result contains every sample collected within the last minute

 In addition to using m for minutes, PromQL's time range selector supports other time units:

s - seconds

m - minutes

h - hours

d - days

w - weeks

y - years

Note: Prometheus does not accept decimal values in a time range. For example, to query a period of one and a half days, use [36h] rather than [1.5d] 
 

(2) Time offset query 

In a time series query, besides using the current time as the reference, offset can be used to shift the reference back in time.

 eg: View the total number of http requests (instance is the local machine, response code is 302) as of five minutes ago (instant vector)

prometheus_http_requests_total{instance="localhost:9090",code="302"} offset 5m

 

eg: View the total number of http requests (instance is the local machine, response code is 302) over the 1-minute range that ended five minutes ago

prometheus_http_requests_total{instance="localhost:9090",code="302"}[1m] offset 5m

 

3.3 Aggregation operation query

PromQL provides many built-in aggregation operators, which aggregate the samples of an instant vector into a new vector. The currently supported aggregation operators include sum, min, max, avg, stddev, stdvar, count, count_values, bottomk, topk, and quantile.

 

The aggregate operation syntax format in PromQL can take one of the following two formats:


● <aggregate function> (vector expression) by|without (label)
● <aggregate function> by|without (label) (vector expression)

(1) Calculate the sum of all http requests

sum(prometheus_http_requests_total{})

 

(2) Find the largest value among all http request time series

max(prometheus_http_requests_total{})

 

(3) Find the average value across all http request time series

avg(prometheus_http_requests_total{})

 

(4) Use topk to show the three time series with the highest values 

topk(3,prometheus_http_requests_total{})

 

(5) Use bottomk to show the five time series with the lowest values 

bottomk(5,prometheus_http_requests_total{})

 

In addition, an aggregation can take a without or by clause. without removes the listed labels from the result vector and aggregates over the remaining ones; by does the opposite, keeping only the listed labels in the result vector and dropping the rest


#Calculate the sum of values after removing the code, handler, and job labels
sum(prometheus_http_requests_total) without (code, handler, job)
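
The grouping behaviour of by and without can be modelled in Python. This is only a sketch: `sum_by` and `sum_without` are hypothetical helpers, the instant vector is invented, and the metric name is left out of the label sets for brevity.

```python
from collections import defaultdict

def sum_by(series, keep):
    """sum by (keep...): group on the listed labels only."""
    groups = defaultdict(float)
    for labels, value in series:
        key = tuple(sorted((k, v) for k, v in labels.items() if k in keep))
        groups[key] += value
    return dict(groups)

def sum_without(series, drop):
    """sum without (drop...): group on every label except the listed ones."""
    groups = defaultdict(float)
    for labels, value in series:
        key = tuple(sorted((k, v) for k, v in labels.items() if k not in drop))
        groups[key] += value
    return dict(groups)

# Hypothetical instant vector: (labels, value) pairs.
vector = [
    ({"instance": "localhost:9090", "code": "200", "handler": "/metrics"}, 40.0),
    ({"instance": "localhost:9090", "code": "302", "handler": "/"}, 2.0),
    ({"instance": "localhost:9090", "code": "200", "handler": "/graph"}, 8.0),
]

by_code = sum_by(vector, {"code"})
without_code_handler = sum_without(vector, {"code", "handler"})
```

Grouping by code yields one result per status code, while dropping code and handler leaves a single result per instance.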

3.4 Wildcard and regular expression matching label query


A metric in a PromQL query often has many labels with the same key but different values, each of which is a different time series. Time series can be located quickly through the relationship between key and value: a regular expression matcher can be used between the key and the value, so the value can be matched loosely with a pattern

 PromQL label matchers use RE2 regular expression syntax, and a pattern must match the whole label value.
 

 

 (1) Match the time series whose code label is not 200

prometheus_http_requests_total{code!="200"}

(2) Match the time series whose handler label starts with /api

 
 

prometheus_http_requests_total{handler=~"/api.+"}

prometheus_http_requests_total{handler=~"/api.*"}

 

 3.5 Use of operators

(1) Arithmetic operators
 PromQL expressions can also use operators to build more complex queries. Arithmetic operators apply addition, subtraction, multiplication, division, and so on to the sample values and return the calculated result.

All mathematical operators supported by PromQL are as follows:

+ (addition)

- (subtraction)

* (multiplication)

/ (division)

% (remainder)

^ (exponentiation)

eg: The unit of the memory value obtained through process_virtual_memory_bytes is byte. When we want to convert it to GB, we only need to use the following expression to process.
 

process_virtual_memory_bytes/(1024*1024*1024)

 

(2) Comparison operators

 
 Comparison operators allow users to filter time series based on the values of their samples.

 The comparison operators supported by Prometheus are as follows:

== (equal)

!= (not equal)

> (greater than)

< (less than)

>= (greater than or equal to)

<= (less than or equal to)

eg: To query only the interfaces whose Prometheus request count is greater than 1,000, filter with the following comparison expression.

prometheus_http_requests_total > 1000

 The comparison can also be combined with the bool modifier. With bool, the expression no longer filters the data but returns 1 (true) or 0 (false) for each series according to the comparison result. 

prometheus_http_requests_total > bool 1000
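
The difference between filtering and the bool modifier can be modelled in Python. This is a sketch with invented request counts; the `greater_than` helper is hypothetical and uses the handler string in place of a full label set.

```python
def greater_than(vector, threshold, use_bool=False):
    """Model `vector > threshold` and `vector > bool threshold`."""
    if use_bool:
        # bool: keep every series, replace the value with 1 or 0.
        return [(labels, 1.0 if value > threshold else 0.0) for labels, value in vector]
    # default: drop series that fail the comparison, keep original values.
    return [(labels, value) for labels, value in vector if value > threshold]

# Hypothetical request counts per handler.
vector = [("/metrics", 5230.0), ("/graph", 12.0), ("/api/v1/query", 1000.0)]

filtered = greater_than(vector, 1000)
flags = greater_than(vector, 1000, use_bool=True)
```

Without bool only /metrics survives; with bool all three series remain and carry 1 or 0 as their value.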

 

(3) Logical operators


Logical operators support three operations: and, or, and unless (exclusion). and is an intersection: it keeps the series of the first expression that also appear in the second. unless is the opposite of and: it keeps the series of the first expression that have no match in the second. or is a union and matches the widest range: it returns all series of expression 1 plus the series of expression 2 that have no match in expression 1.

eg: time series samples with http requests between 100 and 1000

prometheus_http_requests_total<1000 and prometheus_http_requests_total>100

eg: time series samples with http requests greater than 100 or less than 1000 

prometheus_http_requests_total<1000 or prometheus_http_requests_total>100
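
The three set operations can be modelled with Python dictionaries. This is an illustrative sketch: the request counts are invented, the helper names are hypothetical, and the dictionary key stands in for the full label set that PromQL matches on.

```python
def op_and(left, right):
    """Keep left-hand series whose label set also appears on the right."""
    return {k: v for k, v in left.items() if k in right}

def op_or(left, right):
    """All left-hand series, plus right-hand series with no match on the left."""
    return {**{k: v for k, v in right.items() if k not in left}, **left}

def op_unless(left, right):
    """Keep left-hand series whose label set does not appear on the right."""
    return {k: v for k, v in left.items() if k not in right}

# Hypothetical request counts keyed by handler (standing in for the label set).
requests = {"/metrics": 5230.0, "/graph": 12.0, "/rules": 400.0}
below_1000 = {k: v for k, v in requests.items() if v < 1000}
above_100 = {k: v for k, v in requests.items() if v > 100}

between = op_and(below_1000, above_100)        # 100 < value < 1000
either = op_or(below_1000, above_100)          # every series in this data
only_small = op_unless(below_1000, above_100)  # below 1000 but not above 100
```

With this data, and keeps only /rules, or returns all three series, and unless keeps only /graph.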

 

Note: Prometheus operators have a precedence order. From highest to lowest it is (^) > (*, /, %) > (+, -) > (==, !=, <=, <, >=, >) > (and, unless) > (or). Keep this order in mind, and add parentheses where needed, to avoid wrong results.

3.6  Built-in functions 

 Prometheus has many built-in functions, and through the flexible application of these functions, it is more convenient to query and format data.

ceil function

The ceil function rounds up the value of the returned result to an integer.

Example:

ceil(avg(prometheus_http_requests_total{code="200"}))

floor function

The floor function is the opposite of ceil; it rounds the value down to an integer.

Example:

floor(avg(prometheus_http_requests_total{code="200"}))

rate function

The rate function is one of the most frequently used and most important functions. rate returns the average per-second increase over a time interval, taking all data points in the interval into account. It is usually applied to Counter metrics to see how fast they grow.

Example: the average number of new requests per second on prometheus_http_requests_total over the last 1 minute
 

rate(prometheus_http_requests_total{handler="/rules"}[1m])

irate function

Compared with rate, irate is more sensitive. It calculates the growth rate of a range vector from the last two samples in the interval, so a short peak is not flattened by the average over the whole range.

Example: the usage is the same as rate

irate(prometheus_http_requests_total{handler="/rules"}[1m])

Other built-in functions

 Besides the functions above, PromQL provides many other functions that cover day-to-day needs, such as label_replace for rewriting labels and histogram_quantile for calculating quantiles of Histogram metrics. For more information, please refer to the official documentation

4. Classic query example

(1) The average CPU usage of each host in the last 5 minutes

(1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance)) * 100

(2) Check whether the 1-minute load average exceeds twice the number of CPUs on the host

node_load1 > on (instance) 2 * count (node_cpu_seconds_total{mode="idle"}) by (instance)

(3) Calculate host memory usage
Available memory space: the sum of free memory, buffer, and cache indicators

node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes

Used memory space: total memory space minus free space

node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)

Utilization: used space divided by total space

(node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)) / node_memory_MemTotal_bytes * 100

(4) Calculate the total memory usage of all containers on each node, in GB:

sum  by (instance) (container_memory_usage_bytes{instance=~"node.*"})/1024/1024/1024


(5) Calculate the cpu usage of all containers in the last 1m of the node01 node:

sum (rate(container_cpu_usage_seconds_total{instance="node01"}[1m])) / sum (machine_cpu_cores{instance="node01"}) * 100
#container_cpu_usage_seconds_total represents the sum of the time the container occupies the CPU


(6) Calculate the change rate of cpu usage of each container in the last 5m

sum (rate(container_cpu_usage_seconds_total[5m])) by (container_name)


(7) Query the CPU usage rate of each Pod over the last 1m in the K8S cluster

#Because the queried data is per container, it is best to group and aggregate by Pod
sum (rate(container_cpu_usage_seconds_total{image!="", pod_name!=""}[1m])) by (pod_name)
 


Origin blog.csdn.net/zhangchang3/article/details/131813433