Foreword
After Prometheus has collected sample data for the monitored metrics through exporters, we can query that data with PromQL to analyze the samples and define alerting rules.
1. Introduction to PromQL
PromQL (Prometheus Query Language) is the built-in query language of Prometheus. It lets users query and aggregate time series data in real time.
Prometheus uniquely identifies a time series by its metric name and its attached label set. The metric name identifies a measurable characteristic of the monitored target, and the labels subdivide that characteristic into multiple measurable dimensions.
With PromQL expressions, users can filter, aggregate, and run statistical operations on a chosen metric and its dimensions to produce the desired result. Depending on the metric, the labels, and the time range used, an expression can cover the samples within a time range on one or more time series, or just a single sample on a single time series.
2. Understanding the meaning of PromQL data sample information
2.1 Prometheus data model
In Prometheus, each time series is uniquely identified by a metric name and a set of labels. The format is: <metric_name>{<label_name>=<label_value>, ...}
●Metric name: usually describes a characteristic of the system to be measured
For example, prometheus_http_requests_total is the total number of HTTP requests received
●Label: key-value data attached to the metric name so that the metric can carry multi-dimensional features; optional
For example, prometheus_http_requests_total{code="200"} and prometheus_http_requests_total{code="302"} are two different time series
●Labels that start with a double underscore (such as __address__) are internal labels of Prometheus and are not shown on the /metrics page
●Internal labels are not shown directly on the targets page either; they appear only when the mouse hovers over the label field
Common internal labels:
__address__: the socket address of the current target instance, <host>:<port>
__scheme__: the protocol used to scrape metrics from the current target (http or https)
__metrics_path__: the URI path used to scrape metrics from the current target; the default is /metrics
__param_<name>: the value of the first URL parameter named <name>
__name__: reserved for the metric name, so a label selector can filter on the metric name
Notes on using metric names and labels:
●A specific combination of metric name and labels is one time series; the same metric name with different labels is a different time series, and different metric names are naturally different time series
●PromQL filters and aggregates along the defined label dimensions. Changing any label value, or adding or removing a label, creates a new time series. Keep labels as stable as possible; otherwise new time series keep appearing, the data becomes hard to trace back to its source, and graphs, alerts, and recording rules built on the metric may stop working
2.2 Sample data format
Each Prometheus sample consists of two parts:
●A timestamp with millisecond precision
●A value in float64 format
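The data model and sample format described above can be sketched in a few lines of Python. This is only an illustration, not how Prometheus stores data internally; the names Sample and series_key and the timestamps are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    timestamp_ms: int  # millisecond-precision timestamp
    value: float       # Python floats are float64


def series_key(metric_name, labels):
    """Identity of a series: the metric name plus the full label set."""
    merged = dict(labels)
    merged["__name__"] = metric_name  # the metric name is itself a label
    return frozenset(merged.items())


ok = series_key("prometheus_http_requests_total", {"code": "200"})
redirect = series_key("prometheus_http_requests_total", {"code": "302"})
same = series_key("prometheus_http_requests_total", {"code": "200"})

# Hypothetical storage: one list of samples per series identity.
storage = {
    ok: [Sample(1700000000000, 10.0)],
    redirect: [Sample(1700000000000, 2.0)],
}
storage[same].append(Sample(1700000015000, 12.0))  # same key, same series

print(len(storage), len(storage[ok]))
```

Changing the code label produces a different key, which is why every new label value creates a new time series.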
2.3 PromQL data types
PromQL expressions support 4 data types:
●Instant vector: a set of samples sharing the same timestamp, one per time series, over some or all time series
●Range vector: all samples within a specified time range, over some or all time series
●Scalar: a single floating-point value
●String: a string value, written in single or double quotes
2.4 Time series Selectors
A PromQL query may operate on samples from several time series, so selecting the target time series is the most important step in building an expression.
A vector selector picks, for some or all time series under a given metric name, either the latest sample of each series or the samples within a past time range. The former is called an instant vector selector and the latter a range vector selector.
(1) Instant Vector Selectors
An instant vector selector returns 0, 1, or more time series, each with a single sample at the given timestamp (instant).
An instant vector selector has two parts:
◆Metric name: restricts the result to the time series under a specific metric; optional
◆Label selector: filters time series by their labels; written inside {}; optional
At least one of the two parts must be given, so three combinations exist:
◆Metric name only, or a metric name with an empty label selector: returns the instant samples of all time series under the metric
For example, prometheus_http_requests_total and prometheus_http_requests_total{} are equivalent; both return the instant sample of every time series under this metric
◆Label selector only: returns the instant samples of all time series that match the label selector
For example, {code="200", job="prometheus"}; the matching time series may have different metric names
◆Metric name plus label selector: returns the instant samples of all time series under the metric that also match the label filter
For example, prometheus_http_requests_total{code="200", job="prometheus"} returns the instant samples of the time series under this metric whose code is 200 and whose job is prometheus
Label selectors define label filter conditions. Four matching operators are supported:
= : exactly equal
!= : not equal
=~ : matches the regular expression
!~ : does not match the regular expression
Notes:
◆A matcher with an empty label value also matches all time series that do not have the label at all
◆Regular expressions are fully anchored, so the pattern must match the entire label value
◆A vector selector must contain a metric name or at least one label matcher that does not match the empty string
For example, {job=""} is an invalid vector selector
◆The metric name can also be filtered by using __name__ as the label name
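The matching rules above (four operators, full anchoring, and a missing label behaving like an empty value) can be sketched in Python. This is a simplified model: the helper names matches and select are hypothetical, and Python's re module stands in for the RE2 syntax that Prometheus actually uses.

```python
import re


def matches(labels, name, op, pattern):
    """Apply one label matcher; a missing label counts as the empty string."""
    value = labels.get(name, "")
    if op == "=":
        return value == pattern
    if op == "!=":
        return value != pattern
    if op == "=~":
        return re.fullmatch(pattern, value) is not None  # fully anchored
    if op == "!~":
        return re.fullmatch(pattern, value) is None
    raise ValueError(f"unknown operator: {op}")


def select(series, matchers):
    """Keep the label sets that satisfy every matcher."""
    return [s for s in series if all(matches(s, *m) for m in matchers)]


series = [
    {"__name__": "prometheus_http_requests_total", "code": "200", "handler": "/api/v1/query"},
    {"__name__": "prometheus_http_requests_total", "code": "302", "handler": "/"},
    {"__name__": "up", "job": "prometheus"},
]

api = select(series, [("handler", "=~", "/api.*")])
not_200 = select(series, [("__name__", "=", "prometheus_http_requests_total"), ("code", "!=", "200")])
no_code = select(series, [("code", "=", "")])
print(len(api), len(not_200), len(no_code))
```

Because of the anchoring, the pattern "/api" on its own would match nothing here; it has to be written as "/api.*" to cover the rest of the value.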
(2) Range Vector Selectors
A range vector selector returns 0, 1, or more time series, each with the set of samples that fall within the given time range.
It differs from an instant vector selector in that the time range of the samples to return is given as a duration in [] appended to the instant vector selector.
Time range: the current time is the reference point and the duration reaches back into the past; for example, [5m] means the past 5 minutes.
◆The available time units are ms (milliseconds), s (seconds), m (minutes), h (hours), d (days), w (weeks), and y (years)
◆Durations must use integers; several units can be combined in descending order of unit size, such as 1h30m, but 1.5h is not allowed
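The duration rules above can be illustrated with a small Python parser. It is a sketch, not the Prometheus parser: the function name parse_duration_ms is hypothetical, and it assumes that a week is 7 days and a year is 365 days.

```python
import re

UNIT_MS = {"ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000,
           "d": 86_400_000, "w": 604_800_000, "y": 31_536_000_000}
ORDER = ["y", "w", "d", "h", "m", "s", "ms"]  # larger units first


def parse_duration_ms(text):
    """Convert a duration such as '1h30m' to milliseconds."""
    parts = re.findall(r"(\d+)(ms|[smhdwy])", text)
    # Reject anything the pattern did not consume, e.g. the '.' in '1.5h'.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"invalid duration: {text!r}")
    units = [u for _, u in parts]
    if units != sorted(set(units), key=ORDER.index):
        raise ValueError(f"units must be unique and descending: {text!r}")
    return sum(int(n) * UNIT_MS[u] for n, u in parts)


print(parse_duration_ms("5m"), parse_duration_ms("1h30m"))
```

Calling parse_duration_ms("1.5h") raises ValueError, which mirrors the rule that one and a half hours must be written as 1h30m or 90m.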
2.5 Offset Vector Selector
The selectors introduced above use the current time as the reference time by default. The offset modifier shifts that reference time back into the past. It follows the selector and uses the keyword offset to specify how far to shift.
For example, prometheus_http_requests_total offset 5m returns the instant samples, as of 5 minutes ago, of all time series under the metric prometheus_http_requests_total;
prometheus_http_requests_total[5m] offset 1d returns all samples in the 5-minute window that ended 1 day ago
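The effect of offset on the two examples above comes down to simple arithmetic on timestamps, sketched here in Python. The function names and the evaluation time are hypothetical, and details such as staleness handling and the exact treatment of window boundaries are left out.

```python
MINUTE_MS = 60_000
DAY_MS = 24 * 60 * MINUTE_MS


def instant_eval_time(now_ms, offset_ms=0):
    """Reference time of an instant selector after applying offset."""
    return now_ms - offset_ms


def range_window(now_ms, range_ms, offset_ms=0):
    """(start, end) of the window a range selector looks at."""
    end = now_ms - offset_ms
    return end - range_ms, end


now = 1_700_000_000_000  # hypothetical evaluation time in milliseconds

# prometheus_http_requests_total offset 5m
print(instant_eval_time(now, 5 * MINUTE_MS))

# prometheus_http_requests_total[5m] offset 1d
print(range_window(now, 5 * MINUTE_MS, DAY_MS))
```

The window keeps its 5-minute length; only its end point moves back by one day.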
#Key points for using vector expressions:
The return value of an expression is also one of the four data types: instant vector, range vector, scalar, or string. Some usage scenarios require a specific return type, for example:
(1) When the result is to be drawn as a graph, only instant vector data is supported
(2) Rate functions such as rate and irate require a range vector as input
●Because a range vector selector returns a range vector, it cannot be graphed directly in the expression browser
●Range vector selectors are usually used together with rate functions such as rate and irate
2.6 Metric Types of PromQL
Prometheus has four metric types:
●Counter: stores monotonically increasing data, such as the number of site visits. The value only goes up, cannot be negative, and resets to 0 when the process restarts;
●Gauge: stores data that fluctuates, such as free memory size. The value can go up or down, and it is reset when the process restarts;
●Histogram: a cumulative histogram that sorts observed values into buckets by value range and records the number of samples in each bucket and the sum of all sample values, so quantiles can be calculated;
◆It can be used to analyze cases where outliers distort the average;
◆Quantiles should be calculated with the dedicated histogram_quantile function;
●Summary: similar to Histogram, but the quantiles are calculated on the client side and reported directly;
(1) Counter type
The raw total of a Counter is usually of little direct use. Functions such as rate, topk, increase, and irate are applied to it to show how the samples change (growth rate/rate of change):
#Get the 3 time series with the highest total number of HTTP requests under this metric
topk(3, prometheus_http_requests_total)
#Get the per-second growth rate of the total number of HTTP requests on each time series under this metric over the last 1 hour
rate(prometheus_http_requests_total[1h])
#irate is a more sensitive function for calculating the instantaneous rate of a metric. It uses only the last two samples in the range, so compared with rate it is better suited to analyzing the rate of change over short time ranges.
irate(prometheus_http_requests_total[1h])
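The difference between rate and irate is easier to see with numbers. The Python sketch below is a simplified model with hypothetical function names and made-up samples: it handles counter resets, but it skips the extrapolation to the window boundaries that the real rate function performs.

```python
def increase_with_resets(samples):
    """Total increase of a counter, treating any drop as a reset to 0."""
    total = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        total += cur - prev if cur >= prev else cur
    return total


def simple_rate(samples):
    """Average per-second increase over the whole window (no extrapolation)."""
    seconds = (samples[-1][0] - samples[0][0]) / 1000
    return increase_with_resets(samples) / seconds


def simple_irate(samples):
    """Per-second increase between the last two samples only."""
    return simple_rate(samples[-2:])


# (timestamp_ms, value) pairs scraped every 15s; the process restarts before t=45s
window = [(0, 100.0), (15_000, 130.0), (30_000, 160.0), (45_000, 5.0), (60_000, 65.0)]
print(simple_rate(window), simple_irate(window))
```

In this window the average rate is 125 requests over 60 seconds, while irate reports the burst of 60 requests in the final 15 seconds, which is why irate reacts faster but is noisier.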
(2) Gauge type
Gauge stores samples of metrics whose values can go up or down. It is often used in aggregations such as sum, average, minimum, and maximum, and also together with PromQL's delta and predict_linear functions:
The delta function calculates the difference between the first and last value of each time series in a range vector, showing how the sample value changed between two points in time
eg: Return the difference between the current CPU temperature of this server and the temperature 2 hours ago
delta(cpu_temp_celsius{host="node01"}[2h])
The predict_linear(v, t) function uses linear regression over the samples in the range vector v to predict the value of the time series t seconds from now, showing the trend of the sample data
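As a rough illustration of what delta and predict_linear compute, here is a Python sketch using ordinary least squares. The function names and temperature readings are invented, and two simplifications are assumed: no extrapolation in delta, and the prediction is measured from the last sample rather than from the query evaluation time.

```python
def simple_delta(samples):
    """Last value minus first value in the window (no extrapolation)."""
    return samples[-1][1] - samples[0][1]


def simple_predict_linear(samples, t_seconds):
    """Fit a least-squares line and evaluate it t_seconds after the last sample."""
    xs = [ts / 1000 for ts, _ in samples]
    ys = [v for _, v in samples]
    n = len(samples)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope * (xs[-1] + t_seconds) + intercept


# Hypothetical CPU temperature readings: (timestamp_ms, degrees Celsius)
temps = [(0, 40.0), (60_000, 41.0), (120_000, 42.0)]
print(simple_delta(temps), simple_predict_linear(temps, 60))
```

With a steady rise of one degree per minute, the fitted line predicts roughly 43 degrees one minute after the last reading.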
(3) Histogram type
In Prometheus, a Histogram samples observations over a period of time (usually request durations or response sizes) and counts them into configurable buckets. Samples can later be filtered by bucket range or counted in total, and the data is typically displayed as a histogram.
Prometheus buckets are cumulative: the count of each bucket includes the samples of all smaller buckets, which is why it is called a cumulative histogram.
Each Histogram metric has a base metric name <basename> and exposes multiple time series:
●<basename>_sum: the sum of all sample values
●<basename>_count: the total number of samples; this is essentially a Counter
●<basename>_bucket{le="<upper boundary>"}: the number of samples whose value is less than or equal to the bucket's upper boundary
●<basename>_bucket{le="+Inf"}: the number of samples in the largest bucket, which includes all samples
#Using histogram
In most cases, people tend to look at the average of a quantitative metric, such as average CPU usage or average page response time. The problem with this is obvious. Take the average response time of API calls as an example: if most API requests respond within 100ms but a few requests take 5s, the average hides those slow requests and no longer reflects what most users experience. This is called the long-tail problem.
To tell general slowness from long-tail slowness, the easiest way is to group requests by latency range, for example counting the requests with a latency of 0 to 10ms and those with a latency of 10 to 20ms. The cause of the slowness can then be found quickly. Histogram and Summary are both designed for this: they let us quickly understand the distribution of the monitored samples.
#The number of HTTP requests with a response time <= 0.005 seconds is 10
prometheus_http_request_duration_seconds_bucket{handler="/metrics",le="0.005"} 10
#The sum of all sample values, named <basename>_sum
prometheus_http_request_duration_seconds_sum{handler="/metrics"} 10.107670803000001
#The total number of samples, named <basename>_count; it equals <basename>_bucket{le="+Inf"}
prometheus_http_request_duration_seconds_count{handler="/metrics"} 20
Note:
A bucket is a division of the value range of a metric, and the boundaries should be chosen according to how the values are distributed. Each bucket includes the samples of the buckets before it. If prometheus_http_request_duration_seconds_bucket{...,le="0.01"} is 10 and prometheus_http_request_duration_seconds_bucket{...,le="0.05"} is 30, then 10 of the 30 samples took at most 0.01s and the remaining 20 took between 0.01s and 0.05s.
The cumulative bucket data must be passed to the built-in histogram_quantile function to calculate a quantile from a Histogram metric, that is, the value below which a given fraction of all samples falls.
When calculating a quantile, histogram_quantile assumes that the samples within each bucket are distributed linearly, so its result is only an estimate. The accuracy depends on the granularity of the buckets: the coarser the buckets, the lower the accuracy.
For example, if the 0.9 quantile (90th percentile) of the HTTP request response time is 0.01, then 90% of all samples have a value less than or equal to 0.01s. The quantile is the first argument of the function, and the bucket series, with all of its le values kept, is the second:
histogram_quantile(0.9, prometheus_http_request_duration_seconds_bucket{handler="/metrics"})
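The linear interpolation that makes the result an estimate can be sketched in Python. This is a simplified reimplementation for illustration only, not the Prometheus source: it assumes the lowest bucket starts at 0 and ignores native histograms and other edge cases. The bucket counts reuse the le="0.01" and le="0.05" values from the note above.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets."""
    buckets = sorted(buckets)
    total = buckets[-1][1]  # the +Inf bucket holds every sample
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank and count > lower_count:
            if math.isinf(upper_bound):
                return lower_bound  # cannot interpolate into +Inf
            # Assume samples are spread evenly inside the bucket.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


buckets = [(0.01, 10), (0.05, 30), (math.inf, 30)]
print(histogram_quantile(0.5, buckets), histogram_quantile(0.9, buckets))
```

With only two finite buckets, every quantile between 0.33 and 1.0 lands somewhere on the straight line from 0.01s to 0.05s, which shows why coarse buckets give coarse estimates.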
(4) Summary type
With a Histogram, the client only sorts observations into buckets and counts them; the quantile is estimated by Prometheus Server from the bucket data, so the result may be inaccurate, and poorly chosen buckets can cause large errors.
Summary is a metric type similar to Histogram, but the client itself tracks the observations over a period of time (10 minutes by default), calculates and stores the quantile values, and the server simply scrapes them.
For each Summary metric with the base name <basename>, the following time series are generated:
●<basename>_sum: the sum of all sample values
●<basename>_count: the total number of samples
●<basename>{quantile="x"}: the value at quantile x of the samples, where 0 ≤ x ≤ 1
For example, in one sample of the wal_fsync Summary of a Prometheus Server, the server had performed 216 wal_fsync operations taking 2.888716127000002s in total. The median (quantile=0.5) was 0.012352463s and the 90th percentile (quantile=0.9) was 0.014458005s.
(5) Similarities and differences between the Histogram type and the Summary type
Both expose the <basename>_sum and <basename>_count series. Histogram needs the <basename>_bucket series to calculate quantiles on the server, while Summary stores the quantile values directly.
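Since both types expose _sum and _count, the average observation can be derived the same way for either. The short Python sketch below uses the wal_fsync figures quoted in the Summary section; the function name mean_observation is hypothetical.

```python
def mean_observation(total_sum, total_count):
    """Average observed value derived from the _sum and _count series."""
    if total_count == 0:
        return float("nan")
    return total_sum / total_count


# Figures from the wal_fsync Summary discussed in the text
avg_fsync = mean_observation(2.888716127000002, 216)
median_fsync = 0.012352463

print(f"mean={avg_fsync * 1000:.2f} ms, median={median_fsync * 1000:.2f} ms")
```

The mean comes out slightly above the median, which is what a few slower fsync calls in the tail would produce.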
3. Syntax application of PromQL
3.1 Basic query
The general form of a basic Prometheus query is <metric name>{label=value}.
(1) Query by metric name with no labels
eg: Query the total number of HTTP requests received
prometheus_http_requests_total
or
prometheus_http_requests_total{}
Note: If any label inside {} differs, the result is a separate time series. Only when the metric name and all labels are identical is it the same time series
The queries above match broadly by metric name alone or by metric name with an empty label selector. Labels can be used for broad matching as well, but a label matcher must be a complete key-value pair; a key or a value on its own is not valid PromQL
(2) Query by a single label
eg: View all time series that carry the label instance="localhost:9090"
{instance="localhost:9090"}
(3) Combine metric name and labels to narrow the query
eg: View the total number of HTTP requests (the instance is the local machine and the response code is 302)
prometheus_http_requests_total{instance="localhost:9090",code="302"}
3.2 Time Range Query
(1) Range query
In the basic queries above, <metric name>{label=value} returns only the latest value of each time series. This result type is called an instant vector. PromQL can also return all the samples of a time series within a time range, which is called a range vector.
eg: View the total number of HTTP requests (the instance is the local machine and the response code is 302) over the last 1 minute
prometheus_http_requests_total{instance="localhost:9090",code="302"}[1m]
The result contains every sample collected within the last 1 minute
Besides m for minutes, the time range selector supports other time units:
s - seconds
m - minutes
h - hours
d - days
w - weeks
y - years
Note: Prometheus does not support fractional durations. For example, to query a period of one and a half days, use [36h] rather than [1.5d]
(2) Time offset query
Besides using the current time as the reference, a time series query can use offset to shift the reference back in time.
eg: View the total number of HTTP requests (the instance is the local machine and the response code is 302) as of five minutes ago (instant vector)
prometheus_http_requests_total{instance="localhost:9090",code="302"} offset 5m
eg: View the total number of HTTP requests (the instance is the local machine and the response code is 302) over the 1-minute window that ended five minutes ago (range vector)
prometheus_http_requests_total{instance="localhost:9090",code="302"}[1m] offset 5m
3.3 Aggregation operation query
PromQL provides built-in aggregation operators that aggregate the samples of an instant vector into a new vector with fewer series. The commonly used aggregation operators are sum, min, max, avg, stddev, stdvar, count, count_values, bottomk, topk, and quantile.
An aggregation can be written in either of the following two forms:
●<aggregation operator> (vector expression) by|without (label)
●<aggregation operator> by|without (label) (vector expression)
(1) Calculate the sum of all HTTP requests
sum(prometheus_http_requests_total{})
(2) Find the largest HTTP request count across all time series
max(prometheus_http_requests_total{})
(3) Calculate the average HTTP request count across all time series
avg(prometheus_http_requests_total{})
(4) Use topk to show the 3 time series with the highest values
topk(3,prometheus_http_requests_total{})
(5) Use bottomk to show the 5 time series with the lowest values
bottomk(5,prometheus_http_requests_total{})
In addition, an aggregation can take a without or by clause. without removes the listed labels from the result and groups by the remaining ones; by does the opposite and keeps only the listed labels in the result vector, removing all others.
#Calculate the sum of the sample values grouped by all labels except code, handler, and job
sum(prometheus_http_requests_total{}) without (code,handler,job)
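The grouping behaviour of by and without can be sketched in Python as follows. The function names and sample values are invented for illustration; the metric name is left out of the label sets because aggregation drops it.

```python
from collections import defaultdict


def sum_by(series, keep):
    """sum by (keep...): group on the listed labels only."""
    groups = defaultdict(float)
    for labels, value in series:
        key = tuple(sorted((k, v) for k, v in labels.items() if k in keep))
        groups[key] += value
    return dict(groups)


def sum_without(series, drop):
    """sum without (drop...): group on every label except the listed ones."""
    groups = defaultdict(float)
    for labels, value in series:
        key = tuple(sorted((k, v) for k, v in labels.items() if k not in drop))
        groups[key] += value
    return dict(groups)


series = [
    ({"instance": "localhost:9090", "job": "prometheus", "code": "200", "handler": "/metrics"}, 120.0),
    ({"instance": "localhost:9090", "job": "prometheus", "code": "200", "handler": "/api/v1/query"}, 40.0),
    ({"instance": "localhost:9090", "job": "prometheus", "code": "302", "handler": "/"}, 3.0),
]

print(sum_by(series, {"code"}))
print(sum_without(series, {"code", "handler", "job"}))
```

Here sum by (code) yields one result per response code, while sum without (code, handler, job) collapses everything into a single result per instance.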
3.4 Wildcard and regular expression matching label query
A metric often has many time series that share the same label key but differ in the label value. Such time series can be located quickly through the relationship between key and value: besides exact matching, the value can be matched with a regular expression, so part of the value can be left open with wildcards
Common examples of inequality and regular expression matching:
(1) Match the time series whose code label is not 200
prometheus_http_requests_total{code!="200"}
(2) Match the time series whose handler label starts with /api
prometheus_http_requests_total{handler=~"/api.+"}
or
prometheus_http_requests_total{handler=~"/api.*"}
3.5 Use of operators
(1) Arithmetic operators
PromQL queries can also use operators to build more complex expressions. Arithmetic operators apply addition, subtraction, multiplication, division, and similar operations to the sample values and return the calculated result.
The arithmetic operators supported by PromQL are as follows:
+ (addition)
- (subtraction)
* (multiplication)
/ (division)
% (remainder)
^ (exponentiation)
eg: The memory value returned by process_virtual_memory_bytes is in bytes. To convert it to GB, use the following expression.
process_virtual_memory_bytes/(1024*1024*1024)
(2) Comparison operators
Comparison operators allow users to filter time series based on the values of time series samples.
The comparison operators supported by Prometheus are as follows:
== (equal)
!= (not equal)
> (greater than)
< (less than)
>= (greater than or equal to)
<= (less than or equal to)
eg: To query only the handlers whose Prometheus request count is greater than 1000, use the following comparison expression as a filter.
prometheus_http_requests_total > 1000
A comparison expression can also take the bool modifier. With bool, the expression no longer filters the data; it returns 1 (true) or 0 (false) for each sample according to the comparison result.
prometheus_http_requests_total > bool 1000
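The difference between filtering and the bool modifier can be shown with a small Python sketch of a vector-to-scalar comparison. The function name compare and the request counts are made up for the example.

```python
import operator

OPS = {"==": operator.eq, "!=": operator.ne, ">": operator.gt,
       "<": operator.lt, ">=": operator.ge, "<=": operator.le}


def compare(vector, op, threshold, bool_modifier=False):
    """Compare each sample of an instant vector against a scalar."""
    test = OPS[op]
    if bool_modifier:
        # Keep every series, replacing the value with 1 or 0.
        return {key: 1.0 if test(value, threshold) else 0.0
                for key, value in vector.items()}
    # Default behaviour: drop the series that fail the comparison.
    return {key: value for key, value in vector.items() if test(value, threshold)}


# Hypothetical request counts keyed by handler
requests = {"/metrics": 5230.0, "/api/v1/query": 870.0, "/": 12.0}

print(compare(requests, ">", 1000))
print(compare(requests, ">", 1000, bool_modifier=True))
```

Without bool only /metrics survives with its original value; with bool all three handlers are returned, carrying 1 or 0.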
(3) Logical operators
There are three logical (set) operators: and, or, and unless. and is the intersection: it keeps the samples of the left expression that have a matching label set in the right expression. unless is the complement: it keeps the samples of the left expression that have no match in the right expression. or is the union and has the widest range: it returns all samples of the left expression plus the samples of the right expression that have no match in the left.
eg: Time series samples with an HTTP request count between 100 and 1000
prometheus_http_requests_total<1000 and prometheus_http_requests_total>100
eg: Time series samples with an HTTP request count less than 1000 or greater than 100
prometheus_http_requests_total<1000 or prometheus_http_requests_total>100
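The three set operators can be modelled in Python with dictionaries keyed by label set. This is a sketch under the assumption that both sides are matched on their full label sets; the function names and request counts are hypothetical.

```python
def vec_and(left, right):
    """Intersection: left samples whose label set also appears on the right."""
    return {k: v for k, v in left.items() if k in right}


def vec_or(left, right):
    """Union: all left samples plus right samples with no match on the left."""
    result = dict(left)
    for k, v in right.items():
        result.setdefault(k, v)
    return result


def vec_unless(left, right):
    """Complement: left samples whose label set does not appear on the right."""
    return {k: v for k, v in left.items() if k not in right}


# Hypothetical request counts keyed by handler
requests = {"/metrics": 5230.0, "/api/v1/query": 870.0, "/": 12.0}
below_1000 = {k: v for k, v in requests.items() if v < 1000}
above_100 = {k: v for k, v in requests.items() if v > 100}

print(vec_and(below_1000, above_100))
print(vec_or(below_1000, above_100))
print(vec_unless(below_1000, above_100))
```

Only /api/v1/query lies between 100 and 1000, so it is the sole result of and; or returns all three handlers; unless leaves only the handler below 100.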
Note: Operators in Prometheus have precedence. From highest to lowest the order is (^) > (*, /, %) > (+, -) > (==, !=, <=, <, >=, >) > (and, unless) > (or). Keep the precedence in mind when writing expressions to avoid wrong results.
3.6 Built-in functions
Prometheus has many built-in functions. Applying them flexibly makes it easier to query and format data.
ceil function
The ceil function rounds the sample values of the result up to the nearest integer.
Example:
ceil(avg(prometheus_http_requests_total{code="200"}))
floor function
The floor function is the opposite of ceil: it rounds down to the nearest integer.
Example:
floor(avg(prometheus_http_requests_total{code="200"}))
rate function
rate is one of the most frequently used and most important functions. It returns the average per-second increase over a time range, taking all data points in the range into account. rate is usually applied to Counter metrics to see how fast they grow.
Example: Get the average number of new requests per second for prometheus_http_requests_total over the last 1 minute
rate(prometheus_http_requests_total{handler="/rules"}[1m])
irate function
Compared with rate, irate is more sensitive. irate calculates the growth rate of a range vector from the last two samples in the time range, which avoids peaks being flattened by averaging over the whole range.
Example: irate is used in the same way as rate
irate(prometheus_http_requests_total{handler="/rules"}[1m])
Other built-in functions
Besides the functions mentioned above, PromQL provides many others that cover most day-to-day needs, such as label_replace for rewriting labels and histogram_quantile for calculating quantiles of Histogram metrics. For more information, refer to the official documentation.
4. Classic query example
(1) The average CPU usage of each host in the last 5 minutes
(1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance)) * 100
(2) Find the hosts whose 1-minute load average exceeds twice the number of CPUs
node_load1 > on (instance) 2 * count (node_cpu_seconds_total{mode="idle"}) by (instance)
(3) Calculate host memory usage
Available memory space: the sum of free memory, buffer, and cache indicators
node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes
Used memory space: total memory space minus free space
node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)
Utilization: used space divided by total space
(node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)) / node_memory_MemTotal_bytes * 100
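(4) Calculate the total memory used by all containers on each node, in GB (the regular expression is fully anchored, so node.* is needed to match names such as node01):
sum by (instance) (container_memory_usage_bytes{instance=~"node.*"})/1024/1024/1024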
(5) Calculate the CPU usage of all containers on the node01 node over the last 1m:
sum (rate(container_cpu_usage_seconds_total{instance="node01"}[1m])) / sum (machine_cpu_cores{instance="node01"}) * 100
#container_cpu_usage_seconds_total is the total CPU time consumed by the container
(6) Calculate the CPU usage rate of each container over the last 5m
sum (rate(container_cpu_usage_seconds_total[5m])) by (container_name)
(7) Query the CPU usage rate of each Pod over the last 1m in the K8S cluster
#Because the queried data is reported per container, it is best to group and aggregate it by Pod
sum (rate(container_cpu_usage_seconds_total{image!="", pod_name!=""}[1m])) by (pod_name)