Introduction to Prometheus functions

One: Introduction to functions

Gauge-type data is a value that can go up or down at any time; unlike a counter, it does not increase continuously.

1 increase()

The increase function is used in Prometheus to obtain how much a Counter has grown over a period of time. increase(node_cpu[1m]) returns the increment of CPU time within 1 minute, one value for each CPU series. increase and rate are very similar:

rate with a [1m] range is the average increment per second over 1 minute,
increase with a [1m] range is the total increment over 1 minute.
Example:


increase(node_network_receive_bytes[1m]) returns the total increment over 1 minute
rate(node_network_receive_bytes[1m]) returns the increment over 1 minute divided by 60 seconds, that is, the amount per second
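
The relationship between the two functions can be sketched with plain arithmetic. This is a simplified illustration, not how Prometheus computes the values internally: it ignores counter resets and the extrapolation Prometheus applies at the window edges, and the sample values are made up.

```python
# Hypothetical (timestamp_seconds, counter_value) samples covering one minute.
samples = [(0, 1000.0), (15, 1600.0), (30, 2200.0), (45, 3100.0), (60, 4000.0)]


def simple_increase(points):
    """Total growth over the window: last value minus first value."""
    return points[-1][1] - points[0][1]


def simple_rate(points):
    """Average growth per second: total growth divided by the window length."""
    elapsed_seconds = points[-1][0] - points[0][0]
    return simple_increase(points) / elapsed_seconds


print(simple_increase(samples))  # total bytes received in the window
print(simple_rate(samples))      # average bytes per second in the window
```

Multiplying the simplified rate by the 60-second window gives back the simplified increase.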

2 sum()

As the name suggests, sum() adds values together. Wrapping a query in sum(), as in sum(increase(node_cpu[1m])), adds up the values of all cores and gives the increment of all CPUs within 1 minute.

Usage: sum(rate(node_network_receive_bytes[1m]))

3 by (instance)

This clause is appended to sum() to split the summed result by the specified label. Here instance represents the machine name.
For example:

sum(increase(node_cpu_seconds_total{mode="idle"}[1m])) by (instance)
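
The difference between a plain sum() and sum() ... by (instance) can be sketched as a grouping step. The machine names and values below are hypothetical, and the dictionary only stands in for the per-core series a real query would return.

```python
from collections import defaultdict

# Hypothetical idle-time increases in seconds, keyed by (instance, cpu core).
idle_increase = {
    ("web1", "0"): 55.0,
    ("web1", "1"): 50.0,
    ("log", "0"): 40.0,
}


def sum_all(series):
    """Like sum(...): collapse every series into one number."""
    return sum(series.values())


def sum_by_instance(series):
    """Like sum(...) by (instance): one total per machine."""
    totals = defaultdict(float)
    for (instance, _cpu), value in series.items():
        totals[instance] += value
    return dict(totals)


print(sum_all(idle_increase))
print(sum_by_instance(idle_increase))
```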

4 Use the above functions to view CPU usage

idle represents the CPU idle time


(1-((sum(increase(node_cpu_seconds_total{mode="idle"}[1m])) by (instance))/(sum(increase(node_cpu_seconds_total[1m])) by (instance)))) * 100 

sum(increase(node_cpu_seconds_total{mode="idle"}[1m])) by (instance) is the increment of idle CPU time over 1 minute

sum(increase(node_cpu_seconds_total[1m])) by (instance) is the increment of total CPU time over 1 minute
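
The arithmetic of the usage formula can be checked with a small worked example. The numbers are assumptions chosen for illustration: a 4-core machine accumulates about 240 CPU-seconds per minute across all modes, of which 180 are taken to be idle.

```python
def cpu_usage_percent(idle_seconds, total_seconds):
    """(1 - idle / total) * 100, the same shape as the query above."""
    return (1 - idle_seconds / total_seconds) * 100


# Hypothetical 1-minute increments for one instance.
idle = 180.0   # idle CPU time, summed over all cores
total = 240.0  # CPU time in all modes, summed over all cores

print(cpu_usage_percent(idle, total))
```

With these made-up inputs, three quarters of the CPU time is idle, so the usage comes out at one quarter of the total.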

5 Use of the rate function

The rate function is designed specifically for counter-type data. It takes the counter's increment over the given time period and returns it as an average amount per second.
Example:
rate(node_network_receive_bytes[1m])
This gives the average increment per second over 1 minute.

So whenever we use counter-type data in the future, we should remember to wrap it in rate() or increase() before doing anything else with it.

6 topk()

Definition: Take the N series with the highest values
Usage:
Gauge type: topk(3, count_netstat_wait_connections)
Counter type: topk(3, rate(node_network_receive_bytes[20m]))
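
The selection that topk performs can be sketched with the standard library. The machine names and values are hypothetical, and the function name simply mirrors the PromQL one for readability.

```python
import heapq

# Hypothetical current values of a gauge, keyed by machine name.
wait_connections = {"web1": 250.0, "web2": 120.0, "log": 310.0, "db": 90.0}


def topk(k, series):
    """Return the k (name, value) pairs with the largest values, highest first."""
    return heapq.nlargest(k, series.items(), key=lambda item: item[1])


print(topk(3, wait_connections))
```

For counter-type data, the values passed in would be per-second rates rather than raw counter readings, matching the rate() wrapper in the usage above.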

7 count()

Definition: Count the number of output series whose values meet the condition.
For example: find the current (or historical) number of machines with a TCP wait count greater than 200
count(count_netstat_wait_connections > 200)
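
A minimal sketch of the same idea: first keep only the values above the threshold, then count how many are left. The data is hypothetical.

```python
# Hypothetical current TCP wait counts, keyed by machine name.
wait_connections = {"web1": 250.0, "web2": 120.0, "log": 310.0, "db": 90.0}


def count_above(series, threshold):
    """Number of series whose value is strictly greater than the threshold."""
    return sum(1 for value in series.values() if value > threshold)


print(count_above(wait_connections, 200))
```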

Two: Prometheus query format

1 Exact match

Start from the original metric name and use {} to filter it further:
count_netstat_wait_connections{exported_instance="log"}

exported_instance is the label holding the name of the monitored server; "log" is the machine name of a log server

2 Fuzzy matching

count_netstat_wait_connections{exported_instance=~"web.*"}

Display all machines whose machine name starts with web (the regular expression must match the whole label value)
.* is a regular expression
Regex match: =~
Regex non-match: !~
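
The two matchers can be sketched in Python. This is only an approximation: Prometheus uses RE2 syntax rather than Python's re module, and the machine names are hypothetical. re.fullmatch is used because the pattern has to match the entire label value.

```python
import re

# Hypothetical series: (labels, value) pairs.
series = [
    ({"exported_instance": "web1"}, 250.0),
    ({"exported_instance": "web2"}, 120.0),
    ({"exported_instance": "log"}, 310.0),
    ({"exported_instance": "myweb"}, 90.0),
]


def match_label(data, label, pattern, negate=False):
    """Keep series whose label fully matches the pattern (=~) or does not (!~)."""
    regex = re.compile(pattern)
    selected = []
    for labels, value in data:
        hit = regex.fullmatch(labels.get(label, "")) is not None
        if hit != negate:
            selected.append((labels, value))
    return selected


matched = match_label(series, "exported_instance", "web.*")
not_matched = match_label(series, "exported_instance", "web.*", negate=True)
print([labels["exported_instance"] for labels, _ in matched])
print([labels["exported_instance"] for labels, _ in not_matched])
```

In this sketch "myweb" ends up on the non-matching side, because the pattern web.* has to match from the first character of the name.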

3 Numerical filtering

After filtering by label, we can filter by value. For example, to find only the web machines where the number of wait_connections is greater than 200:
count_netstat_wait_connections{exported_instance=~"web.*"} > 200
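
The two filtering stages can be sketched as one pass over the data: a name check followed by a value check. The machine names and values are hypothetical, and Python's re module stands in for the RE2 engine Prometheus uses.

```python
import re

# Hypothetical (machine_name, wait_connections) pairs.
readings = [("web1", 250.0), ("web2", 120.0), ("log", 310.0)]


def select(data, name_pattern, threshold):
    """Keep entries whose name fully matches the pattern and whose value exceeds the threshold."""
    return [
        (name, value)
        for name, value in data
        if re.fullmatch(name_pattern, name) and value > threshold
    ]


print(select(readings, "web.*", 200))
```

Here "log" is dropped by the label filter even though its value is above 200, and "web2" is dropped by the numerical filter.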

Origin www.cnblogs.com/huningfei/p/12717968.html