Explanation
Building on the previous article, this one walks through some basic queries for CPU, memory, and disk.
CPU
The metric node_cpu_seconds_total holds all of the CPU information currently collected.
Querying it directly, with no label filters, returns every CPU-related series that has been scraped: one cumulative counter of seconds per CPU core and mode.
Because these values are ever-growing counters, the irate function is needed here.
It calculates the per-second instant rate of increase of each time series in a range vector, based on the last two samples in the window.
Filtering by label, the query is as follows:
irate(node_cpu_seconds_total{job="node"}[5m])
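To make the irate step more concrete, here is a minimal Python sketch of the idea: take the last two samples in the window and divide the increase by the elapsed time. The function name instant_rate and the sample values are made up for illustration, and the counter-reset handling is a simplification, so treat this as an approximation of what Prometheus does rather than its actual implementation.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def instant_rate(samples: List[Sample]) -> float:
    """Per-second rate from the last two samples, in the spirit of irate."""
    if len(samples) < 2:
        raise ValueError("need at least two samples in the window")
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    if t_last <= t_prev:
        raise ValueError("timestamps must be strictly increasing")
    # Assumption: a counter that went down is treated as having restarted from zero.
    delta = v_last if v_last < v_prev else v_last - v_prev
    return delta / (t_last - t_prev)


# Hypothetical idle-time samples for one core, scraped every 15 seconds.
window = [(0.0, 1000.0), (15.0, 1013.5), (30.0, 1027.0), (45.0, 1039.0)]
print(instant_rate(window))  # (1039.0 - 1027.0) / 15 = 0.8
```

Only the final two samples matter here, which is why irate reacts quickly to short spikes while the [5m] window mostly controls how far back it may look for those two samples.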
Next, aggregate the result with avg and group it with by (instance), so that each instance's data is reported separately.
avg(irate(node_cpu_seconds_total{job="node_srv"}[5m])) by (instance)
The statement above averages over every CPU mode, whereas what we want is the CPU usage, queried over a 5-minute window.
The idea is this: select the idle CPU time with mode="idle", multiply its rate by 100 to get the idle percentage, and subtract that from 100 to get the percentage of CPU in use,
as follows:
100 - ((avg(irate(node_cpu_seconds_total{job="node_srv",mode="idle"}[5m])) by (instance)) * 100)
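The arithmetic in this query can be checked by hand. The sketch below assumes the per-core idle rates have already been computed (the instance names and values are hypothetical) and applies the same steps: average the idle rate per instance, multiply by 100, and subtract from 100.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Hypothetical per-core idle rates (idle seconds per second), keyed by (instance, cpu).
idle_rates: Dict[Tuple[str, str], float] = {
    ("host-a:9100", "0"): 0.75,
    ("host-a:9100", "1"): 0.25,
    ("host-b:9100", "0"): 0.875,
}


def cpu_usage_percent(rates: Dict[Tuple[str, str], float]) -> Dict[str, float]:
    """Compute 100 - avg(idle rate) * 100, grouped by instance."""
    grouped = defaultdict(list)
    for (instance, _cpu), rate in rates.items():
        grouped[instance].append(rate)
    return {
        instance: 100 - (sum(values) / len(values)) * 100
        for instance, values in grouped.items()
    }


for instance, usage in cpu_usage_percent(idle_rates).items():
    print(f"{instance}: {usage:.1f}% busy")
# host-a:9100: 50.0% busy
# host-b:9100: 12.5% busy
```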
RAM
The memory-related metrics are the following:
- node_memory_MemTotal_bytes: total amount of memory on the host
- node_memory_MemFree_bytes: free memory on the host
- node_memory_Buffers_bytes: memory buffer size
- node_memory_Cached_bytes: memory cache size
Many different queries can be built from these metrics.
Calculate the percentage of memory usage:
The idea is to subtract the free, buffer, and cache memory from the total memory, then divide the result by the total memory:
(node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)) / node_memory_MemTotal_bytes * 100
Finally, the ratio is multiplied by 100 to give the percentage of memory in use.
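As a quick sanity check of the memory formula, the same calculation can be run on made-up numbers. The function name and the 16 GiB host below are assumptions for illustration only.

```python
def memory_usage_percent(total: int, free: int, buffers: int, cached: int) -> float:
    """Used memory as a percentage of total, treating free, buffers, and cache as not in use."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (total - (free + buffers + cached)) / total * 100


GIB = 1024 ** 3
# Hypothetical host: 16 GiB total, 2 GiB free, 1 GiB buffers, 5 GiB cached.
print(memory_usage_percent(16 * GIB, 2 * GIB, 1 * GIB, 5 * GIB))  # 50.0
```

Half of the memory counts as used here because the 8 GiB of free, buffer, and cache memory is subtracted before dividing by the total.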
Disk Monitoring
Disk monitoring is divided into three types:
- Space monitoring
- Disk read/write I/O monitoring
- Expected saturation monitoring: given the current disk write rate, how long until the disk is expected to fill up
Disk space utilization percentage
Filter by label to query the space utilization of a single partition.
The principle is: for a given partition, subtract the free space from the total space to get the used space, then divide by the total space to get the utilization percentage.
(node_filesystem_size_bytes{mountpoint="/"} - node_filesystem_free_bytes{mountpoint="/"}) / node_filesystem_size_bytes{mountpoint="/"} * 100
Expected saturation
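Some algorithms are involved here, so the formula is given directly.
It fits a linear trend to the last hour of free-space samples and predicts the value 4 hours (4*3600 seconds) ahead. If the disk is going to fill up within that future time period, the predicted value is negative, and the < 0 comparison keeps only those series.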
predict_linear(node_filesystem_free_bytes{mountpoint="/"}[1h], 4*3600) < 0
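For readers curious about the algorithm behind this formula, the sketch below fits a least-squares line to (time, value) samples and extrapolates it forward. It is a simplified illustration: the function name predict_linear_sketch and the sample data are hypothetical, and it extrapolates from the newest sample, whereas Prometheus predicts from the query evaluation time.

```python
from typing import List, Tuple


def predict_linear_sketch(samples: List[Tuple[float, float]], horizon: float) -> float:
    """Fit a least-squares line to (time, value) samples and
    extrapolate it `horizon` seconds past the newest sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must have distinct timestamps")
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    intercept = mean_v - slope * mean_t
    return intercept + slope * (samples[-1][0] + horizon)


# Hypothetical free-space samples over one hour: 10 GB free, shrinking by 1 GB every 15 minutes.
GB = 10 ** 9
history = [(i * 900.0, float((10 - i) * GB)) for i in range(5)]
forecast = predict_linear_sketch(history, 4 * 3600)
print(forecast < 0)  # True: 6 GB left now, losing 4 GB per hour
```

With 6 GB left and a loss of 4 GB per hour, the line crosses zero after about an hour and a half, so the 4-hour forecast is negative and this partition would match the query.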