Prometheus server memory monitoring
Unusual phenomenon
A monitored CentOS 7 server (scraped by Prometheus 2.5 via node_exporter 1.6) kept triggering WeChat alert messages like the following:
Logging in to the server and running free -m showed:
Finding: the alert data was inconsistent with the actual memory usage reported on the server.
Troubleshooting
The alert's PromQL expression was:
ceil(100 - (((node_memory_MemFree_bytes{job="gtcq-gt-devops-node-exporter"}
  + node_memory_Buffers_bytes{job="gtcq-gt-devops-node-exporter"}
  + node_memory_Cached_bytes{job="gtcq-gt-devops-node-exporter"})
  / node_memory_MemTotal_bytes{job="gtcq-gt-devops-node-exporter"}) * 100)) > 90
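In terms of /proc/meminfo fields, this expression computes ceil(100 - (MemFree + Buffers + Cached) / MemTotal * 100), i.e. everything that is not free, buffered, or cached counts as "used". A minimal Python sketch of that arithmetic (the byte values are illustrative, not taken from the server above):

```python
import math

def used_percent_legacy(total, free, buffers, cached):
    """Memory 'used' percentage the way the original PromQL
    expression computes it: anything not free/buffered/cached
    is treated as used."""
    return math.ceil(100 - ((free + buffers + cached) / total) * 100)

# Illustrative values in bytes (not real measurements).
total = 16 * 1024**3
free = 2 * 1024**3
buffers = 1 * 1024**3
cached = 5 * 1024**3

print(used_percent_legacy(total, free, buffers, cached))  # → 50
```

The alert fires only when this value exceeds 90.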
Note: after testing, this expression matches the actual usage reported on CentOS 6, but not on CentOS 7.
Modifying the formula
The expression was changed to the following, which is based on node_memory_MemAvailable_bytes:
ceil((1 - (node_memory_MemAvailable_bytes{job="gtcq-gt-devops-node-exporter"}
  / node_memory_MemTotal_bytes{job="gtcq-gt-devops-node-exporter"})) * 100) > 90
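The new expression is equivalent to ceil((1 - MemAvailable / MemTotal) * 100). MemAvailable is the kernel's own estimate of memory available to new workloads, which often differs from MemFree + Buffers + Cached because it also accounts for reclaimable slab and unreclaimable page cache. A sketch of the arithmetic, again with illustrative numbers:

```python
import math

def used_percent_available(total, available):
    """Memory 'used' percentage based on MemAvailable, as in the
    corrected PromQL expression."""
    return math.ceil((1 - available / total) * 100)

# Illustrative values in bytes (not real measurements).
total = 16 * 1024**3
available = 9 * 1024**3  # MemAvailable as reported in /proc/meminfo

print(used_percent_available(total, available))  # → 44
```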
Note: the MemAvailable-based value still differs slightly from what free reports on the server, but it is close. If you know a more accurate expression, please leave a comment.
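For reference, the corrected expression could be wired into a Prometheus alerting-rule file roughly as follows. The group name, alert name, and label/annotation values are hypothetical; only the expression and the job label come from the text above:

```yaml
groups:
  - name: node-memory            # hypothetical group name
    rules:
      - alert: HostHighMemoryUsage
        expr: |
          ceil((1 - (node_memory_MemAvailable_bytes{job="gtcq-gt-devops-node-exporter"}
            / node_memory_MemTotal_bytes{job="gtcq-gt-devops-node-exporter"})) * 100) > 90
        for: 5m                  # require the condition to hold before firing
        labels:
          severity: warning
        annotations:
          summary: "Memory usage above 90% on {{ $labels.instance }}"
```

The `for: 5m` clause suppresses alerts on short spikes, which also reduces noisy WeChat notifications.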