Monitoring server memory with Prometheus

The symptom

A monitored CentOS 7 server, scraped by Prometheus 2.5 with node_exporter 1.6, keeps triggering WeChat alert messages reporting high memory usage.

Logging in to the server and running free -m, however, shows a different picture: the memory usage reported by the alert does not match what the server actually reports.
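To see where the alerted numbers come from, the raw node_exporter gauges can be queried in the Prometheus console and compared with the columns of free -m. A minimal sketch, assuming the same job label used in the alert:

    # Raw gauges from node_exporter, converted to MiB for direct comparison
    # with the columns of free -m (run each line as its own query):
    node_memory_MemTotal_bytes{job="gtcq-gt-devops-node-exporter"} / 1024 / 1024
    node_memory_MemFree_bytes{job="gtcq-gt-devops-node-exporter"} / 1024 / 1024
    node_memory_MemAvailable_bytes{job="gtcq-gt-devops-node-exporter"} / 1024 / 1024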

Troubleshooting

The alert is driven by the following PromQL expression:

ceil(100 - (((node_memory_MemFree_bytes{job="gtcq-gt-devops-node-exporter"}
    + node_memory_Buffers_bytes{job="gtcq-gt-devops-node-exporter"}
    + node_memory_Cached_bytes{job="gtcq-gt-devops-node-exporter"})
    / node_memory_MemTotal_bytes{job="gtcq-gt-devops-node-exporter"}) * 100)) > 90

Note: after testing, the PromQL expression above is not suitable for CentOS 7; it was validated against CentOS 6. CentOS 6 kernels do not expose MemAvailable in /proc/meminfo, so MemFree + Buffers + Cached was the only approximation available there; on CentOS 7 that sum misses memory the kernel can still reclaim (such as reclaimable slab), so the expression over-reports usage.
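As a rough cross-check, the gap between the kernel's MemAvailable estimate and the MemFree + Buffers + Cached approximation can be computed directly. A sketch, again assuming the job label used in this post; on an affected host this difference is noticeably large:

    # Gap between the kernel's MemAvailable estimate and the
    # MemFree + Buffers + Cached approximation, in MiB:
    (
        node_memory_MemAvailable_bytes{job="gtcq-gt-devops-node-exporter"}
      - (
          node_memory_MemFree_bytes{job="gtcq-gt-devops-node-exporter"}
          + node_memory_Buffers_bytes{job="gtcq-gt-devops-node-exporter"}
          + node_memory_Cached_bytes{job="gtcq-gt-devops-node-exporter"}
        )
    ) / 1024 / 1024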

Modifying the formula

The expression was changed to use node_memory_MemAvailable_bytes instead; the new PromQL is:

ceil((1 - (node_memory_MemAvailable_bytes{job="gtcq-gt-devops-node-exporter"}
    / node_memory_MemTotal_bytes{job="gtcq-gt-devops-node-exporter"})) * 100) > 90

Note: the value produced by this expression still differs slightly from what the server itself reports, but it is close. If you know a more accurate formula, please leave a comment.
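For completeness, here is a sketch of how the corrected expression might be wired into a Prometheus alerting rule (delivery to WeChat is handled separately by Alertmanager). The group name, alert name, "for" duration and annotations below are illustrative, not taken from the original setup:

    # Sketch of an alerting rule built on the MemAvailable-based expression.
    # Names, the "for" duration and annotations are examples only.
    groups:
      - name: node-memory
        rules:
          - alert: HostHighMemoryUsage
            expr: |
              ceil((1 - (node_memory_MemAvailable_bytes{job="gtcq-gt-devops-node-exporter"}
                / node_memory_MemTotal_bytes{job="gtcq-gt-devops-node-exporter"})) * 100) > 90
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "Memory usage above 90% on {{ $labels.instance }}"
              description: "MemAvailable-based memory usage has been above 90% for 5 minutes."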

Origin blog.csdn.net/qq_31555951/article/details/109068097