Monitoring CPU, Memory, and Disk with Prometheus

1. CPU usage

Start by calculating the per-second rate for each CPU mode. PromQL has a function called irate, which calculates the per-second instantaneous rate of increase of each time series in a range vector, based on the last two samples in the range. Let us apply irate to the `node_cpu_seconds_total` metric. Enter the following in the query box:

irate(node_cpu_seconds_total{job="node"}[5m])
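
To make the irate semantics concrete, here is a minimal Python sketch of the same calculation for a single series. It is only an illustration, not Prometheus code: the function name instant_rate and the sample values are hypothetical, and the counter-reset handling is a simplified assumption.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def instant_rate(samples: Sequence[Sample]) -> float:
    """Per-second rate from the last two samples of a range, in the spirit of irate()."""
    if len(samples) < 2:
        raise ValueError("need at least two samples in the range")
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    if t_last <= t_prev:
        raise ValueError("timestamps must be strictly increasing")
    # Assumption: a counter that decreased was reset, so count up from zero.
    delta = v_last if v_last < v_prev else v_last - v_prev
    return delta / (t_last - t_prev)


# Hypothetical idle-mode counter for one CPU, scraped every 15 seconds.
idle_samples = [(0.0, 1000.0), (15.0, 1012.0), (30.0, 1025.5)]
print(instant_rate(idle_samples))  # 0.9 seconds of idle time per second
```

Only the last two samples contribute; the first sample in the range is ignored.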

avg(irate(node_cpu_seconds_total{job="node"}[5m])) by (instance)


Here, we wrap the irate function in an avg aggregation and add a by clause that aggregates by the instance label. This produces three new series, one per host, each averaging the values from all CPUs and all modes on that host.
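
The grouping done by avg ... by (instance) can be sketched in Python as follows. This is a simplified illustration, not how Prometheus implements aggregation; the function name avg_by_instance, the host names, and the rate values are all hypothetical, and only two modes are shown to keep the data small.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Series = Tuple[Dict[str, str], float]  # (labels, current per-second rate)


def avg_by_instance(series: Iterable[Series]) -> Dict[str, float]:
    """Average all values that share the same 'instance' label."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for labels, value in series:
        groups[labels["instance"]].append(value)
    return {inst: sum(vals) / len(vals) for inst, vals in groups.items()}


# Hypothetical irate() results: one value per (instance, cpu, mode).
series = [
    ({"instance": "host1:9100", "cpu": "0", "mode": "idle"}, 0.75),
    ({"instance": "host1:9100", "cpu": "0", "mode": "user"}, 0.25),
    ({"instance": "host1:9100", "cpu": "1", "mode": "idle"}, 0.5),
    ({"instance": "host1:9100", "cpu": "1", "mode": "user"}, 0.5),
    ({"instance": "host2:9100", "cpu": "0", "mode": "idle"}, 0.25),
    ({"instance": "host2:9100", "cpu": "0", "mode": "user"}, 0.75),
]
print(avg_by_instance(series))
```

In this toy data each CPU's two modes sum to 1, so both hosts average to 0.5 regardless of load. Filtering on a single mode, as the next query does, is what makes the result informative.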

avg (irate(node_cpu_seconds_total{job="node",mode="idle"}[5m])) by (instance) * 100

Here, we add a mode label with the value idle to the query, so it returns only idle data. We average the results by instance and multiply by 100. This gives the average idle percentage across the CPUs of each host, using samples from the last 5 minutes. To get the percentage in use instead, we subtract this value from 100, like this:

100 - avg (irate(node_cpu_seconds_total{job="node",mode="idle"}[5m])) by (instance) * 100

Now we have three series, one for each host, showing the average percentage of CPU in use, based on samples from a 5-minute window.
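
As a quick check of the arithmetic, the sketch below applies the same "100 minus idle percentage" step in Python. The function name cpu_used_percent and the input values are hypothetical; the inputs stand in for the per-CPU idle rates that irate would return for one host.

```python
from typing import Sequence


def cpu_used_percent(idle_rates: Sequence[float]) -> float:
    """Convert per-CPU idle rates (0.0 to 1.0) into a host-wide used percentage."""
    if not idle_rates:
        raise ValueError("need at least one idle rate")
    avg_idle = sum(idle_rates) / len(idle_rates)
    # Same shape as the query: 100 - avg(idle) * 100
    return 100 - avg_idle * 100


# Hypothetical host with two CPUs that are 87.5% and 62.5% idle.
print(cpu_used_percent([0.875, 0.625]))  # 25.0
```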



2. Memory usage

Node Exporter's memory metrics can be found in the list of metrics prefixed with node_memory.

We will focus on a subset of node_memory metrics to provide our utilization metrics:

• node_memory_MemTotal_bytes: the total memory on the host

• node_memory_MemFree_bytes: the free memory on the host

• node_memory_Buffers_bytes: the memory in the buffer cache

• node_memory_Cached_bytes: the memory in the page cache

All of these metrics are expressed in bytes. Combining them gives the percentage of memory in use:

(node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Cached_bytes + node_memory_Buffers_bytes)) / node_memory_MemTotal_bytes * 100
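
The same formula can be checked by hand with a small Python sketch. The function name memory_used_percent and the byte values are hypothetical; the four arguments correspond to the four node_memory metrics listed above.

```python
def memory_used_percent(total: int, free: int, cached: int, buffers: int) -> float:
    """Percentage of memory in use, treating free, cached, and buffer memory as available."""
    if total <= 0:
        raise ValueError("total must be positive")
    used = total - (free + cached + buffers)
    return used / total * 100


GIB = 1024 ** 3
# Hypothetical host: 8 GiB total, 2 GiB free, 1.5 GiB page cache, 0.5 GiB buffers.
print(memory_used_percent(
    total=8 * GIB,
    free=2 * GIB,
    cached=3 * GIB // 2,
    buffers=GIB // 2,
))  # 50.0
```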


3. Disk usage

For disks, we measure only disk space usage, not utilization, saturation, or errors. In most cases this is the most useful data for visualization and alerting. Node Exporter's disk usage metrics are those prefixed with node_filesystem.

For example, the node_filesystem_size_bytes metric shows the size of each monitored file system. We can use a query similar to the memory one to compute the percentage of disk space used on each host. Unlike the memory metrics, however, the file system metrics exist for every mount point on every host, so we add a mountpoint label to select the root file system, "/". This returns disk usage for the root file system on each host.

(node_filesystem_size_bytes{mountpoint="/"} - node_filesystem_free_bytes{mountpoint="/"}) / node_filesystem_size_bytes{mountpoint="/"} * 100
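
For illustration, the disk calculation looks like this in Python, applied to one root file system per host. The function name disk_used_percent, the host names, and the sizes are hypothetical.

```python
from typing import Dict, Tuple


def disk_used_percent(size_bytes: int, free_bytes: int) -> float:
    """Percentage of a file system's space that is in use."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    return (size_bytes - free_bytes) / size_bytes * 100


GIB = 1024 ** 3
# Hypothetical (size, free) pairs for the "/" mount point on each host.
root_fs: Dict[str, Tuple[int, int]] = {
    "host1:9100": (100 * GIB, 25 * GIB),
    "host2:9100": (40 * GIB, 30 * GIB),
}
usage = {host: disk_used_percent(size, free) for host, (size, free) in root_fs.items()}
print(usage)  # {'host1:9100': 75.0, 'host2:9100': 25.0}
```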


Grafana can be localized, and dashboard templates for services such as MySQL and Redis can be imported as well.



Origin blog.51cto.com/15127516/2657698