Linux system load average

1. What is the system load average

The system load average is the average number of processes in a runnable or uninterruptible state over a period of time, in other words the average number of active processes. It is not directly related to CPU usage.
A runnable process is one that is using the CPU or waiting to use it. This is the R state (Running or Runnable) that the ps command shows.
An uninterruptible process is one that is waiting for I/O to complete. This is the D state (Uninterruptible Sleep, also known as Disk Sleep) that the ps command shows. For example, when a process reads from or writes to disk, it cannot be interrupted by other processes until the disk responds, so that the data stays consistent. If the process were interrupted at that point, the data on disk could end up inconsistent with the data held by the process. The uninterruptible state is therefore a mechanism by which the system protects processes and hardware devices.
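
To make the R and D states concrete, here is a minimal sketch that counts processes by state by reading /proc/<pid>/stat on Linux. The function names state_from_stat and count_states are hypothetical, chosen for illustration. Note that it counts one entry per process, whereas the kernel's load accounting also considers threads, so the numbers are only a rough indication.

```python
import os
from collections import Counter


def state_from_stat(stat_text):
    """Return the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' instead of on whitespace.
    after_comm = stat_text.rsplit(")", 1)[1]
    return after_comm.split()[0]


def count_states(proc_root="/proc"):
    """Count processes by state letter by scanning proc_root (Linux only)."""
    counts = Counter()
    try:
        entries = os.listdir(proc_root)
    except OSError:
        # No /proc here (for example, not Linux): return empty counts.
        return counts
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directory names are PIDs
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                counts[state_from_stat(f.read())] += 1
        except OSError:
            # The process exited between listdir() and open(); skip it.
            continue
    return counts


if __name__ == "__main__":
    counts = count_states()
    print("R (running or runnable):", counts["R"])
    print("D (uninterruptible sleep):", counts["D"])
```

A snapshot like this is instantaneous, while the load average is smoothed over time, so the two will not match exactly.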

2. When is the load average reasonable

The three load averages reported by the uptime command (for the last 1, 5, and 15 minutes) are not normalized by the number of CPU cores. A load average of 1 therefore means that a system with one CPU core is busy all the time, whereas a system with 4 CPU cores is idle 75% of the time.
Therefore, when the load average divided by the number of CPU cores is less than or equal to 1, the system is not overloaded. The ideal situation is exactly one process running on each CPU core, so that every core is fully utilized.
Which of the three values should we use: one minute, five minutes, or fifteen minutes? Focus on the five-minute or fifteen-minute average. If the per-core load over the last minute is 1.00, the server can still be considered normal, but if the fifteen-minute value stays at 1.00, it deserves attention.
In addition, the load average can be read directly from the file /proc/loadavg.
The first three numbers are the 1-, 5-, and 15-minute load averages. The next field is a fraction: the numerator is the number of currently runnable scheduling entities (processes and threads), and the denominator is the total number of scheduling entities on the system. The last number is the PID of the most recently created process.
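
The per-core rule above can be sketched in a few lines of Python. The helper names load_per_core and is_overloaded are hypothetical. os.getloadavg() and os.cpu_count() are standard-library calls, and os.getloadavg() is available on Unix-like systems only.

```python
import os


def load_per_core(load, cores):
    """Normalize a load average by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load / cores


def is_overloaded(load, cores):
    """Apply the rule of thumb: more than 1 active process per core."""
    return load_per_core(load, cores) > 1.0


if __name__ == "__main__":
    # os.cpu_count() can return None, so fall back to 1 in that case.
    cores = os.cpu_count() or 1
    labels = ("1 min", "5 min", "15 min")
    for label, load in zip(labels, os.getloadavg()):
        status = "overloaded" if is_overloaded(load, cores) else "ok"
        print(f"{label}: {load:.2f} total, "
              f"{load_per_core(load, cores):.2f} per core ({status})")
```

With this normalization, a load of 1.0 gives 0.25 per core on a 4-core machine, which matches the "idle 75% of the time" reading above.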
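
As an illustration, the five fields of /proc/loadavg can be parsed as follows. This is a minimal sketch: the LoadAvg type and parse_loadavg function are hypothetical names, and the sample line is made-up data in the format the file uses.

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    load1: float    # 1-minute load average
    load5: float    # 5-minute load average
    load15: float   # 15-minute load average
    runnable: int   # numerator of the fourth field
    total: int      # denominator of the fourth field
    last_pid: int   # PID of the most recently created process


def parse_loadavg(text):
    """Parse the single line found in /proc/loadavg."""
    load1, load5, load15, entities, last_pid = text.split()
    runnable, total = entities.split("/")
    return LoadAvg(float(load1), float(load5), float(load15),
                   int(runnable), int(total), int(last_pid))


if __name__ == "__main__":
    sample = "0.20 0.18 0.12 1/80 11206\n"  # made-up sample line
    print(parse_loadavg(sample))
```

On a Linux machine, the same function can be given the real file contents, for example parse_loadavg(open("/proc/loadavg").read()).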

3. The relationship between load average and CPU usage

In practice, load average and CPU usage are often confused. Here is the distinction.
The load average is the average number of processes in a runnable or uninterruptible state over a period of time. It therefore includes not only the processes that are using the CPU, but also the processes waiting for the CPU and the processes waiting for I/O.
CPU usage measures how busy the CPU is over a period of time, and it does not necessarily track the load average. For example:
(1) With CPU-intensive processes, heavy CPU use raises the load average, and the two move together;
(2) With I/O-intensive processes, waiting for I/O also raises the load average, but CPU usage is not necessarily high;
(3) With a large number of processes waiting to be scheduled on the CPU, the load average also rises, and CPU usage is relatively high as well.
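
Case (2) can be checked by measuring CPU usage separately from the load average. The following is a rough sketch, assuming the Linux /proc/stat layout in which the aggregate "cpu" line lists cumulative ticks for user, nice, system, idle, iowait, irq, softirq, and steal, in that order. The function names and the two sample lines are hypothetical.

```python
def parse_cpu_line(line):
    """Return the first eight tick counters from the aggregate 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    # Order: user, nice, system, idle, iowait, irq, softirq, steal
    return [int(value) for value in fields[1:9]]


def cpu_fractions(prev_line, cur_line):
    """Return (busy, iowait) fractions between two snapshots of the cpu line."""
    prev = parse_cpu_line(prev_line)
    cur = parse_cpu_line(cur_line)
    deltas = [c - p for p, c in zip(prev, cur)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("snapshots must be taken at different times")
    idle, iowait = deltas[3], deltas[4]
    # Time spent waiting for I/O is not counted as busy CPU time.
    busy = total - idle - iowait
    return busy / total, iowait / total


if __name__ == "__main__":
    # Two made-up snapshots of an I/O-bound machine.
    before = "cpu  100 0 50 800 50 0 0 0 0 0"
    after = "cpu  150 0 70 1500 280 0 0 0 0 0"
    busy, iowait = cpu_fractions(before, after)
    print(f"busy {busy:.0%}, iowait {iowait:.0%}")
```

In a situation like case (2), the busy fraction stays low and the iowait fraction is noticeable while the load average is high. In cases (1) and (3), the busy fraction is high.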

Origin blog.csdn.net/qq_34939308/article/details/112133763