When we optimize system performance, we usually use the top command to view the system load and the running status of the processes in the system, so as to find the factors that affect performance. As shown below:
[Figure: output of the top command]
The top command outputs a lot of system information, such as the system load, the number of processes, CPU usage, and memory usage, all of which play a vital role in troubleshooting performance problems.
This article explains the meaning and purpose of the iowait indicator in the output of the top command (shown in the red box in the figure above).
What is iowait
Let's take a look at how the Linux documentation explains it:
Show the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.
In other words, iowait is the percentage of time during which the CPU was idle (the idle process was running) while the system was waiting for a disk I/O request to complete.
It can be seen that if the system is in the iowait state, the following two conditions must both be satisfied:
- There are processes in the system waiting for I/O requests to complete.
- The system is currently idle, meaning there are no processes to run.
How iowait is calculated
Now that we know what iowait means, let's take a look at how Linux calculates the iowait ratio.
Linux exports the time spent in iowait through the /proc/stat file. We can read it with the following command:
cat /proc/stat
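The command output is shown in the figure below:
[Figure: output of cat /proc/stat]
The data in the red box is the time spent in iowait. It is the fifth number on each cpu line, counted in clock ticks (USER_HZ units).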
We can read the /proc/stat file at regular intervals and subtract the iowait values of two consecutive reads. The result is the time the CPU spent in the iowait state during that interval. Dividing it by the total CPU time that elapsed in the same interval gives the iowait ratio.
Now let's take a look at how the iowait value in the /proc/stat file is obtained.
In the kernel, each CPU has a cpu_usage_stat structure, which holds the time statistics of that CPU. It is defined as follows:
struct cpu_usage_stat {
    cputime64_t user;
    cputime64_t nice;
    cputime64_t system;
    cputime64_t softirq;
    cputime64_t irq;
    cputime64_t idle;
    cputime64_t iowait;
    cputime64_t steal;
    cputime64_t guest;
    cputime64_t guest_nice;
};
The iowait field of the cpu_usage_stat structure records the time the CPU spent in the iowait state.
So to get the total time the system spent in the iowait state, we only need to add up the iowait time of all CPUs. The code is as follows (located in the source file fs/proc/stat.c):
static int show_stat(struct seq_file *p, void *v)
{
    u64 iowait;
    ...
    // 1. Iterate over all CPUs in the system
    for_each_possible_cpu(i) {
        ...
        // 2. Read this CPU's iowait time and add it to the running total
        iowait = cputime64_add(iowait, kstat_cpu(i).cpustat.iowait);
        ...
    }
    ...
    return 0;
}
The show_stat() function first iterates over all CPUs, then reads their iowait times and adds them together.
When the iowait time is increased
From the above analysis, we can see that each CPU has an iowait counter that accumulates time. So when is this counter increased?
The answer is: in the system clock interrupt.
In the system clock interrupt, the account_process_tick() function is called to update the CPU time. The code is as follows:
void account_process_tick(struct task_struct *p, int user_tick)
{
    cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
    struct rq *rq = this_rq();
    // 1. If the current process is in user mode, increase the user-mode CPU time
    if (user_tick) {
        account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);
    }
    // 2. If the current process is in kernel mode and is not the idle process
    //    (or the tick arrived during other interrupt handling), increase the
    //    kernel-mode CPU time
    else if ((p != rq->idle) || (irq_count() != HARDIRQ_OFFSET)) {
        account_system_time(p, HARDIRQ_OFFSET, cputime_one_jiffy,
                            one_jiffy_scaled);
    }
    // 3. If the current process is the idle process, call account_idle_time()
    else {
        account_idle_time(cputime_one_jiffy);
    }
}
We mainly focus on the case where the current process is the idle process. In this case the kernel calls the account_idle_time() function. The code is as follows:
void account_idle_time(cputime_t cputime)
{
    struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
    cputime64_t cputime64 = cputime_to_cputime64(cputime);
    struct rq *rq = this_rq();
    // 1. If processes are currently waiting for I/O requests, increase the iowait time
    if (atomic_read(&rq->nr_iowait) > 0) {
        cpustat->iowait = cputime64_add(cpustat->iowait, cputime64);
    }
    // 2. Otherwise, increase the idle time
    else {
        cpustat->idle = cputime64_add(cpustat->idle, cputime64);
    }
}
The logic of the account_idle_time() function is relatively simple. It handles two cases:
- If there are processes currently waiting for I/O requests, it increases the iowait time.
- If no process is currently waiting for an I/O request, it increases the idle time.
Therefore, from the above analysis, we can see that the iowait time is increased only when the following two conditions are both met:
- The current process is the idle process, which means the CPU is idle.
- A process is waiting for an I/O request to complete.
Furthermore, when the CPU is in the iowait state, the CPU is idle while some processes in the system are blocked waiting for I/O requests. This shows that the CPU is underutilized.
In this case, we can use asynchronous I/O (e.g. io_uring) to optimize the program so that the process is not blocked by I/O requests.