Basics you should understand when the load of a Linux system increases

1. What is the average load?

    Whenever you find the system slowing down, the first thing you usually do is run the top or uptime command to check the system load. For example, after typing uptime, the system returns a single line of results like the following.
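
A typical result looks roughly like this (the time, uptime, and load values here are illustrative, not measurements from the article):

```
$ uptime
 02:34:03 up 2 days, 20:14,  1 user,  load average: 0.63, 0.83, 0.88
```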

    The three numbers following load average are the load averages we are talking about; they represent the average load over the past 1 minute, 5 minutes, and 15 minutes, respectively.

  The load average is the average number of processes in the runnable or uninterruptible state per unit of time, that is, the average number of active processes. It is not directly related to CPU usage.

  A runnable process refers to a process that is using the CPU or waiting for the CPU, that is, a process in the R state (Running or Runnable) that we often see with the ps command.

 A process in the uninterruptible state is one that is executing a critical section of kernel code and must not be interrupted. The most common example is a process waiting for an I/O response from a hardware device, which shows up in ps as the D state (Uninterruptible Sleep, also called Disk Sleep). The uninterruptible state is actually a protection mechanism the system provides for processes and hardware devices.

   The load average is therefore the average number of active processes. Intuitively this is the number of active processes per unit of time, but strictly speaking it is an exponentially decaying moving average of that number. You don't need to worry about the exact meaning of "exponentially decaying average"; it is simply a faster way for the system to compute the value, and treating it as a plain average of the number of active processes is fine.

So, what is a reasonable value for the load average?

    Ideally, the load average equals the number of CPUs, so when judging the load average you first need to know how many CPUs the system has. The averages over the three different time intervals shown by uptime give us a data source for analyzing the trend of the system load, allowing a more complete picture of the current load.
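
To find out how many CPUs a machine has, you can use standard commands such as these (a quick sketch; either one works on typical distributions):

```
# number of available processing units
nproc
# or count logical CPUs from /proc/cpuinfo
grep -c '^processor' /proc/cpuinfo
```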

If the three values for 1 minute, 5 minutes, and 15 minutes are basically the same or differ only slightly, the system load is stable.

But if the 1-minute value is much smaller than the 15-minute value, the load has decreased over the last minute, while there was heavy load over the past 15 minutes.

Conversely, if the 1-minute value is much greater than the 15-minute value, the load has been rising over the last minute. The increase may be only temporary or may continue, so continued observation is needed. Once the 1-minute load average approaches or exceeds the number of CPUs, the system is overloaded; at that point you need to analyze where the problem comes from and find a way to optimize it.

In a real production environment, how high does the load average need to be before we should pay close attention?

   The recommended approach is to monitor the system's load average and judge the trend of the load from historical data. When the load shows a clear upward trend, for example when it has doubled, it is time to analyze and investigate.
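
If you do not have a full monitoring system, a minimal sketch for collecting such history is to periodically record /proc/loadavg; the file name loadavg.log below is just an example:

```
# append a timestamped load-average sample every 60 seconds
while true; do
    echo "$(date '+%F %T')  $(cat /proc/loadavg)" >> loadavg.log
    sleep 60
done
```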

What is the relationship between load average and CPU usage?

  The load average represents the number of active processes, so a high load average does not necessarily mean high CPU usage. The active processes include not only processes using the CPU, but also processes waiting for the CPU and processes waiting for I/O. CPU usage, on the other hand, measures how busy the CPU is per unit of time, and it does not necessarily correspond to the load average. There are three typical situations:

A CPU-intensive process that uses a lot of CPU raises the load average, and in this case the two metrics rise together;

I/O-intensive processes waiting for I/O also raise the load average, but CPU usage is not necessarily high;

A large number of processes waiting to be scheduled onto the CPU also raises the load average, and in this case CPU usage is also relatively high.

2. CPU context switching

Linux is a multitasking operating system that supports running far more tasks than there are CPUs. Of course, these tasks are not actually running at the same instant; the system allocates the CPU to them in turn over very short intervals, creating the illusion that many tasks run simultaneously.

   Before each task runs, the CPU needs to know where the task was loaded and where it should start executing; in other words, the system must set up the CPU registers and the program counter (PC) for it in advance.

   CPU registers are small but extremely fast storage built into the CPU. The program counter holds the location of the instruction the CPU is executing, or of the next instruction to be executed. Both are part of the environment the CPU depends on before it can run any task, so together they are called the CPU context.

 

(Figure: common CPU registers, such as the instruction register (IR), program counter (PC), address register (AR), data register (DR), and accumulator (AC).)

A CPU context switch saves the CPU context (the registers and program counter) of the previous task, loads the context of the new task into those registers and the program counter, and finally jumps to the new location the program counter points to and runs the new task.

CPU context switching can be divided into three different scenarios: process context switches, thread context switches, and interrupt context switches.

Process context switching:

According to privilege level, Linux divides a process's running space into kernel space and user space, corresponding to CPU privilege rings Ring 0 and Ring 3. A process can run in both user space and kernel space: when it runs in user space it is said to be in user mode, and when it traps into kernel space it is in kernel mode.

Kernel space (Ring 0) has the highest privilege and can directly access all resources;

User space (Ring 3) can only access restricted resources and cannot directly access memory or other hardware devices; it must trap into the kernel through system calls to access these privileged resources.

The transition of a process from user mode to kernel mode is completed through system calls. For example, viewing the contents of a file takes several system calls: first open() to open the file, then read() to read its contents, write() to write the contents to standard output, and finally close() to close the file.
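
You can watch these calls happen with strace (a sketch: /etc/hostname and the "myhost" content are arbitrary examples, on modern systems the C library usually issues openat() rather than open(), and the calls made while loading shared libraries are trimmed from the output):

```
# trace only the file-related system calls made by cat
$ strace -e trace=openat,read,write,close cat /etc/hostname
...                                          # library-loading calls omitted
openat(AT_FDCWD, "/etc/hostname", O_RDONLY) = 3
read(3, "myhost\n", 131072)                 = 7
write(1, "myhost\n", 7)                     = 7
read(3, "", 131072)                         = 0
close(3)                                    = 0
```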

A process context switch means switching the CPU from one process to another. Processes are managed and scheduled by the kernel, and process switching can only happen in kernel mode. Therefore a process's context includes not only user-space resources such as virtual memory, the stack, and global variables, but also kernel-space state such as the kernel stack and registers.

 The context only needs to be switched when a process switch happens, in other words, when the process is scheduled. Linux maintains a ready queue for each CPU, sorts the active processes (processes that are running or waiting for the CPU) by priority and by how long they have waited for the CPU, and then picks the process with the highest priority that has waited longest for the CPU to run next.

When is a process scheduled to run on the CPU?

To ensure that all processes are scheduled fairly, CPU time is divided into time slices that are handed to processes in turn. When a process's time slice is used up, it is suspended by the system and the CPU switches to another process waiting to run.

When system resources are insufficient (for example, not enough memory), a process cannot run until the resources are available; the process is suspended and the system schedules other processes to run.

When a process voluntarily suspends itself, for example by calling a sleep function such as sleep(), rescheduling naturally happens and another process gets the CPU.

When a higher-priority process becomes runnable, the current process is suspended so that the higher-priority process can run.

When a hardware interrupt occurs, the process on the CPU is interrupted and the kernel's interrupt service routine runs instead.
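
To see which hardware interrupts a machine is servicing and how often, you can look at /proc/interrupts; the devices and counts below are illustrative:

```
# per-CPU counts for each interrupt source
$ cat /proc/interrupts | head -n 5
           CPU0       CPU1
  0:         36          0   IO-APIC   2-edge      timer
  1:          9          0   IO-APIC   1-edge      i8042
  8:          0          1   IO-APIC   8-edge      rtc0
```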

What we need to know about CPU context switching:

CPU context switching is one of the core functions to ensure the normal operation of the Linux system. Generally, we don't need to pay special attention to it.

However, excessive context switching consumes CPU time on saving and restoring registers, kernel stacks, virtual memory, and other data, which shortens the time processes actually spend running and can cause a significant drop in overall system performance.

Context switches can be divided into voluntary and involuntary context switches: a voluntary switch happens when a process cannot get a resource it needs (for example, it is waiting for I/O), while an involuntary switch happens when a process is forcibly rescheduled, for example because its time slice has run out.

Use vmstat and pidstat to observe system CPU context switching

For example, vmstat 1 5 samples the system every second, five times.
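
The output looks roughly like the following, one header plus one line per sample (the numbers are illustrative):

```
$ vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 612344  78120 912480    0    0     5    12  120  250  2  1 97  0  0
 0  0      0 612300  78120 912488    0    0     0     0  105  231  1  1 98  0  0
```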

cs (context switch) is the number of context switches per second.

in (interrupt) is the number of interrupts per second.

r (Running or Runnable) is the length of the ready queue, that is, the number of processes that are running and waiting for the CPU.

b (Blocked) is the number of processes in an uninterruptible sleep state.

To see the details for each process, you can use pidstat -w 5 (report context switches every 5 seconds).
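
Its output looks roughly like this (the host name, process names, and numbers are illustrative):

```
$ pidstat -w 5
Linux 5.15.0 (myhost)   05/10/2023      _x86_64_        (2 CPU)

08:18:26      UID       PID   cswch/s nvcswch/s  Command
08:18:31        0         1      0.20      0.00  systemd
08:18:31        0        10     15.40      0.00  rcu_sched
08:18:31     1000      2340      0.60      1.20  sshd
```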

 cswch/s is the number of voluntary context switches per second;

nvcswch/s is the number of involuntary (non-voluntary) context switches per second.

Note: many voluntary context switches indicate that processes are waiting for resources, which may point to problems such as I/O or memory pressure; many involuntary context switches indicate that processes are being forcibly rescheduled, for example because their time slices run out, which means they are all competing for the CPU and the CPU has indeed become a bottleneck.

3. The life-cycle states of a process

 

 After a process is forked, it enters the ready state;

When the scheduler gives it the CPU, it enters the running state;

If its time slice runs out or it is preempted, it returns to the ready state;

When a resource it needs is not available, it enters the sleep state (deep or light sleep). For example, a network program waiting for the other side to send a packet cannot use the CPU, so it sleeps; when the packet arrives, the process is woken up and enters the ready state;

If it is suspended, it enters the stopped state; when execution finishes and its resources are released but the parent process has not yet collected it with wait(), it enters the dead (zombie) state.

That is, the states involved over the whole cycle are: the ready state, the running state, the dead state, the stopped state, and the sleep state.
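
These states correspond to the codes shown by ps (R runnable, S interruptible sleep, D uninterruptible sleep, T stopped, Z zombie). A quick way to look at them, with illustrative output:

```
# show pid, state code, and command for every process
$ ps -eo pid,stat,comm | head -n 4
    PID STAT COMMAND
      1 Ss   systemd
      2 S    kthreadd
    941 Ssl  sshd
```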

4. Relevant system calls

Why are there system calls?

A Linux system distinguishes between user mode and kernel mode. In user mode, a program cannot directly access kernel data structures or kernel code; they can only be accessed in kernel mode. A process requests kernel services through the special mechanism of system calls: each system call passes a set of parameters that identify the request, and the switch from user mode to kernel mode is completed by executing special CPU instructions. System calls are the way applications interact with the kernel. Through this interface, an application can enter the operating system kernel and use the various resources the kernel provides, such as operating hardware, switching interrupts, and changing privilege modes.

Common system calls:

Control hardware: for example, the write and read calls.

Set system state or read kernel data: for example, getpid(), getpriority(), setpriority(), sethostname().

Process management: for example, fork(), clone(), execve(), exit(), etc.
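
To get a feel for which system calls a program actually makes, strace -c prints a summary; ls here is just an arbitrary example, and the exact calls, times, and counts will differ on your system:

```
# run ls under strace and print a per-syscall summary (written to stderr)
$ strace -c ls > /dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 23.10    0.000150          10        15           mmap
 15.40    0.000100          12         8           openat
 10.80    0.000070          70         1           execve
  9.20    0.000060          30         2           getdents64
   ...
------ ----------- ----------- --------- --------- ----------------
100.00    0.000650                    68         5 total
```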

System call processing

When a user-mode application calls a system call function, the entire transition from user mode to kernel mode is hidden behind it. The overall flow is roughly as follows:

The instruction int $0x80 switches execution from user mode to kernel mode and transfers control to the system_call() handler, which is the starting point of the system-call path.

system_call() checks the system call number, which tells the kernel which service the process is requesting.

The kernel then looks up the system call table (sys_call_table) to find the entry address of the requested kernel function.

It then calls the corresponding function, performs some checks after it returns, and finally returns to the user process. The names used in this flow are:

xyz(): system call library function

system_call: system call handler

sys_xyz(): system call service program

More than 300 system calls are implemented in the Linux kernel, such as the commonly used read, write, fork, and time. To see which system calls an installed system supports, you can check the /usr/include/bits/syscall.h file.
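
For example, you can count or list the SYS_* macros defined there (a sketch; on some distributions the header lives in an architecture-specific directory such as /usr/include/x86_64-linux-gnu/bits/syscall.h):

```
# count the SYS_* macros (one per supported system call)
$ grep -c 'define SYS_' /usr/include/bits/syscall.h
# peek at a few entries
$ grep 'define SYS_' /usr/include/bits/syscall.h | head -n 3
```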
