Linux performance optimization practice - CPU context

CPU context switch

Linux is a multitasking operating system that can run far more tasks than it has CPUs. These tasks do not truly run in parallel; instead, the system gives each of them the CPU in turn for a short time, creating the illusion that they all run at once.
Before a task runs, the CPU needs to know where the task is loaded from and where to start running it. This information is set up in advance in the CPU registers and the program counter (PC).

  • CPU registers: small but extremely fast memory built into the CPU;
  • Program counter: stores the position of the instruction the CPU is executing, or of the next instruction to be executed.

Both are part of the environment the CPU depends on before it can run any task, so together they are called the CPU context.

A CPU context switch saves the CPU context (CPU registers and program counter) of the previous task, loads the context of the new task into those registers and the program counter, and finally jumps to the location the program counter points to and runs the new task.

Processes and threads are the most common tasks. Hardware interrupts, which cause interrupt handlers to be called, are another common kind of task.

Depending on the task, CPU context switches fall into three scenarios: process context switches, thread context switches, and interrupt context switches.
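
The save, load, and jump steps can be sketched with a toy model. Everything here (the Task and ToyCPU classes, the register names r0 and r1) is made up for illustration; a real kernel does this in architecture-specific code, not in Python.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A toy task: its saved CPU context is a program counter plus registers."""
    name: str
    pc: int = 0
    registers: dict = field(default_factory=lambda: {"r0": 0, "r1": 0})


@dataclass
class ToyCPU:
    """A toy CPU with one program counter and a tiny register file."""
    pc: int = 0
    registers: dict = field(default_factory=lambda: {"r0": 0, "r1": 0})


def context_switch(cpu: ToyCPU, prev: Task, nxt: Task) -> None:
    # 1. Save the CPU context of the previous task.
    prev.pc = cpu.pc
    prev.registers = dict(cpu.registers)
    # 2. Load the context of the new task into the registers.
    cpu.registers = dict(nxt.registers)
    # 3. Set the program counter so execution continues in the new task.
    cpu.pc = nxt.pc


# Task A is currently on the CPU; task B was saved earlier at pc=500.
cpu = ToyCPU(pc=100, registers={"r0": 7, "r1": 9})
task_a = Task("A")
task_b = Task("B", pc=500, registers={"r0": 1, "r1": 2})
context_switch(cpu, task_a, task_b)
```

After the call, the CPU holds B's context and A's context is stored in task_a, ready to be loaded again on a later switch.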

Process context switch

By privilege level, Linux divides the space a process runs in into kernel space and user space, which correspond to CPU privilege levels Ring 0 and Ring 3 respectively.

  • Kernel space (Ring 0) has the highest privilege and can directly access all resources;
  • User space (Ring 3) can only access restricted resources and cannot directly access hardware resources such as memory. It must trap into the kernel through system calls to reach these privileged resources.

A process can run in both user space and kernel space. When it runs in user space, it is said to be in user mode; when it traps into kernel space, it is said to be in kernel mode.

The transition from user mode to kernel mode is made through system calls. For example, viewing the content of a file takes several system calls: first open() to open the file, then read() to read its content, then write() to write the content to standard output, and finally close() to close the file.
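
As a minimal sketch of that sequence, Python's os module exposes thin wrappers around these four system calls. The function name cat_file and its parameters are made up for this example.

```python
import os


def cat_file(path: str, out_fd: int = 1, chunk_size: int = 4096) -> int:
    """Copy a file to out_fd (standard output by default); return bytes copied."""
    fd = os.open(path, os.O_RDONLY)              # open()
    total = 0
    try:
        while True:
            data = os.read(fd, chunk_size)       # read()
            if not data:                         # empty result means end of file
                break
            written = 0
            while written < len(data):
                # write() may accept only part of the buffer, so loop.
                written += os.write(out_fd, data[written:])
            total += len(data)
    finally:
        os.close(fd)                             # close()
    return total
```

Each os.open, os.read, os.write, and os.close call crosses from user mode into kernel mode and back once.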

  1. Does a CPU context switch happen during a system call?

The answer is of course yes.

First, the user-mode instruction position held in the CPU registers must be saved. Next, to execute kernel-mode code, the registers are updated with the location of the kernel-mode instructions, and the CPU jumps into kernel mode to run the kernel task. When the system call ends, the registers are restored to the saved user-mode state, and execution switches back to user space to continue running the process. A single system call therefore involves two CPU context switches.

A system call does not touch process user-mode resources such as virtual memory, and it does not switch processes. This is different from what we usually call a process context switch:

  • A process context switch means switching from one process to another.
  • During a system call, the same process keeps running the whole time.

For this reason, a system call is usually called a privileged mode switch rather than a context switch. In practice, though, a CPU context switch is still unavoidable during a system call.

  1. So, what is the difference between a process context switch and a system call?

First of all, processes are managed and scheduled by the kernel, and process switches can only happen in kernel mode. The context of a process therefore includes not only user space resources such as virtual memory, stacks, and global variables, but also kernel space state such as the kernel stack and registers.

As a result, a process context switch has one more step than a system call: before the kernel state and CPU registers of the current process are saved, its virtual memory, stack, and so on must be saved first; and after the kernel state of the next process is loaded, that process's virtual memory and user stack must be refreshed.

Saving and restoring the context is not "free": the kernel has to run on the CPU to do it.

Each context switch takes tens of nanoseconds to several microseconds of CPU time. That is considerable, especially when process context switches are frequent: the CPU can easily spend much of its time saving and restoring registers, kernel stacks, virtual memory, and other resources, which greatly shortens the time the processes actually run. As mentioned in the previous section, this is an important factor behind a rising load average.
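
A quick back-of-the-envelope calculation shows how the cost adds up. The function name and the 3-microsecond cost used below are assumptions for illustration only; the real cost per switch varies with hardware and workload.

```python
def switch_overhead_fraction(switches_per_sec: float, cost_ns: float) -> float:
    """Fraction of one CPU's time spent purely on context switches."""
    return switches_per_sec * cost_ns / 1e9   # 1e9 nanoseconds per second


# Assumed cost of 3 microseconds (3000 ns) per switch:
light = switch_overhead_fraction(1_000, 3_000)     # about 0.3% of one CPU
heavy = switch_overhead_fraction(100_000, 3_000)   # about 30% of one CPU
```

Under this assumption, a thousand switches per second is negligible, while a hundred thousand per second consumes close to a third of a CPU before any useful work is done.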

Linux relies on the TLB (Translation Lookaside Buffer), which caches the mappings from virtual memory to physical memory. When the virtual memory mapping changes, the TLB must be flushed as well, and memory access slows down as a result. This matters most on multiprocessor systems where a cache is shared by several processors: flushing it affects not only the process on the current processor but also the processes on the other processors that share the cache.

  1. When will the process context be switched?

The context is switched only when the process is switched, in other words only when a process is scheduled. Linux maintains a ready queue for each CPU, sorts the active processes (those that are running or waiting for the CPU) by priority and by how long they have waited for the CPU, and then picks the process that needs the CPU most, that is, the one with the highest priority and the longest wait.
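
The selection rule described above can be sketched with a priority heap. This is a deliberate simplification under stated assumptions: the ReadyQueue class, its method names, and the "larger number means higher priority" convention are invented here, and the real Linux scheduler is considerably more involved.

```python
import heapq
from typing import List, Tuple


class ReadyQueue:
    """Toy per-CPU ready queue: highest priority first, then longest wait."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []

    def enqueue(self, name: str, priority: int, enqueued_at: int) -> None:
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        # On equal priority, the earlier enqueue time (longer wait) sorts first.
        heapq.heappush(self._heap, (-priority, enqueued_at, name))

    def pick_next(self) -> str:
        """Remove and return the task that should get the CPU next."""
        _, _, name = heapq.heappop(self._heap)
        return name


rq = ReadyQueue()
rq.enqueue("editor", priority=5, enqueued_at=10)
rq.enqueue("backup", priority=1, enqueued_at=0)
rq.enqueue("shell", priority=5, enqueued_at=3)
picked = [rq.pick_next() for _ in range(3)]
```

Here "shell" and "editor" share the top priority, so "shell" runs first because it has waited longer, and "backup" runs last despite having waited the longest overall.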

  1. When will the process be scheduled to run on the CPU?

The most obvious moment is when a process finishes and terminates: the CPU it was using is released, and a new process is taken from the ready queue to run. There are many other scenarios that also trigger process scheduling, listed one by one below.

First, to ensure that all processes are scheduled fairly, CPU time is divided into time slices that are allocated to the processes in turn. When a process uses up its time slice, the system suspends it and switches to another process that is waiting for the CPU.

Second, when system resources are insufficient (for example, not enough memory), a process cannot run until the resources become available. The process is suspended in the meantime, and the system schedules other processes to run.

Third, when a process suspends itself voluntarily, for example by calling a sleep function, the CPU is naturally rescheduled.

Fourth, when a higher-priority process becomes ready to run, the current process is suspended so that the higher-priority process can run.

Fifth, when a hardware interrupt occurs, the process on the CPU is suspended by the interrupt, and the CPU executes the interrupt service routine in the kernel instead.
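
The first scenario, time slices handed out in turn, can be simulated in a few lines. This is a toy round-robin model with invented names (round_robin, bursts, time_slice); it ignores priorities, blocking, and interrupts.

```python
from collections import deque
from typing import Dict, List, Tuple


def round_robin(bursts: Dict[str, int], time_slice: int) -> Tuple[List[str], int]:
    """Simulate time-slice scheduling on one CPU.

    bursts maps a task name to the CPU time it still needs.
    Returns the order in which tasks ran and the number of switches
    from one task to a different one.
    """
    queue = deque(bursts.items())
    order: List[str] = []
    switches = 0
    while queue:
        name, remaining = queue.popleft()
        if order and order[-1] != name:
            switches += 1                       # a different task takes the CPU
        order.append(name)
        remaining -= min(time_slice, remaining)
        if remaining > 0:
            queue.append((name, remaining))     # slice used up: back of the queue
    return order, switches


short_slices = round_robin({"A": 5, "B": 2, "C": 4}, time_slice=2)
long_slices = round_robin({"A": 5, "B": 2, "C": 4}, time_slice=10)
```

With a time slice of 2 the three tasks need five switches to finish; with a slice of 10 each task runs to completion and only two switches occur. Shorter slices improve fairness at the price of more context switches.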

Thread context switch

The biggest difference between a thread and a process is that a thread is the basic unit of scheduling, while a process is the basic unit of resource ownership.

The so-called task scheduling in the kernel actually schedules threads; the process only provides resources such as virtual memory and global variables for threads. Threads and processes can be understood in this way:

  • When a process has only one thread, the process can be regarded as equal to the thread.
  • When a process has multiple threads, they share resources such as virtual memory and global variables. These resources do not need to be changed on a context switch.

In addition, threads have their own private data, such as stacks and registers, which also need to be saved on a context switch.
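
The split between shared and private data can be observed from user code. In this small sketch (the names shared_results and worker are made up), both threads append to one global list, while each keeps its own local_total on its own stack.

```python
import threading

shared_results = []                 # global: one copy, visible to every thread
lock = threading.Lock()


def worker(n: int) -> None:
    local_total = 0                 # local: lives on this thread's own stack
    for i in range(n + 1):
        local_total += i
    with lock:                      # serialize updates to the shared list
        shared_results.append((n, local_total))


threads = [threading.Thread(target=worker, args=(n,)) for n in (10, 100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Neither thread's local_total interferes with the other's, yet both results end up in the same shared_results list.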

Thread context switching can actually be divided into two situations:

First, the two threads belong to different processes. Because no resources are shared, the switch is the same as a process context switch.

Second, the two threads belong to the same process. Because virtual memory is shared, resources such as virtual memory stay as they are during the switch, and only unshared data, such as the threads' private data and registers, needs to be switched.

Thread switches within one process therefore consume fewer resources than switches between processes, which is one advantage of multithreading over multiple processes.
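
The two cases can be written down as a tiny decision function. The ThreadInfo type and the state labels are illustrative only and simply restate the text above; they do not correspond to any kernel data structure.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class ThreadInfo:
    tid: int   # thread ID
    pid: int   # ID of the process that owns this thread


def state_to_switch(prev: ThreadInfo, nxt: ThreadInfo) -> Set[str]:
    """Return which kinds of state change when switching from prev to nxt."""
    state = {"registers", "stack"}              # private to every thread
    if prev.pid != nxt.pid:
        # Different processes share nothing, so process-level resources
        # such as virtual memory have to be switched as well.
        state |= {"virtual memory", "global variables"}
    return state


same_process = state_to_switch(ThreadInfo(1, 100), ThreadInfo(2, 100))
other_process = state_to_switch(ThreadInfo(1, 100), ThreadInfo(7, 200))
```

The result for two threads of the same process is the smaller set, which is why that switch is cheaper.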

Interrupt context switch

To respond quickly to hardware events, interrupt handling interrupts the normal scheduling and execution of processes and calls the interrupt handler to respond to the device event. When another process is interrupted, its current state must be saved so that it can resume from where it left off once the interrupt is over.

Unlike a process context switch, an interrupt context switch does not involve the user mode of the process. Even if the interrupt arrives while a process is running in user mode, there is no need to save and restore that process's user-mode resources such as virtual memory and global variables.

The interrupt context only includes the state needed to run the kernel-mode interrupt service routine: CPU registers, the kernel stack, hardware interrupt parameters, and so on.

On a given CPU, interrupt handling has a higher priority than processes, so an interrupt context switch never happens at the same time as a process context switch. By the same token, because interrupts disrupt the scheduling and execution of normal processes, most interrupt handlers are kept short so that they finish as quickly as possible.

Like process context switches, interrupt context switches consume CPU time, and too many of them can use a lot of CPU and even seriously degrade overall system performance. So when you see an unusually high number of interrupts, check whether they are causing a performance problem on your system.
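
One way to see which interrupts dominate is to total the per-CPU counters that Linux exposes in /proc/interrupts. That file is not mentioned above and is brought in here as an assumption; the helper names parse_interrupts and busiest are made up. The parser expects a header row of CPU names followed by rows of the form "label: count count ... description".

```python
from typing import Dict, List, Tuple


def parse_interrupts(text: str) -> Dict[str, int]:
    """Sum the per-CPU counts for each row of /proc/interrupts-style text."""
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())             # header row: CPU0 CPU1 ...
    totals: Dict[str, int] = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        counts = []
        for token in rest.split()[:ncpu]:
            if not token.isdigit():
                break                         # reached the description columns
            counts.append(int(token))
        totals[label.strip()] = sum(counts)
    return totals


def busiest(totals: Dict[str, int], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n rows with the highest counts, largest first."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

On a Linux machine, busiest(parse_interrupts(open("/proc/interrupts").read())) lists the most frequent interrupt sources since boot. The counts are cumulative, so comparing two readings taken a few seconds apart gives a rate.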

How to check the context switching status of the system

Use vmstat to check the context switching of the system. vmstat is a common Linux/Unix monitoring tool that reports the state of the server at a given interval, including CPU usage, memory usage, virtual memory swapping, and I/O reads and writes.

The command vmstat 5 outputs one set of data every 5 seconds. The fields have the following meanings:

  • r (running or runnable): the length of the ready queue, that is, the number of processes running or waiting for the CPU;
  • b (blocked): the number of processes in the uninterruptible sleep state;
  • swpd: the amount of swap space in use;
  • free: the amount of free physical memory;
  • buff: the amount of memory used as buffers;
  • cache: the amount of memory used as cache;
  • si (swap in): the amount of memory swapped in from disk per second. If this value stays above 0, physical memory is running short or some process is leaking memory; find the memory-hungry process and deal with it;
  • so (swap out): the amount of memory swapped out to disk per second. As with si, a value that stays above 0 points to a memory shortage or a memory leak;
  • bi (blocks in): the number of blocks read from block devices per second. Block devices here means all disks and other block devices on the system. The block size is 1024 bytes. With no read I/O, the value stays at 0;
  • bo (blocks out): the number of blocks written to block devices per second. For example, writing a file to disk raises bo, while reading a file from disk raises bi. On a mostly idle system bi and bo stay close to 0; sustained high values mean heavy I/O that is worth investigating;
  • in (interrupts): the number of interrupts per second;
  • cs (context switches): the number of context switches per second;
  • us (user time): percentage of CPU time spent in user mode;
  • sy (system time): percentage of CPU time spent in kernel mode. If it is too high, the system is spending a long time in system calls, for example because of frequent I/O;
  • id (idle time): percentage of CPU time spent idle. The CPU columns add up to about 100: us + sy + id + wa + st ≈ 100;
  • wa (wait time): percentage of CPU time spent waiting for I/O;
  • st (steal time): percentage of time the virtual CPU spends waiting for a physical CPU.
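
Rates like in and cs can also be derived by hand. Linux keeps cumulative totals in /proc/stat, on the lines starting with intr and ctxt; treating those counters as the data behind the two vmstat columns is an assumption made here, and the function names are invented for this sketch.

```python
import time
from typing import Dict


def parse_counters(stat_text: str) -> Dict[str, int]:
    """Extract the cumulative 'ctxt' and 'intr' totals from /proc/stat-style text."""
    counters: Dict[str, int] = {}
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] in ("ctxt", "intr"):
            counters[fields[0]] = int(fields[1])   # first number is the total
    return counters


def per_second(before: Dict[str, int], after: Dict[str, int],
               seconds: float) -> Dict[str, float]:
    """Turn two cumulative snapshots into per-second rates."""
    return {key: (after[key] - before[key]) / seconds for key in before}


def sample_system(interval: float = 5.0, path: str = "/proc/stat") -> Dict[str, float]:
    """Linux only: read the counters twice, interval seconds apart."""
    with open(path) as f:
        before = parse_counters(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_counters(f.read())
    return per_second(before, after, interval)
```

Calling sample_system(5) on a Linux machine returns a dictionary with the average number of interrupts and context switches per second over those 5 seconds.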

vmstat only shows the overall context switching of the system. To see the details for each process, use pidstat.
Run pidstat -w 5 to view the context switches of each process, sampled every 5 seconds.

  • cswch: the number of voluntary context switches per second (shown in the cswch/s column);
  • nvcswch: the number of non-voluntary context switches per second (shown in the nvcswch/s column);

A voluntary context switch is one caused by the process being unable to obtain a resource it needs. For example, voluntary context switches occur when system resources such as I/O and memory are insufficient. An involuntary context switch is one that occurs because the system forcibly reschedules the process, for example when its time slice has expired. Involuntary context switches become common when a large number of processes are competing for the CPU.
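
The same two kinds of switch can be read directly for a single process. Linux reports cumulative counts in /proc/<pid>/status as the fields voluntary_ctxt_switches and nonvoluntary_ctxt_switches; reading that file is an addition made here, and the helper names are made up.

```python
from typing import Dict


def parse_ctxt_switches(status_text: str) -> Dict[str, int]:
    """Pull the two context-switch counters out of /proc/<pid>/status-style text."""
    wanted = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")
    result: Dict[str, int] = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in wanted:
            result[key] = int(value)      # int() ignores the surrounding whitespace
    return result


def read_ctxt_switches(pid: str = "self") -> Dict[str, int]:
    """Linux only: read the cumulative counters for one process."""
    with open(f"/proc/{pid}/status") as f:
        return parse_ctxt_switches(f.read())
```

Unlike the per-second figures from pidstat, these are totals since the process started, so two readings taken some time apart are needed to get a rate.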

to be continued

Reference link:
1. https://zhuanlan.zhihu.com/p/406497025

Origin blog.csdn.net/zkkzpp258/article/details/131621204