Performance Analysis: Linux Server CPU Utilization

CPU metrics

1. Metric reference ranges

1.1 User mode CPU utilization + system mode CPU utilization (us + sy)

Reasonable range: 60-85%. If us + sy exceeds 85% on a multi-user system, processes may spend significant time waiting in the run queue, and response time and business throughput suffer. A high us value means user processes are consuming a large share of CPU time, and other software and hardware factors need further analysis; a high sy value means a lot of time is being spent on system management, which suggests a bottleneck in some kernel subsystem and likewise calls for further analysis of other software and hardware factors.

1.2 wa (wait)

Reference value: less than 25%. A wa value above 25% indicates that the disk subsystem may not be properly balanced, or that the workload is disk-intensive; there may be a problem with the disk or with other I/O in the system. Use iostat or sar for further analysis.

1.3 id (idle)

Reference value: greater than 40%. If r is consistently greater than 4 and id is consistently below 40%, the CPU is heavily loaded.

1.4 r

Reference value: less than 4. When the run queue is longer than 4, the system's CPU or memory may be a problem. If r is consistently greater than 4 and id is consistently below 40%, the CPU is heavily loaded. As the queue grows longer, processes spend longer waiting to be scheduled onto the CPU.

1.5 How to identify a CPU bottleneck

Very slow response time

CPU idle time at or near zero (zero percent idle CPU)

A high percentage of user CPU time

A high percentage of system CPU time

A large run queue size sustained over a long period of time

2. How to check CPU utilization

2.1 Using the top command

The data comes from the /proc/stat file.

%us = (User time + Nice time) / total CPU time * 100%
%sy = (System time + Hardirq time + Softirq time) / total CPU time * 100%
%id = (Idle time) / total CPU time * 100%
%ni = (Nice time) / total CPU time * 100%
%wa = (Waiting time) / total CPU time * 100%
%hi = (Hardirq time) / total CPU time * 100%
%si = (Softirq time) / total CPU time * 100%
%st = (Steal time) / total CPU time * 100%
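
To make the formulas concrete, here is a minimal sketch in C that computes the top-style percentages from two samples of the first line of /proc/stat, following the formulas above. It assumes the field order documented in proc(5) (user, nice, system, idle, iowait, irq, softirq, steal) and ignores details that the real top handles, such as guest time and counter wraparound.

```c
/* Sketch: top-style CPU percentages from two /proc/stat samples. */
#include <stdio.h>
#include <unistd.h>

struct cpu_sample {
    unsigned long long user, nice, system, idle, iowait, irq, softirq, steal;
};

static int read_cpu(struct cpu_sample *s) {
    FILE *f = fopen("/proc/stat", "r");
    if (!f) return -1;
    int n = fscanf(f, "cpu %llu %llu %llu %llu %llu %llu %llu %llu",
                   &s->user, &s->nice, &s->system, &s->idle,
                   &s->iowait, &s->irq, &s->softirq, &s->steal);
    fclose(f);
    return n == 8 ? 0 : -1;
}

int main(void) {
    struct cpu_sample a, b;
    if (read_cpu(&a)) return 1;
    sleep(1);                       /* sampling interval, like top's delay */
    if (read_cpu(&b)) return 1;

    unsigned long long total =
        (b.user - a.user) + (b.nice - a.nice) + (b.system - a.system) +
        (b.idle - a.idle) + (b.iowait - a.iowait) + (b.irq - a.irq) +
        (b.softirq - a.softirq) + (b.steal - a.steal);
    if (total == 0) return 1;

    /* %us = user + nice, %sy = system + hardirq + softirq, per the formulas above. */
    printf("%%us=%.1f %%sy=%.1f %%id=%.1f %%wa=%.1f\n",
           100.0 * (b.user - a.user + b.nice - a.nice) / total,
           100.0 * (b.system - a.system + b.irq - a.irq + b.softirq - a.softirq) / total,
           100.0 * (b.idle - a.idle) / total,
           100.0 * (b.iowait - a.iowait) / total);
    return 0;
}
```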

Note: By default, top refreshes every 3 seconds. You can specify the refresh interval with top -d <interval>, for example top -d 0.1 or top -d 0.01. While top is running, you can also press the "s" key to change the interval.

2.2 Using vmstat

r is the length of the run queue, b is the number of processes blocked waiting for I/O, in is the number of interrupts per second, and cs is the number of context switches per second.
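
As a rough illustration, the sketch below reads the /proc/stat counters that these vmstat columns are derived from: procs_running (r), procs_blocked (b), intr (in) and ctxt (cs). Note that intr and ctxt are cumulative counters; vmstat itself converts them into per-interval rates.

```c
/* Sketch: the raw counters behind vmstat's r, b, in and cs columns. */
#include <stdio.h>

int main(void) {
    FILE *f = fopen("/proc/stat", "r");
    if (!f) { perror("fopen"); return 1; }

    char line[512];
    unsigned long long v;
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "procs_running %llu", &v) == 1)
            printf("r  (runnable processes)           : %llu\n", v);
        else if (sscanf(line, "procs_blocked %llu", &v) == 1)
            printf("b  (processes blocked on I/O)     : %llu\n", v);
        else if (sscanf(line, "intr %llu", &v) == 1)
            printf("in (interrupts, cumulative)       : %llu\n", v);
        else if (sscanf(line, "ctxt %llu", &v) == 1)
            printf("cs (context switches, cumulative) : %llu\n", v);
    }
    fclose(f);
    return 0;
}
```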

2.3 Other viewing methods

iostat, sar -q, sar -u, etc.


3. CPU introduction

3.1 Time in the kernel

HZ is the number of clock interrupts the system timer generates per second. HZ is set when the kernel is compiled, so you can check the current kernel's clock interrupt frequency with: cat /boot/config-$(uname -r) | grep CONFIG_HZ

A tick is the duration of one "tick" of the system clock, i.e. (1/HZ) seconds: the interval between two consecutive clock interrupts.

jiffies counts the number of ticks since the system was booted; the kernel increments this variable on every clock interrupt.
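
A small, hedged example: from user space you can query the tick rate that /proc accounting values are expressed in (USER_HZ, usually 100) with sysconf(); note that this is related to, but not necessarily equal to, the kernel's CONFIG_HZ.

```c
/* Sketch: print the user-space clock tick rate (USER_HZ) used by /proc
 * accounting. This is typically 100 and may differ from CONFIG_HZ. */
#include <stdio.h>
#include <unistd.h>

int main(void) {
    long user_hz = sysconf(_SC_CLK_TCK);   /* ticks per second for /proc counters */
    printf("USER_HZ = %ld, one tick = %.4f s\n", user_hz, 1.0 / user_hz);
    return 0;
}
```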

3.2 CPU time composition

CPU time consists of three parts: user-mode time, kernel-mode time and idle time. The detailed breakdown is:

Total CPU time comprises User time, System time, Nice time, Idle time, Waiting time, Hardirq time, Softirq time and Steal time.

Idle mode time = Idle time.

User mode time = User time + Nice time.

Kernel mode time = System time + Hardirq time + Softirq time.

User time: time the CPU spends executing processes in user mode.

System time: time the CPU spends running in the kernel.

Nice time: time the CPU spends running user processes whose priority has been adjusted with nice (niced processes).

Idle time: time the system is idle, waiting for a process to run.

Waiting time: total time the CPU spends waiting for I/O operations to complete; similar to the blocked state.

Steal time: time the current virtual CPU is forced into involuntary wait while the hypervisor is servicing another virtual processor.

Softirq time and Hardirq time: CPU time spent handling software and hardware interrupts, respectively.

3.3 User mode CPU utilization

%usr shows the percentage of CPU time spent in user mode. User-mode CPU use includes running regular user processes, running niced processes, and running real-time processes. A Linux process can execute in either user mode or system (kernel) mode. When a process is running kernel code, we say it is in kernel mode; when it is executing its own application code, we say it is in user mode. In user mode the process runs its own application code and does not need kernel resources to perform calculations, manage memory or set variables.

3.4 System mode CPU utilization

Shows the percentage of CPU time spent in system mode, including CPU consumed by kernel processes (kprocs) and by other processes that need access to kernel resources. System-mode CPU use includes system calls, I/O management (interrupts and drivers), memory management (paging and swapping), and process management (context switches and process startup). If a process needs a kernel resource, it must make a system call and thereby switch to system mode to obtain that resource.

3.5 %wa (wait)

Shows the percentage of time the CPU was idle while the system had outstanding local disk or NFS-mounted disk I/O, i.e. the share of CPU idle time caused by processes waiting for I/O. The I/O here mainly includes block I/O, raw I/O and VM paging/swap-ins. If at least one disk I/O is outstanding while the CPU is otherwise idle, that time is classified as I/O wait. An I/O request to the disk causes the calling process to block (sleep) until the request completes; once the request completes, the process is placed back on the run queue. If I/O completes quickly, the process can use more CPU time.

3.6 %id (idle)

Idle time other than the I/O wait above: the percentage of time the CPU is idle with no outstanding local disk I/O. If no thread is runnable (the run queue is empty), the system dispatches a thread called wait, sometimes called the idle kproc. If a ps report shows a high total time for this thread, it means there were periods during which no other thread was ready to run or waiting to execute on the CPU; the system was therefore mostly idle or waiting for new work.

3.7 r (runq-sz)

The length of the run queue: the number of runnable processes, i.e. processes that are ready to run and resident in memory.

4. Concept introduction

4.1 User mode + kernel mode

Generally speaking, a process running on the CPU can be in one of two modes: user mode or kernel mode (that is, the same process works alternately in user mode and kernel mode; it remains the same process unless a process switch occurs). The operating system usually divides the virtual address space into user space and kernel space. For example, on the x86 platform the virtual address space of a Linux system is 0x00000000~0xffffffff: the first 3 GB (0x00000000~0xbfffffff) is user space and the last 1 GB (0xc0000000~0xffffffff) is kernel space. A user program is loaded into user space and executes in user mode; it cannot access kernel data or jump into kernel code, which protects the kernel. If a process accesses an illegal address, at most that process crashes, without affecting the stability of the kernel or of the system as a whole.

When the CPU takes an interrupt or exception, it not only jumps to the interrupt or exception service routine but also automatically switches mode, from user mode to privileged mode, so the handler can execute kernel code. In fact, the entire kernel is largely a collection of interrupt and exception handlers. In other words, under normal circumstances the processor executes the user program in user mode; when an interrupt or exception occurs, it switches to privileged mode to execute kernel code, and after the interrupt or exception has been handled it returns to user mode and continues the user program.

For example, suppose user process A makes a system call to obtain the current clock tick count. When the system call instruction in process A executes, the current register state of the user process (such as IP and CS) is saved, execution jumps to kernel space (the kernel code area) to run the system call routine that returns the current tick count, and on completion control returns to process A via the IRET instruction (restoring the registers saved on entry), after which process A continues executing from the saved CS:EIP address.
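
As an illustrative sketch of the user-to-kernel transition described above, the program below uses the standard times() system call to fetch clock-tick accounting from the kernel: the call traps into kernel mode, the kernel fills in the data, and execution returns to user mode.

```c
/* Sketch: a user-mode process crossing into the kernel via a system call,
 * similar to the "get the current clock tick count" example above. */
#include <stdio.h>
#include <sys/times.h>
#include <unistd.h>

int main(void) {
    struct tms t;
    /* The call below traps into the kernel (user mode -> kernel mode),
     * the kernel fills in the accounting data, then returns to user mode. */
    clock_t elapsed_ticks = times(&t);
    printf("clock ticks since an arbitrary point in the past: %ld\n",
           (long)elapsed_ticks);
    printf("user ticks: %ld, system ticks: %ld (USER_HZ = %ld)\n",
           (long)t.tms_utime, (long)t.tms_stime, sysconf(_SC_CLK_TCK));
    return 0;
}
```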

When a process is created, in addition to its process control block, the kernel also creates a kernel stack for the process. After the process enters the kernel through a system call (for example open(), or fopen() which calls it), the processor executes kernel code at the highest privilege level (ring 0). While the process is in kernel mode, the kernel code runs on the current process's kernel stack and in the context of that process.

Kernel mode has higher privileges than user mode.

User level: where users interact with the operating system, for example by running applications and system commands; it reaches the kernel level through the system call interface. Kernel level: where the operating system runs its own functions, mainly those that operate on the hardware.

4.2 Process scheduling

For a process to occupy the CPU and actually enter the running state, it must go through process scheduling. The process scheduling mechanism mainly involves the scheduling method, the scheduling timing and the scheduling policy.

1. Scheduling method

The Linux kernel basically uses preemptive, priority-based scheduling: while a process is running in user mode, the kernel can, under certain conditions (such as its time slice being exhausted or it having to wait for I/O), take the CPU away from it regardless of whether it yields voluntarily, and schedule another process to run. However, once a process has switched into kernel mode, it continues to run without that restriction; scheduling does not occur until it returns to user mode.

The scheduling policy in Linux basically inherits Unix's priority-based scheduling. That is, the kernel computes a priority for each process in the system; the priority reflects how qualified a process is to get the CPU, and higher-priority processes run first. The kernel selects the highest-priority process from the ready queue, allocates a CPU time slice to it, and runs it. While a process runs, its priority decreases over time, which produces a "negative feedback" effect: after a while, processes that started with lower priority are relatively "promoted" and get a chance to run. When every process's priority has dropped to 0, all priorities are recalculated.

2. Scheduling policy

The Linux system provides three scheduling policies for different types of processes: SCHED_FIFO, SCHED_RR and SCHED_OTHER.

SCHED_FIFO is suitable for real-time processes with strict timing requirements and relatively short run times. Once such a process is scheduled onto the CPU, it keeps running until it voluntarily yields the CPU or is preempted by a higher-priority process.

SCHED_RR implements round-robin time slicing and is suitable for real-time processes that need longer run times. A running process is allocated a time slice (for example 200 milliseconds); when the time slice is used up, the CPU is taken by another process and the process goes back to the end of the queue for its priority level. SCHED_OTHER is the traditional Unix scheduling policy, suitable for interactive time-sharing processes. The priority of such a process depends on two factors: one is the process's remaining time quota (if the process has used up its allotted time, the corresponding priority component is 0); the other is the process's nice value, inherited from Unix, where a lower nice number means a higher priority.

The nice value ranges from -20 to 19. A user can set a process's nice value with the nice command, but ordinary users can only set positive values, thereby voluntarily lowering the priority; only privileged users can set nice to a negative number. A process's priority is the sum of the two factors above, and the kernel dynamically adjusts the priority of user-mode processes. A process therefore typically goes through several feedback cycles between creation and completion of its task, and when it is scheduled to run again it continues from where it left off. For real-time processes, the priority value is (1000 + the configured positive value), so it is at least 1000; real-time processes therefore always have higher priority than other types of processes, and their time quota and nice value have no effect on their priority. If any real-time process is in the ready state, non-real-time processes cannot be scheduled until all real-time processes have finished.
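
For illustration, here is a minimal sketch of how these policies are selected from user space with the standard setpriority() and sched_setscheduler() calls; the real-time part requires root (or CAP_SYS_NICE), and the priority numbers are only examples.

```c
/* Sketch: selecting scheduling policy and nice value from user space. */
#define _GNU_SOURCE
#include <stdio.h>
#include <sched.h>
#include <unistd.h>
#include <sys/resource.h>

int main(void) {
    /* Lower our own priority under SCHED_OTHER by raising the nice value. */
    if (setpriority(PRIO_PROCESS, 0, 10) != 0)
        perror("setpriority");

    /* Switch to SCHED_RR with a real-time priority of 10 (needs privilege). */
    struct sched_param sp = { .sched_priority = 10 };
    if (sched_setscheduler(0, SCHED_RR, &sp) != 0)
        perror("sched_setscheduler(SCHED_RR)");

    printf("current policy: %d (0=OTHER, 1=FIFO, 2=RR)\n",
           sched_getscheduler(0));
    return 0;
}
```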

A background command (with an ampersand at the end, such as gcc f1.c &) corresponds to a background process (also called a background job). Background processes have lower priority than any interactive (foreground) process, so they are scheduled to run only when no interactive process is runnable. Background processes are often scheduled to run in batches.

3. Scheduling timing

The kernel performs process scheduling in the following situations:

(1) The current process calls the system call nanosleep() or pause() to put itself into a sleep state and actively surrender the right to use the CPU for a period of time;

(2) The process is terminated and the use of the CPU is permanently abandoned;

(3) During the execution of the clock interrupt handler, it is found that the current process has been running continuously for too long;

(4) When waking up a sleeping process, it is found that the awakened process is more qualified to run than the current process;

(5) A process changes the scheduling strategy or reduces its own priority (such as the nice command) by executing a system call, thereby causing immediate scheduling.

4. Scheduling algorithm

The scheduling algorithm should be relatively simple, to keep the overhead of frequent scheduling low. When Linux performs process scheduling, it first scans all processes in the ready queue and selects the highest-priority process that is resident in memory. If there are real-time processes in the queue, they run first. If the process that most needs to run is not the current process, the current process is suspended, its machine state (the program counter, CPU registers and so on) is saved, and the saved state of the selected process is restored.

4.3 User-level threads and kernel-level threads

In many Unix-like systems such as Linux, FreeBSD and Solaris, the process was traditionally the smallest unit the kernel scheduled, and programs were developed with a multi-process model. The concept of a thread was introduced later, and threads come in two forms:

User-Level Thread (ULT): created and managed by the application through a thread library. Threads are not implemented in the kernel; multithreading is only simulated in user mode. It does not depend on the operating system kernel, and the kernel is entirely unaware that multiple threads exist.

Kernel-Level Thread (KLT), also called a kernel-supported thread or lightweight process: implemented in kernel space. The kernel keeps a thread control block for each thread, recording the thread identifier, register values, state, priority and other information. All thread operations, such as creation, cancellation and switching, are performed by handlers in the kernel via system calls, and the kernel handles context switching between processes and between threads. Unix-like systems generally implement this by extending the process mechanism: an "incomplete" process-creation call creates a new process that shares the data space of its parent. On Linux this system call is clone(); on FreeBSD it is rfork().
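
A minimal sketch of the clone() mechanism described above: the child shares the parent's address space (CLONE_VM), so a write made by the child is visible to the parent. Real programs normally use the pthreads library, which wraps clone().

```c
/* Sketch: creating a "thread-like" child with clone() that shares the
 * parent's address space. Illustrative only. */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

static int shared_counter = 0;          /* visible to the child via CLONE_VM */

static int child_fn(void *arg) {
    (void)arg;
    shared_counter = 42;                /* writes the parent's memory directly */
    return 0;
}

int main(void) {
    const size_t stack_size = 1024 * 1024;
    char *stack = malloc(stack_size);
    if (!stack) { perror("malloc"); return 1; }

    /* Share address space, filesystem info, file descriptors and signal
     * handlers with the child; send SIGCHLD on exit so waitpid() works. */
    pid_t pid = clone(child_fn, stack + stack_size,
                      CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | SIGCHLD,
                      NULL);
    if (pid == -1) { perror("clone"); return 1; }

    waitpid(pid, NULL, 0);
    printf("shared_counter = %d\n", shared_counter);   /* prints 42 */
    free(stack);
    return 0;
}
```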

5. Common Misunderstandings

5.1 High CPU utilization means CPU resources are insufficient

When a CPU counter is out of its normal range, the cause is not necessarily a shortage of CPU resources; other resources can produce the same symptom. For example, when memory is insufficient, the CPU becomes busy with memory management, and on the surface CPU utilization may appear to be 100%.

Corrections and discussion of any shortcomings above are welcome; if you found this useful, your support is appreciated.


Origin blog.csdn.net/weixin_52622200/article/details/113058528