Linux Performance Optimization: Performance Tools - System CPU

2.0 Overview

This chapter provides an overview of system-level Linux performance tools. These tools are your first line of defense when tracking down performance issues: they show how the entire system is performing and which parts are performing poorly. After reading this chapter, you should be able to:
1. Understand the basic indicators of system-level performance, including CPU usage.
2. Know which tools can retrieve these system-level performance metrics.

2.1 CPU performance statistics

Rather than explain the meaning of each statistic several times (once for every tool), we describe them all once, before covering the individual tools.

2.1.1 Run queue statistics

If processes are runnable and waiting to use the processor, they form a run queue. The longer the run queue, the more processes are waiting. Performance tools usually report the number of runnable processes and the number of blocked processes (those waiting on I/O). A related statistic is the load average. The load of a system is the total number of running and runnable processes: for example, if there are two running processes and three runnable processes, the system load is 5. The load average is this load averaged over a given period of time, typically 1, 5, and 15 minutes, which lets you observe how the load changes over time.
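These numbers are easy to check by hand. A minimal sketch (the values in the comment are only illustrative, not from the original text):

```bash
# The kernel publishes the 1-, 5-, and 15-minute load averages directly;
# the fourth field is runnable/total processes, the fifth the last PID used.
cat /proc/loadavg    # e.g. "0.45 0.32 0.28 1/163 4148" (illustrative)

# uptime reports the same three load averages in a friendlier form.
uptime
```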

2.1.2 Context switching

Most modern processors can run only one process or thread at a time. Although some processors (such as hyper-threaded ones) can actually run more than one simultaneously, Linux treats them as multiple single-threaded processors. To create the illusion that a single processor runs multiple tasks simultaneously, the Linux kernel must constantly switch between processes. This switching is called a context switch, because when it happens the CPU must save all the context information of the old process and load all the context information of the new one. The context contains a large amount of information that Linux tracks for each process, including: the instructions the process is executing, the memory allocated to it, the files it has opened, and so on. A context switch moves all this information around, so its overhead can be considerable, and it is a good idea to minimize the number of context switches.

To avoid context switches, it is important to understand how they occur. First, context switches can be the result of kernel scheduling. To guarantee each process a fair share of processor time, the kernel periodically interrupts the running process and, if appropriate, the kernel scheduler decides to start another process rather than let the current one continue. Each of these periodic, or timer, interrupts may therefore cause a context switch. The number of timer interrupts per second depends on the architecture and kernel version. A simple way to check the interrupt frequency is to read the /proc/interrupts file and count how many interrupts occur over a known period of time, as shown in Listing 2.1.
[Listing 2.1 (screenshots not reproduced): sampling the timer line of /proc/interrupts twice, ten seconds apart]
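The original screenshots are not reproduced here, but the measurement is easy to repeat. A rough equivalent of Listing 2.1 (your interrupt counts will differ from the book's):

```bash
# Print the timer line of /proc/interrupts, wait a known interval, and
# print it again; (second count - first count) / 10 gives interrupts/second.
grep -i timer /proc/interrupts
sleep 10
grep -i timer /proc/interrupts
```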

In Listing 2.1, we ask the kernel for the number of timer firings, wait 10 seconds, and ask again. On this machine the timer therefore fires at a rate of (24070093 - 24060043) interrupts / (10 seconds), or about 1000 interrupts per second. If you see significantly more context switches than timer interrupts, the switches are most likely caused by I/O requests or other long-running system calls (such as sleep). When an application requests an operation that cannot complete immediately, the kernel starts the operation, saves the requesting process, and tries to switch to another ready process. This keeps the processor as busy as possible.

2.1.3 Interrupts

Additionally, the processor periodically receives interrupts from hardware devices. These interrupts are usually triggered when a device has an event that needs the kernel's attention. For example, when a disk controller has just finished fetching a block of data from the drive and is ready to hand it to the kernel, it triggers an interrupt. For each interrupt the kernel receives, if an interrupt handler is registered for it, that handler is run; otherwise the interrupt is ignored. Interrupt handlers run at a very high priority in the system and typically execute very quickly. Sometimes a handler has work to do that does not need such high priority, so it schedules a "bottom half", also known as a softirq handler. If there are many interrupts, the kernel spends a lot of time servicing them. Examining the /proc/interrupts file shows which interrupts fired on which CPUs.
[Screenshot not reproduced: per-CPU interrupt counts from /proc/interrupts]

2.1.4 CPU usage

CPU usage is a simple concept. At any given time, the CPU can be doing one of seven things:
(1) The CPU can be idle, meaning the processor is not actually doing any work and is waiting for something to do.
(2) The CPU can be running user code; this is called "user" time.
(3) The CPU can be executing application code inside the Linux kernel; this is "system" time.
(4) The CPU can be running user code that has been "niced", that is, given a lower priority than normal processes.
(5) The CPU can be in the iowait state, waiting for I/O (such as disk or network) to complete.
(6) The CPU can be in the irq state, handling a hardware interrupt in high-priority code.
(7) The CPU can be in the softirq state, executing kernel code that was also triggered by an interrupt but runs at a lower priority (the "bottom half" code).
Most performance tools express these values as percentages of total CPU time. Each percentage ranges from 0% to 100%, and all seven together add up to 100%. A system with a high "system" percentage spends most of its time in the kernel; tools such as oprofile can help determine where that time is going. A system with high "user" time spends most of its time running applications; the next chapter shows how to use performance tools to track down problems in that case. If a system spends a lot of time in iowait when it should be doing work, it is probably waiting for I/O from a device; the disk, the network card, or some other device may be the cause of the slowdown.
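Although the original text does not mention it here, these seven states correspond directly to the counters the kernel exports in /proc/stat, which is the raw source most of the tools below read. A quick look at the raw tick counts:

```bash
# The first line aggregates all CPUs; the fields after "cpu" are cumulative
# ticks spent in user, nice, system, idle, iowait, irq, and softirq
# (newer kernels append a few more states, such as steal).
head -1 /proc/stat
```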

2.2 Linux performance tools: CPU

Let's now discuss performance tools that allow us to extract the information described previously.

2.2.1 vmstat (virtual memory statistics)

vmstat stands for virtual memory statistics, and as the name suggests it reports on the system's virtual memory performance. Fortunately, it actually does much more than that. vmstat is a very useful command for obtaining a rough idea of overall system performance, including:
1. The number of running processes.
2. CPU usage.
3. The number of interrupts received by the CPU.
4. The number of context switches performed by the scheduler.
It is an excellent tool for getting an overview of system performance.

2.2.1.1 CPU performance-related options

vmstat is invoked with the following command line: vmstat [-n] [-S] [delay [count]]. It runs in two modes: sampling mode and average mode. If no parameters are given, vmstat runs in average mode and displays the average of all statistics since the system booted. If a delay is specified, the first sample is still the average since boot, but thereafter vmstat samples the system every delay seconds and prints the statistics. Table 2-1 explains the vmstat options.
[Table 2-1 (screenshot not reproduced): vmstat command-line options]

The statistics vmstat outputs let you track different aspects of system performance. Table 2-2 explains those related to CPU performance; the next chapter describes those related to memory performance.
[Table 2-2 (screenshot not reproduced): CPU-related vmstat output]

vmstat provides a good low-overhead view of system performance. Because all the statistics are plain text printed to standard output, it is easy to capture the data generated during a test and process or plot it later. vmstat's overhead is so low that it is practical to keep it running continuously in a console or window, even on a very heavily loaded server, when you need to monitor system health at a glance.

2.2.1.2 Usage examples

As Listing 2.2 shows, running vmstat without command-line parameters displays the averages of the statistics it has recorded since the system booted. Looking at us, sy, wa, and id under the CPU usage columns, this example shows that the system has been mostly idle since boot: the CPU spent 5% of its time executing user application code, 1% executing system code, and the remaining 94% idle.
[Listing 2.2 (screenshots not reproduced): vmstat in average mode]
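Since Listing 2.2 survives only as a screenshot, here is the invocation it illustrates (the 5%/1%/94% split quoted above comes from the book's machine; yours will differ):

```bash
# With no arguments, vmstat prints a single line of averages since boot.
# CPU columns: us = user, sy = system, id = idle, wa = waiting on I/O.
vmstat
```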

Although vmstat's since-boot averages can help determine how loaded the system has been, vmstat is most useful when run in sampling mode, as shown in Listing 2.3. In sampling mode, vmstat prints system statistics every delay seconds, count times. The first line of Listing 2.3 is, as before, the average since boot; the lines that follow are the periodic samples. This example shows very little system activity: the zeros under the b column tell us that no processes were blocked waiting for I/O, and the r column shows that fewer than one process was runnable whenever vmstat sampled.
[Listing 2.3 (screenshots not reproduced): vmstat in sampling mode]

vmstat is a great way to record how a system behaves under a given load or test condition: you can use vmstat to display the system's behavior and the Linux tee command to write the results to a file. (Chapter 8 describes the tee command in detail.) If only the delay parameter is passed, vmstat samples indefinitely; simply start it before the test begins and stop it after the test ends. The output file is plain text and can be loaded into a spreadsheet to see how the system reacts to load or to particular system events. Listing 2.4 shows the output of this approach. In this example, we can observe the interrupts and context switches that occurred on the system; the totals appear in the in and cs columns, respectively. The number of context switches is lower than the number of interrupts: the scheduler switches processes less often than the timer interrupt fires. This is most likely because the system is basically idle, and on most timer interrupts the scheduler has nothing to do and no reason to switch away from the idle process.
[Listing 2.4 (screenshots not reproduced): vmstat output captured during a test]
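A sketch of the capture technique just described (the log file name is arbitrary):

```bash
# Sample once per second indefinitely, showing the output on screen while
# also logging it to a file; stop with Ctrl-C when the test is finished.
vmstat 1 | tee /tmp/vmstat_test.log
```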

Recent versions of vmstat can also extract more detailed system statistics, as shown in Listing 2.5. The next chapter discusses the memory statistics; for now, look at the CPU statistics. The first group, the CPU ticks, shows how CPU time has been spent since boot, where a "tick" is a unit of time. Whereas the compact vmstat output shows only four CPU states (us, sy, id, and wa), here the distribution across all CPU tick states is displayed. We can also see the total number of interrupts and context switches. A new statistic is forks, which is roughly the number of new processes created since boot.
[Listing 2.5 (screenshots not reproduced): detailed vmstat statistics]
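The book's listing is a screenshot, but on current procps versions the boot-cumulative counters it describes (per-state CPU ticks, interrupts, context switches, and forks) are printed by the -s flag, so a rough equivalent is:

```bash
# One-shot dump of counters accumulated since boot, including CPU ticks
# per state, total interrupts, context switches, and forks.
vmstat -s
```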

vmstat provides a wealth of information about Linux system performance. It is one of the core tools when investigating system problems.

2.2.2 top (version 2.0.x)

top is the Swiss Army knife of Linux system-monitoring tools. It packs a good deal of overall system performance information onto a single screen, and the display can be changed interactively, so if a particular problem creeps up as your system runs, you can adjust what top shows. By default, top presents a list of processes in descending order of CPU consumption, which lets you quickly identify which program is hogging the CPU. top updates this list periodically, at a configurable delay whose initial value is 3 seconds.

2.2.2.1 CPU performance-related options

top is invoked with the following command line: top [d delay] [C] [H] [i] [n iter] [b]. top actually has two kinds of options: command-line options and runtime options. The command-line options determine how top displays its information. Table 2-3 lists the command-line options that affect the type and frequency of the performance statistics top displays.
[Table 2-3 (screenshot not reproduced): top 2.x command-line options]

While top is running, you may want to adjust the display slightly to investigate a specific issue. top's output is highly customizable; the options in Table 2-4 modify the statistics displayed while top is running:
[Table 2-4 (screenshot not reproduced): top 2.x runtime options]

The options given in Table 2-5 turn on or off the display of various system-level information. Turning off unnecessary statistics helps keep more processes on the screen.
[Table 2-5 (screenshot not reproduced): top 2.x system-wide display toggles]

Table 2-6 describes the different sorting modes top supports. Sorting by memory consumption is especially useful for finding out which process consumes the most memory.
[Table 2-6 (screenshot not reproduced): top 2.x sorting modes]

In addition to providing information about specific processes, top also provides overall system information. Table 2-7 gives these statistics.
[Table 2-7 (screenshots not reproduced): top 2.x system-wide statistics]

top provides a wealth of information about the different running processes and is an excellent way to identify resource hogs.

2.2.2.2 Usage examples

Listing 2.6 shows top in action. Once started, it updates the screen periodically until you exit it. This example shows some of the overall system statistics top can produce. First, we can see the load averages over the past 1, 5, and 15 minutes. Evidently this system has become busy (because of doom-3.x86): one CPU spends 90% of its time in user code, while the other spends only about 13% of its time in user code. Finally, we can see that 73 processes are sleeping and only 3 processes are running.
[Listing 2.6 (screenshots not reproduced): top 2.x display]

Now press the f key while top is running to bring up the configuration screen shown in Listing 2.7. When you press a field's key (A for PID, B for PPID, and so on), top toggles the display of that statistic on screen. After selecting all the statistics you need, press Enter to return to top's main display, which now shows the current values of the selected statistics. On the configuration screen, every currently selected field appears in uppercase on the Current Field Order line, with an asterisk (*) next to its name.

(Translator's note, from testing: press f to enter the interactive field-selection screen; Space selects and deselects an item, the Up and Down arrow keys move between items, and Esc confirms and returns to the main top display.)
[Listing 2.7 (screenshots not reproduced): top field-configuration screen]

To demonstrate top's customizability, Listing 2.8 shows a heavily configured output screen that displays only the statistics related to CPU usage:
[Listing 2.8 (screenshots not reproduced): customized top output]

top gives an overview of system resource usage with the focus on how the various processes consume those resources. Because its output format is friendly to humans but not to tools, it is best used when interacting with the system directly.

2.2.3 top (version 3.x)

In its latest versions, top has been rewritten from the ground up, and as a result many of the command-line and interactive options have changed. Although the basic idea is similar, top has been streamlined, and several different display modes have been added. As before, top presents the processes in a descending list with the biggest CPU consumer at the top.

2.2.3.1 CPU performance-related options

top is invoked with the following command line: top [-d delay] [-n iter] [-i] [-b]. As in version 2.x, top has two kinds of options: command-line options and runtime options. The command-line options determine how top displays its information. Table 2-8 lists the command-line options that affect the type and frequency of the performance statistics top displays.
[Table 2-8 (screenshot not reproduced): top 3.x command-line options]
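For non-interactive capture, these options combine naturally into batch mode (a sketch; the exact output layout varies slightly between procps releases):

```bash
# One batch-mode snapshot of the top display, suitable for redirection;
# -b = batch (no screen control), -n 1 = exit after one iteration.
top -b -n 1 > /tmp/top_snapshot.txt
```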

While top is running, you may want to adjust the display slightly to investigate a specific problem. As with version 2.x, its output is highly customizable; Table 2-9 lists options that modify the statistics displayed while top is running.
[Table 2-9 (screenshot not reproduced): top 3.x runtime options]

The options given in Table 2-10 turn on or off the display of various system-level information. Turning off unnecessary statistics helps keep more processes on the screen.
[Table 2-10 (screenshot not reproduced): top 3.x system-wide display toggles]

Like top v2.x, top v3.x provides overall system information in addition to per-process information. Table 2-11 lists these statistics.
[Table 2-11 (screenshots not reproduced): top 3.x system-wide statistics]
top provides a wealth of information about the running processes and is an excellent way to identify resource hogs. Version 3.x streamlines top and adds several different views of the same data.

2.2.3.2 Usage examples

Listing 2.9 shows top v3.x running. As before, it updates the screen periodically until you exit. Its statistics are the same as those of top v2.x, but some of the names have changed slightly.
[Listing 2.9 (screenshots not reproduced): top 3.x display]

Now press the f key while top is running to bring up the configuration screen shown in Listing 2.10. When you press a field's key (A for PID, B for PPID, and so on), top toggles the display of that statistic on screen. After selecting all the statistics you need, press Enter to return to top's main display, which now shows the current values of the selected statistics. On the configuration screen, every currently selected field appears in uppercase on the Current Field Order line with an asterisk (*) next to its name. Note that most of the statistics are the same as in v2.x; only the names have changed slightly.
[Listing 2.10 (screenshots not reproduced): top 3.x field-configuration screen]

Listing 2.11 shows top's new output mode, in which many different statistics are categorized and displayed on the same screen.
[Listing 2.11 (screenshots not reproduced): top 3.x categorized display mode]
top v3.x presents a slightly simpler interface than v2.x. It streamlines some aspects of top and provides a nice "summary" screen showing the main resource consumers in the system.

2.2.4 procinfo (display information from the /proc file system)

Like vmstat, procinfo provides a view of the system's overall characteristics. Although some of the information it provides is the same as vmstat's, it also reports the number of interrupts the CPU received from each device. Its output format is somewhat more readable than vmstat's, but it takes up more screen space.

2.2.4.1 CPU performance-related options

procinfo is invoked with the following command line: procinfo [-f] [-d] [-D] [-n sec] [-F file]. Table 2-12 describes the options that modify the output procinfo displays and the frequency of its samples.
[Table 2-12 (screenshot not reproduced): procinfo command-line options]

Table 2-13 shows the CPU statistics collected by procinfo.
[Table 2-13 (screenshot not reproduced): procinfo CPU statistics]
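A typical invocation, hedged because procinfo ships as a separate package and its exact flags can vary by version:

```bash
# Refresh the full-screen status display every 5 seconds instead of
# printing one screen and exiting.
procinfo -n 5
```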

Like vmstat and top, procinfo is a low-overhead command suitable for running continuously in a console or screen window. It gives a good picture of the system's health and performance.

2.2.4.2 Usage examples

Calling procinfo without any options produces the output shown in Listing 2.12: a single screen of status information, after which it exits. procinfo is more useful when updated periodically with the -n sec option, which lets you watch how system performance changes in real time.
[Listing 2.12 (screenshots not reproduced): procinfo output]

As you can see in Listing 2.12, procinfo provides a nice overview of the system. Looking at the user, nice, system, and idle times, we again find that the system is not very busy. One interesting detail is that procinfo reports more idle time than the system has been up (the uptime). This is because the system actually has 4 CPUs, so for every day of wall-clock time, four days of CPU time elapse. The load average confirms that the system has had relatively little recent work: on average, fewer than one process has been ready to run; a load average of 0.47 means a single process was ready to run only 47% of the time. On a machine with four CPUs, a great deal of CPU power is going to waste.

procinfo also gives us a good view of which devices in the system are causing interrupts. Here the graphics card (nvidia), hard disk controller (ide0), Ethernet device (eth0), and sound card (es1371) show relatively high interrupt counts, a pattern typical of a desktop workstation. procinfo's advantage is that it presents many system-level performance statistics on one screen, giving you a quick picture of overall system performance. It is short on network and disk performance details, but it provides good detail on CPU and memory performance statistics. One limitation that may matter is that procinfo does not report the time the CPU spends in the iowait, irq, or softirq states.

2.2.5 gnome-system-monitor

gnome-system-monitor is, in many ways, a graphical counterpart of top. It lets you monitor individual processes graphically and observe the system load with its charts.

2.2.5.1 CPU performance-related options

gnome-system-monitor can be called from the Gnome menu. (In Red Hat 9 and above, select the menu System Tools → System Monitor.) However, it can also be called with the following command: gnome-system-monitor. gnome-system-monitor has no relevant command line options to affect CPU performance measurements. However, some of the displayed statistics can be modified by selecting the Edit → Preferences menu item of gnome-system-monitor.

2.2.5.2 Usage examples

When you start gnome-system-monitor, it opens a window similar to Figure 2-1. This window shows the CPU and memory each process is using, as well as the parent/child relationships between processes. Figure 2-2 shows a graphical view of system load and memory usage, and this is what really distinguishes gnome-system-monitor from top: you can easily view the current state of the system and compare it with earlier states. The graphical presentation makes it easier and faster to determine the system's status and how its behavior changes over time, and it also makes it easier to browse system-level process information.
[Figure 2-1 and Figure 2-2 (screenshots not reproduced): the gnome-system-monitor process list and its system-load and memory-usage graphs]

2.2.6 mpstat (multiprocessor statistics)

mpstat is a fairly simple command that shows how CPU behavior changes over time. Its biggest advantage is that it prints the time of day next to the statistics, so you can correlate CPU usage with time. If you have multiple CPUs or a hyper-threaded CPU, mpstat can also break CPU usage down by processor, so you can see whether one processor is doing more work than the others. You can select a single processor to monitor or have mpstat monitor all of them.

2.2.6.1 CPU performance-related options

mpstat can be called with the following command line: mpstat [-P { cpu | ALL } ] [delay [count]]. As before, delay specifies the sampling interval and count specifies the number of samples. Table 2-14 explains the meaning of the mpstat command line options.
[Table 2-14 (screenshot not reproduced): mpstat command-line options]
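In practice, the two forms look like this (sysstat's mpstat; delay and count behave as described above):

```bash
# CPU statistics for processor 0 only: two samples, one second apart.
mpstat -P 0 1 2

# One line per processor, plus an "all" summary line.
mpstat -P ALL 1 2
```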

mpstat provides similar information to the other CPU performance tools, but it can break the information down for each individual processor in the system. Table 2-15 lists the statistics mpstat reports.
[Table 2-15 (screenshot not reproduced): mpstat CPU statistics]

mpstat is a great tool that provides a breakdown of the performance of each processor. Since mpstat gives a per-CPU breakdown, you can identify if any processor is becoming overloaded.

2.2.6.2 Usage examples

First, we ask mpstat to display statistics for the CPU with processor number 0, as shown in Listing 2.13.
[Listing 2.13 (screenshots not reproduced): mpstat -P 0 output]
Listing 2.14 shows the result of a similar command on a mostly idle hyper-threaded CPU, this time displaying statistics for all CPUs. One interesting detail in the output is that one CPU appears to handle all the interrupts. If the system were under a heavy I/O load, having every interrupt serviced by a single processor could become a bottleneck, with one CPU overloaded while the others wait. mpstat lets you detect exactly this situation: one CPU busy handling all the interrupts with no time to spare while the other processors sit idle.
[Listing 2.14 (screenshots not reproduced): mpstat -P ALL output]

mpstat can be used to determine whether the CPU is fully utilized and whether the usage is relatively balanced. By looking at the number of interrupts each CPU handles, it's possible to spot imbalances. For details on how to control interrupt routing, see the kernel source code under Documentation/IRQ-affinity.txt.

2.2.7 sar (system activity report)

sar takes yet another approach to collecting system data: it efficiently records collected system performance data in binary files that can be replayed later, making it a low-overhead way to record information about how the system is running. The sar command can record performance information, replay previously recorded information, and display real-time information about the current system, and its output can be formatted for import into a database or for further processing by other Linux commands.

2.2.7.1 CPU performance-related options

sar is invoked with the following command line: sar [options] [delay [count]]. Although sar reports on many different areas of Linux, its statistics come in two forms: one set consists of instantaneous values at the moment of sampling, the other of values that have changed since the last sample. Table 2-16 explains the sar command-line options.
[Table 2-16 (screenshot not reproduced): sar command-line options]
The system-level CPU performance statistics sar provides are similar to those of the tools we have already seen, under slightly different names, as shown in Table 2-17.
[Table 2-17 (screenshot not reproduced): sar CPU statistics]
One of sar's most significant advantages is that it lets you save many different types of time-stamped system data to log files for later retrieval and review. This feature proves very handy when you are trying to find out why a particular machine failed at a particular time.

2.2.7.2 Usage examples

The first command, shown in Listing 2.15, takes three CPU samples at one-second intervals and saves the results to the binary file /tmp/apache_test. The command produces no visual output and returns when it completes.
[Listing 2.15 (screenshots not reproduced): recording sar data to a binary file]

After the information is saved to the /tmp/apache_test file, we can display it in various formats. The default format is human-readable, as shown in Listing 2.16. The listing shows similar information to other system monitoring commands, where we can see how the processor is consuming its time at specific times.
[Listing 2.16 (screenshots not reproduced): replaying recorded sar data in the default format]
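The record-and-replay cycle of Listings 2.15 and 2.16, reconstructed as a sketch (sysstat's sar; the file name is the one used in the book):

```bash
# Record three one-second CPU-utilization samples into a binary file;
# redirecting stdout matches the book's note that nothing is displayed.
sar -u -o /tmp/apache_test 1 3 > /dev/null

# Replay the recorded data later in the default human-readable format.
sar -f /tmp/apache_test
```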
However, sar can also output statistics into a format that can be easily imported into a relational database, as shown in Listing 2.17. This helps save large amounts of performance data. Once imported into a relational database, this performance data can be analyzed using all relational database tools.
[Listing 2.17 (screenshots not reproduced): database-friendly sar output]
Finally, sar has an output format that is easily parsed by standard Linux tools such as awk, perl, python, or grep. As Listing 2.18 shows, this output can be fed into a script that flags interesting events and could even analyze the data for trends.
[Listing 2.18 (screenshots not reproduced): script-friendly sar output]
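In the sysstat versions shipped today, the database-friendly and script-friendly formats of Listings 2.17 and 2.18 are typically produced with sar's companion tool sadf rather than with sar flags alone; this is an assumption about your sysstat version, so check sadf(1):

```bash
# Semicolon-separated records, ready for import into a relational database.
sadf -d /tmp/apache_test -- -u

# Default sadf output: tab-separated fields, easy to parse with awk or grep.
sadf /tmp/apache_test -- -u
```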
Besides logging information to files, sar can also be used to observe the system in real time. In the example shown in Listing 2.19, the CPU state is sampled three times with one second between samples.
[Listing 2.19 (screenshots not reproduced): real-time sar CPU sampling]
The default display shows CPU information, but sar can display other statistics as well, such as the number of context switches per second and the number of memory pages swapped. In Listing 2.20, sar takes two samples one second apart; this time, we ask it to display the number of context switches and the number of processes created per second, together with load-average information. We can see that this machine has 163 processes in memory, none of which were running at the sample times, and that over the past minute an average of 1.12 processes were waiting to run.
[Listing 2.20 (screenshots not reproduced): sar context-switch, process-creation, and load-average output]
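A hedged reconstruction of the real-time queries in Listings 2.19 and 2.20 (flag coverage differs across sysstat versions; in newer releases, for example, process-creation counts were folded into -w):

```bash
# Listing 2.19: three live CPU samples at one-second intervals.
sar 1 3

# Listing 2.20: context switches (-w) plus run-queue length and load
# averages (-q), two samples one second apart.
sar -w -q 1 2
```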
As you can see, sar is a powerful tool capable of recording many different performance statistics. It provides a Linux-friendly interface that allows you to easily extract and analyze performance data.

2.2.8 oprofile

oprofile is a performance suite that uses the performance counters found in almost all modern processors to track where CPU time is being spent, both in the system as a whole and in individual processes. Besides measuring where CPU cycles go, oprofile can measure very low-level details of CPU execution: depending on the events the underlying processor supports, it can count cache misses, branch mispredictions, memory references, and floating-point operations. oprofile does not record every event that occurs; instead, it works with the processor's performance hardware to sample once every count events, where count is a value the user specifies when starting oprofile. The lower the count, the more accurate the results, but the higher oprofile's overhead. With count kept at a reasonable value, oprofile runs with very low overhead yet describes system performance with surprising accuracy. Sampling is very powerful, but there are some subtle pitfalls to be aware of when using it.

First, sampling may show that you spend 90% of your time in a particular routine, but it will not show why. A routine can consume a large number of cycles for two possible reasons. First, it may be a bottleneck whose execution simply takes a long time. Alternatively, its execution time may be reasonable but it may be called a very large number of times. There are usually two ways to tell which is the case: look at the samples for particularly hot lines, or instrument the code to count how many times the routine is called. The second problem with sampling is that you can never be quite sure where a function is being called from. Even after establishing that the function is called many times and tracing all the functions that call it, it is not necessarily clear which of those callers is responsible for the vast majority of the calls.

2.2.8.1 CPU performance-related options

oprofile is actually a suite of components that work together to collect CPU performance statistics. It has three main parts:
1. The oprofile kernel module controls the processor and enables and disables sampling.
2. The oprofile daemon collects the samples and saves them to disk.
3. The oprofile reporting tools take the collected samples and show the user how they relate to the applications running on the system.

The oprofile suite hides the driver and daemon operations behind the opcontrol command, which both selects the events the processor will sample and starts the sampling. For daemon control, opcontrol is invoked as: opcontrol [--start] [--stop] [--dump]. These options let you start and stop sampling and dump the samples from the daemon's memory to disk. When sampling, the oprofile daemon holds a large number of samples in internal buffers, but only samples that have been written (dumped) to disk can be analyzed. Writing to disk can be expensive, so oprofile does it only periodically. As a result, after running a test and profiling it with oprofile, you may not see results immediately; you have to wait until the daemon flushes its buffers to disk. This can be frustrating when you want to begin analyzing right away, so opcontrol's --dump option lets you force the samples out of the daemon's internal buffers to disk, making the data available for inspection as soon as a test has finished.

Table 2-18 describes the opcontrol options that control the daemon's operation.
[Table 2-18 (screenshot not reproduced): opcontrol daemon-control options]
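Put together, a legacy-oprofile session looks roughly like this (a sketch: opcontrol must run as root, and the uncompressed kernel path is an example; many distributions ship vmlinux only in a debug package):

```bash
# Tell oprofile where the uncompressed kernel image lives so kernel
# samples can be resolved to function names (example path).
opcontrol --vmlinux=/boot/vmlinux

opcontrol --start   # start the daemon and begin sampling (default event)
# ... run the workload being investigated ...
opcontrol --dump    # flush the daemon's in-memory samples to disk
opreport            # summarize the samples per executable
opcontrol --stop    # stop sampling
```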

By default, oprofile selects an event and a sampling frequency that are reasonable for the processor and kernel you are running on. However, many more events can be monitored than the default one.

To list the available events and select one, opcontrol is invoked with the following command line: opcontrol [--list-events] [--event=name:count:unitmask:kernel:user]

Event descriptions allow you to choose which event to sample, how often that event is sampled, and whether sampling occurs in kernel space, user space, or both. Table 2-19 describes the opcontrol command line options that allow you to select different events for sampling.
[Table 2-19 (screenshot not reproduced): opcontrol event-selection options]
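For example, listing the available events and then selecting one explicitly might look like this (the event name and count mirror the book's default for its test processor; valid names differ per CPU):

```bash
opcontrol --list-events   # show every event this CPU can count

# Sample CPU_CLK_UNHALTED once every 233869 occurrences, with unit mask 0,
# in both kernel space (1) and user space (1).
opcontrol --event=CPU_CLK_UNHALTED:233869:0:1:1
```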
After the samples have been collected and saved, oprofile provides a separate tool, opreport, for viewing them. opreport is invoked as follows: opreport [-r] [-t]. By default, opreport displays all the samples collected on the system and the executables (including the kernel) they are attributed to. The executable with the largest number of samples is listed first, followed by every other executable for which samples were recorded.

On a typical system, a few executables at the top of the list account for the majority of the samples, while a large number of executables contribute only a few samples each. To handle this, opreport lets you set a threshold so that only executables whose percentage of the samples meets or exceeds the threshold are displayed. opreport can also reverse the display order, so that the executables with the most samples are shown last; that way the most important data is printed at the end and does not scroll off the screen. Table 2-20 describes the opreport command-line options that customize the format of the sample output.
[Table 2-20 (screenshot not reproduced): opreport output options]
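For example (a sketch; -t takes a percentage threshold):

```bash
# Show only executables contributing at least 10% of the samples,
# sorted in reverse so the biggest consumer prints last.
opreport -r -t 10
```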
Again, oprofile is a complex tool, and the options shown here cover only its basic functionality. Later chapters introduce more of oprofile's features.

2.2.8.2 Usage examples

oprofile is a very powerful tool, but it can be somewhat difficult to install; Appendix B walks through installing and running it on several major Linux distributions. Before profiling, oprofile must be configured. The first command, shown in Listing 2.21, uses opcontrol to tell the oprofile suite where an uncompressed kernel image is located. oprofile needs this file so that it can attribute samples to the exact functions in the kernel.
[Listing 2.21 (screenshot not reproduced): pointing opcontrol at the uncompressed kernel image]
With the path to the current kernel set, we can start profiling. The command in Listing 2.22 tells oprofile to start sampling using its default event. This event varies by processor; on this processor it is CPU_CLK_UNHALTED, which samples all CPU cycles during which the processor is not halted. The count of 233869 means that the instruction being executed by the processor is sampled once every 233869 of these events.
[Listing 2.22 (screenshot not reproduced): starting sampling with the default event]
Now that sampling has begun, we want to look at the results. In Listing 2.23, we use the reporting tool to find out what is happening on the system: opreport reports what has been profiled so far.
[Listing 2.23 (screenshot not reproduced): an early opreport run]
Although profiling had been running for a little while, opreport tells us it cannot find any samples. This happens because opreport looks for samples on disk, while the oprofile daemon stores them in memory and only dumps them to disk periodically. When we ask opreport for the sample list, it finds nothing on disk and therefore reports that no samples were found. To get around this, we can force the daemon to dump the samples immediately by running opcontrol with the --dump option, as shown in Listing 2.24; after that, the collected samples can be viewed.
[Listing 2.24 (screenshot not reproduced): forcing a sample dump with opcontrol --dump]
After dumping the samples to disk, we ask oprofile for a report again, as shown in Listing 2.25, and this time we get results. The report begins with information about the processor the samples were collected on and the type of event monitored; it then lists, in descending order of event count, the executables in which the events occurred. We can see that the Linux kernel accounts for 50% of the total cycles, emacs for 14%, and libc for 12%. It is possible to dig deeper into each executable to determine which functions consume all the time, as Chapter 4 discusses.
[Listing 2.25 (screenshot not reproduced): opreport results after the dump]
When we started oprofile, we simply used the default event that opcontrol selected for us, but every processor has a very rich set of events that can be monitored. In Listing 2.26, we ask opcontrol to list all the events available on this particular CPU. The list is quite long, but in it we can see that in addition to CPU_CLK_UNHALTED we can also monitor DATA_MEM_REFS and DCU_LINES_IN. These are memory-subsystem events, which we discuss in later chapters.
[Listing 2.26 (screenshot not reproduced): opcontrol --list-events output]
Specifying the monitored events on the command line can be a bit cumbersome. Fortunately, oprofile's graphical front end, oprof_start, lets us start and stop sampling graphically and select the events we want without having to work out the exact command-line specification. In the oprof_start session shown in Figure 2-3, we tell oprofile to monitor the DATA_MEM_REFS and L2_LD events at the same time. The DATA_MEM_REFS event can tell us which applications make heavy use of the memory subsystem, and L2_LD which make heavy use of the L2 cache. On this particular processor, the hardware provides only two counters for sampling, so only two events can be monitored simultaneously. After collecting samples through the graphical interface, we can analyze the data in the same way as before. In Listing 2.27, we ask opreport to display its analysis of the collected samples, in a form similar to that used when we monitored cycles. In this example, we find that the libmad library accounts for 31% of the data memory references in the entire system, making it the largest user of the memory subsystem.
[Figure 2-3 and Listing 2.27 (screenshots not reproduced): selecting events in oprof_start and the resulting opreport output]
The opreport output lists every system library and executable that recorded any sampled events. Note that not all events are logged: because we are sampling, only a subset of the events is actually recorded. This is usually not a problem, since a library or executable that is a genuine performance problem will most likely cause costly events to occur many times, and if the sampling is random, those frequent, costly events will eventually be caught by the sampling code as well.

2.3 Summary of this chapter

This chapter focuses on system-level performance metrics regarding CPU usage. These metrics primarily show how the operating system and machine are running, rather than a specific application. This chapter shows how to use performance tools, such as sar and vmstat, to extract system-level performance information from a running system. These tools are the first line of defense when diagnosing system problems. They help determine how the system is performing and which subsystems or applications may be under stress. The next chapter will focus on system-level performance tools that can analyze the overall memory usage of the system.
