Linux performance evaluation: CPU performance evaluation tools vmstat / sar / iostat / uptime

Disclaimer: This is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/wjandy0211/article/details/90608463

CPU performance metrics:
1, CPU usage in user mode:
        CPU time spent running regular user processes
        CPU time spent running niced processes
        CPU time spent running real-time processes

2, CPU usage in system (kernel) mode:
        I/O management: interrupts and drivers
        memory management: page swapping
        process management: process startup and context switching

3, WIO: the proportion of time the CPU is idle while processes wait for disk I/O.

4, CPU idle ratio: idle time other than the WIO above.

5, CPU context switch rate

6, nice

7, real-time

8, run queue length

9, load average

Tools commonly used on Linux to monitor overall CPU performance:
§ mpstat: shows not only the average across all CPUs but also the figures for a specific CPU.

§ vmstat: shows only the average across all CPUs; also shows CPU run queue information.

§ iostat: shows only the average across all CPUs.

§ sar: like mpstat, shows both the average across all CPUs and the figures for a specific CPU.

§ top / htop: displays information close to that of ps, but also shows CPU consumption and refreshes the display at a user-specified interval.

 

One, mpstat
mpstat is short for Multiprocessor Statistics and is a real-time system monitoring tool. It reports CPU-related statistics, which are read from the /proc/stat file. On a multi-CPU system it can show not only the average status across all CPUs but also the status of a specific CPU. Only the CPU-related parameters of mpstat are described here. The mpstat syntax is as follows:

mpstat [-P {cpu|ALL}] [interval [count]]

The parameters have the following meanings:

Parameter        Description
-P {cpu|ALL}     which CPU to monitor; cpu takes a value in [0, number of CPUs - 1]
interval         sampling interval between two adjacent samples
count            number of samples; count can only be used together with interval

With no parameters, mpstat displays the averages of all statistics since system startup. When an interval is given, the first line shows the averages since system startup; from the second line on, each line shows the averages over the preceding interval. The CPU-related output fields have the following meanings:

Field: meaning over the interval; how the value is derived from /proc/stat

CPU: processor ID
user: percentage of CPU time spent in user mode, excluding niced processes; duser / dtotal * 100
nice: percentage of CPU time spent in user mode running niced processes (processes whose priority has been lowered with a positive nice value); dnice / dtotal * 100
system: percentage of CPU time spent in kernel mode; dsystem / dtotal * 100
iowait: percentage of time spent waiting for disk I/O; diowait / dtotal * 100
irq: percentage of time spent servicing hardware interrupts; dirq / dtotal * 100
soft: percentage of time spent servicing software interrupts; dsoftirq / dtotal * 100
idle: percentage of time the CPU was idle for any reason other than waiting for disk I/O; didle / dtotal * 100
intr/s: number of interrupts received by the CPU per second; dintr / interval (in seconds)

Total CPU time:

total_cur = user_cur + system_cur + nice_cur + idle_cur + iowait_cur + irq_cur + softirq_cur

total_pre = user_pre + system_pre + nice_pre + idle_pre + iowait_pre + irq_pre + softirq_pre

duser = user_cur - user_pre

dtotal = total_cur - total_pre

Here _cur denotes the current value and _pre denotes the value one interval earlier. All values in the table are rounded to two decimal places.
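
The delta formulas above can be sketched in Python as follows. This is only a minimal illustration, not how mpstat itself is implemented: the function names are made up for this example, the two counter lines are hypothetical values rather than real measurements, and only the first seven counters of the aggregate "cpu" line are used.

```python
# Order of the first seven counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def parse_cpu_line(line):
    """Turn a 'cpu ...' line from /proc/stat into a dict of counters."""
    parts = line.split()
    return {name: int(value) for name, value in zip(FIELDS, parts[1:])}


def cpu_percentages(pre, cur):
    """Return d<field> / dtotal * 100 per field, rounded to two decimals."""
    dtotal = sum(cur.values()) - sum(pre.values())
    if dtotal <= 0:
        # No time elapsed between the two samples; avoid dividing by zero.
        return {name: 0.0 for name in FIELDS}
    return {
        name: round((cur[name] - pre[name]) / dtotal * 100, 2)
        for name in FIELDS
    }


# Hypothetical counter values, made up for illustration only.
pre = parse_cpu_line("cpu 1000 0 500 8000 300 100 100")
cur = parse_cpu_line("cpu 1200 0 600 8600 350 125 125")
print(cpu_percentages(pre, cur))
```

On a live system, the two samples would come from reading the first line of /proc/stat twice, one interval apart.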


# mpstat -P ALL 2 10
Linux 2.6.18-53.el5PAE (localhost.localdomain) 03/28/2009

10:07:57 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
10:07:59 PM  all   20.75    0.00   10.50    1.50    0.25    0.25    0.00   66.75   1294.50
10:07:59 PM    0   16.00    0.00    9.00    1.50    0.00    0.00    0.00   73.50   1000.50
10:07:59 PM    1   25.76    0.00   12.12    1.52    0.00    0.51    0.00   60.10    294.00

Two, vmstat
Example (refreshing once every three seconds):

[root@localhost ~]# vmstat -n 3
procs -----------memory----------- ---swap-- -----io---- --system--- ------cpu------
 r  b   swpd   free   buff   cache   si   so    bi    bo    in    cs  us  sy  id  wa
 1  0    144 186164 105252 2386848    0    0    18   166    83     2  48  21  31   0
 2  0    144 189620 105252 2386848    0    0     0   177  1039  1210  34  10  56   0
 0  0    144 214324 105252 2386848    0    0     0    10  1071   670  32   5  63   0
 0  0    144 202212 105252 2386848    0    0     0   189  1035   558  20   3  77   0
 2  0    144 158772 105252 2386848    0    0     0   203  1065  2832  70  14  15   0

procs
r: the number of processes in the run queue. If r is consistently greater than the number of CPUs in the system, the system is running slowly and many processes are waiting for a CPU.
If r is greater than four times the number of available CPUs, the system is short of CPU, or the CPUs are too slow: most processes are waiting for a CPU, so processes run too slowly.

system
in: number of interrupts per second
cs: number of context switches per second
The larger these two values are, the more CPU time is consumed by the kernel.

cpu
us: percentage of CPU time consumed by user processes. A high us value means user processes are consuming a lot of CPU time; if it stays above 50% over the long term, consider optimizing the algorithm or the program (for example PHP / Perl code).
sy: percentage of CPU time consumed by the kernel. A high sy value means the kernel is consuming a lot of CPU; this is not a healthy sign and the cause should be investigated.
wa: percentage of CPU time spent waiting for I/O. A high wa value indicates serious I/O wait, which may be caused by a large amount of random disk access or by a disk bottleneck (block operations).
id: percentage of time the CPU is idle. If the idle time (id) stays at zero and the system time (sy) is twice the user time (us), the system is short of CPU resources.
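
As a rough illustration, the rules of thumb above can be written as a small Python check. This is only a sketch under stated assumptions: the function name and the messages are invented for this example, it looks at a single sample (whereas the text talks about values that persist over time), and the CPU count of 2 used in the example call is an assumption.

```python
def check_vmstat_sample(r, us, sy, idle, ncpu):
    """Apply the rules of thumb described above to one vmstat sample."""
    findings = []
    if r > 4 * ncpu:
        findings.append("run queue exceeds 4x the CPU count: CPU shortage")
    elif r > ncpu:
        findings.append("run queue exceeds the CPU count: processes wait for CPU")
    if us > 50:
        findings.append("us above 50%: consider optimizing the application")
    if idle == 0 and sy >= 2 * us:
        findings.append("no idle time and sy at least twice us: CPU shortage")
    return findings


# Last row of the vmstat output shown earlier (r=2, us=70, sy=14, id=15),
# on a machine assumed to have 2 CPUs.
print(check_vmstat_sample(r=2, us=70, sy=14, idle=15, ncpu=2))
```

In practice the check would be applied to several consecutive samples, and a finding would only count if it keeps recurring.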

Solution:
When the problems above occur, first adjust how the application uses the CPU so that it uses the CPU more efficiently, and consider adding more CPUs. To look at CPU usage in detail and find out which processes consume a lot of CPU time, combine vmstat with commands such as mpstat, ps aux, top, and prstat -a (on Solaris). In most cases the problem lies in the application; for example, poorly written SQL statements can cause this behavior.


Three, iostat
# iostat -c 2 10
Linux 2.6.18-53.el5PAE (localhost.localdomain) 03/28/2009

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          30.10    0.00    4.89    5.63    0.00   59.38

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           8.46    0.00    1.74    0.25    0.00   89.55

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          22.06    0.00   11.28    1.25    0.00   65.41

Four, sar
sar [options] [-A] [-o file] t [n]

On the command line, the parameters t and n together define the sampling interval and the number of samples: t is the sampling interval and is required; n is the number of samples and is optional, with a default value of 1. -o file stores the command's results in binary format in a file; here file is not a keyword but a file name. options are the command-line options. sar has many options; only the common ones are listed below:

-A: all reports combined.
-u: CPU utilization.
-v: status of the process, inode, file, and lock tables.
-d: disk usage report.
-r: memory and swap space usage statistics.
-g: serial I/O activity.
-b: buffer usage.
-a: file read and write activity.
-c: system call activity.
-q: run queue length and system load average.
-R: process activity.
-y: terminal device activity.
-w: system swapping activity.
-x {pid | SELF | ALL}: report statistics for the specified process ID; the SELF keyword reports on the sar process itself, and the ALL keyword reports on all system processes.

Analyzing CPU utilization with sar

# sar -u 2 10
Linux 2.6.18-53.el5PAE (localhost.localdomain)  03/28/2009
07:40:17 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
07:40:19 PM       all     12.44      0.00      6.97      1.74      0.00     78.86
07:40:21 PM       all     26.75      0.00     12.50     16.00      0.00     44.75
07:40:23 PM       all     16.96      0.00      7.98      0.00      0.00     75.06
07:40:25 PM       all     22.50      0.00      7.00      3.25      0.00     67.25
07:40:27 PM       all      7.25      0.00      2.75      2.50      0.00     87.50
07:40:29 PM       all     20.05      0.00      8.56      2.93      0.00     68.46
07:40:31 PM       all     13.97      0.00      6.23      3.49      0.00     76.31
07:40:33 PM       all      8.25      0.00      0.75      3.50      0.00     87.50
07:40:35 PM       all     13.25      0.00      5.75      4.00      0.00     77.00
07:40:37 PM       all     10.03      0.00      0.50      2.51      0.00     86.97
Average:          all     15.15      0.00      5.91      3.99      0.00     74.95

The output includes the following fields (runq-sz, the run queue of processes ready to run, belongs to the sar -q report):

  %user: percentage of CPU time spent in user mode.
  %nice: percentage of CPU time spent in user mode with a nice priority.
  %system: percentage of CPU time spent in system mode.
  %iowait: percentage of time the CPU spent waiting for I/O to complete.
  %steal: percentage of time a virtual CPU spent waiting involuntarily while the hypervisor was servicing another virtual processor.
  %idle: percentage of time the CPU was idle.

Of all these indicators, pay most attention to %iowait and %idle. If %iowait is too high, there is a disk I/O bottleneck. A high %idle means the CPU is mostly idle; if %idle is high but the system responds slowly, the CPU may be waiting for memory to be allocated, and the memory capacity should be increased. If %idle stays below 10, the CPU processing capacity of the system is relatively low, and the CPU is the resource that most needs attention.
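
The guidance on %iowait and %idle can be sketched as a small Python filter over sar -u text. This is an illustration only: the function names are invented, the parser assumes the 12-hour time format shown in the sample, and the 10% limit for %iowait is an assumption of this example (the text only says "too high"); the value 10 for %idle comes from the paragraph above.

```python
# First three samples of the sar -u output shown earlier.
SAMPLE = """\
07:40:17 PM  CPU  %user  %nice  %system  %iowait  %steal  %idle
07:40:19 PM  all  12.44   0.00     6.97     1.74    0.00  78.86
07:40:21 PM  all  26.75   0.00    12.50    16.00    0.00  44.75
07:40:23 PM  all  16.96   0.00     7.98     0.00    0.00  75.06
"""


def parse_sar_u(text):
    """Parse 'sar -u' text (12-hour clock format) into a list of row dicts."""
    lines = [line.split() for line in text.strip().splitlines()]
    header = lines[0][3:]  # skip the time, AM/PM and CPU columns
    rows = []
    for parts in lines[1:]:
        row = {name: float(value) for name, value in zip(header, parts[3:])}
        row["time"] = " ".join(parts[:2])
        rows.append(row)
    return rows


def flag_rows(rows, iowait_limit=10.0, idle_floor=10.0):
    """Return (time, reason) pairs for samples that cross either threshold."""
    flagged = []
    for row in rows:
        if row["%iowait"] > iowait_limit:
            flagged.append((row["time"], "high %iowait"))
        if row["%idle"] < idle_floor:
            flagged.append((row["time"], "low %idle"))
    return flagged


rows = parse_sar_u(SAMPLE)
print(flag_rows(rows))
```

Both thresholds are parameters, so they can be tuned to whatever counts as "too high" or "too low" on a given system.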

Analyzing the run queue length with sar:

# sar -q 2 10
Linux 2.6.18-53.el5PAE (localhost.localdomain) 03/28/2009
07:58:14 PM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
07:58:16 PM         0       493      0.64      0.56      0.49
07:58:18 PM         1       491      0.64      0.56      0.49
07:58:20 PM         1       488      0.59      0.55      0.49
07:58:22 PM         0       487      0.59      0.55      0.49
07:58:24 PM         0       485      0.59      0.55      0.49
07:58:26 PM         1       483      0.78      0.59      0.50
07:58:28 PM         0       481      0.78      0.59      0.50
07:58:30 PM         1       480      0.72      0.58      0.50
07:58:32 PM         0       477      0.72      0.58      0.50
07:58:34 PM         0       474      0.72      0.58      0.50
Average:            0       484      0.68      0.57      0.49

runq-sz: length of the run queue (processes ready to run).
plist-sz: number of processes and threads in the process list.
ldavg-1: system load average over the last minute.
ldavg-5: system load average over the last 5 minutes.
ldavg-15: system load average over the last 15 minutes.
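
On Linux, similar figures are also exposed in the /proc/loadavg file, whose single line holds the three load averages, a runnable/total count of scheduling entities, and the most recently created PID. The Python sketch below parses such a line. The function name and the sample string are hypothetical, and the runnable and total counts are only roughly comparable to runq-sz and plist-sz.

```python
def parse_loadavg(text):
    """Parse the single line found in /proc/loadavg."""
    one, five, fifteen, entities, _last_pid = text.split()
    runnable, total = entities.split("/")
    return {
        "ldavg-1": float(one),
        "ldavg-5": float(five),
        "ldavg-15": float(fifteen),
        "runnable": int(runnable),
        "total": int(total),
    }


# Hypothetical file content, made up for illustration only.
sample = "0.64 0.56 0.49 1/493 12345"
print(parse_loadavg(sample))
```

On a live Linux system, the input would come from open("/proc/loadavg").read() instead of the sample string.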
 
By the way, the meaning of load average
Load average can be understood as the number of processes running on or waiting for each CPU.
On Linux, the sar -q, uptime, w, and top commands all report the system load average. So what is the system load average?
  Load average is defined as the average number of tasks in the run queue over a given time interval. A process is placed in the run queue if it meets all of the following conditions:
  - it is not waiting for the result of an I/O operation
  - it has not voluntarily entered a wait state (i.e. it has not called 'wait')
  - it has not been stopped (e.g. waiting to terminate)
  Note that the Linux kernel also counts tasks in uninterruptible sleep (typically waiting for disk I/O) when it computes the load average, so heavy I/O can raise the figure as well.
  For example:

# uptime
  20:55:40 up 24 days,  3:06,  1 user,  load average: 8.13, 5.90, 4.94
The last part of the output gives the average number of processes in the run queue over the last 1, 5, and 15 minutes.
  In general, as long as the number of active processes per CPU is no more than 3, system performance is good; if the number of tasks per CPU is greater than 5, the machine has a serious performance problem. For the example above, assuming the system has two CPUs, the number of tasks per CPU is 8.13 / 2 = 4.065, which means the performance of the system is acceptable.
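
The per-CPU arithmetic above can be sketched in Python as follows. The function names and the three rating labels are invented for this example; the thresholds 3 and 5 and the two-CPU assumption come from the paragraph above.

```python
def load_per_cpu(load_average, ncpu):
    """Divide the system load average by the number of CPUs."""
    return load_average / ncpu


def rate_load(per_cpu):
    """Classify the per-CPU load using the thresholds 3 and 5 from the text."""
    if per_cpu <= 3:
        return "good"
    if per_cpu <= 5:
        return "acceptable"
    return "serious problem"


# 1-minute load average from the uptime example, assuming two CPUs.
per_cpu = load_per_cpu(8.13, 2)
print(per_cpu, rate_load(per_cpu))
```

On a live system, os.getloadavg() and os.cpu_count() from the Python standard library could supply the two inputs instead of the hard-coded values.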
 
