Detailed explanation of iostat command for Linux performance analysis

The iostat command (short for input/output statistics) is a common tool for IO performance analysis. This article introduces it from the following aspects:

  • Installation of iostat
  • iostat command line option description
  • iostat output content analysis
  • How to determine disk IO bottlenecks
  • iostat actual case

Installation of iostat

iostat is part of the sysstat package, which can be installed with yum.

yum install sysstat -y

iostat command line option description

The basic format of the iostat command is as follows:

iostat <options> <device name> <interval> <count>

  • -c: Display the CPU utilization report
  • -d: Display the device utilization report
  • --dec={ 0 | 1 | 2 }: Specify the number of decimal places to use; the default is 2
  • -g GROUP_NAME { DEVICE […] | ALL }: Display statistics for a group of devices
  • -H: Must be used with -g; display only the global statistics for the group, not the statistics for individual devices in the group
  • -h: Print sizes in a human-readable format
  • -j { ID | LABEL | PATH | UUID | … } [ DEVICE […] | ALL ]: Display persistent device names. ID, LABEL, etc. specify the type of persistent name
  • -k: Display statistics in KB per second
  • -m: Display statistics in MB per second
  • -N: Display the registered device mapper names for device mapper devices (useful for viewing LVM statistics)
  • -n: Display NFS statistics (available only in older sysstat versions)
  • -p [ { DEVICE [,…] | ALL } ]: Display statistics for block devices and their partitions
  • -t: Print a timestamp for each report. The timestamp format may depend on the S_TIME_FORMAT environment variable
  • -V: Print version information and exit
  • -x: Display extended statistics (additional columns)
  • -y: When multiple reports are displayed at a given interval, omit the first report, which contains statistics since system boot
  • -z: Omit output for any device that had no activity during the sample period

Common command line uses are as follows:

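iostat -d -k 1 10         # Show TPS and throughput (disk read/write rates in KB), sampling every 1s, 10 times in total
iostat -d -m 2            # Show TPS and throughput (disk read/write rates in MB), sampling every 2s
iostat -d -x -k 1 10      # Show detailed data such as device utilization (%util) and response time (await), sampling every 1s, 10 times in total
iostat -c 1 10            # Show CPU status, sampling every 1s, 10 times in total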

iostat output content analysis

Enter iostat on the Linux command line, and the following output will usually appear:

[root@localhost ~]# iostat
Linux 5.14.0-284.11.1.el9_2.x86_64 (localhost.localdomain)      08/07/2023      _x86_64_        (4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.31    0.01    0.44    0.02    0.00   99.22

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
dm-0              3.19        72.63        35.90         0.00     202007      99835          0
dm-1              0.04         0.84         0.00         0.00       2348          0          0
nvme0n1           3.36        93.22        36.64         0.00     259264     101903          0
sr0               0.02         0.75         0.00         0.00       2096          0          0

First line:

Linux 5.14.0-284.11.1.el9_2.x86_64 (localhost.localdomain)      08/07/2023      _x86_64_        (4 CPU)

Linux 5.14.0-284.11.1.el9_2.x86_64 is the kernel version, localhost.localdomain is the host name, 08/07/2023 is the current date, _x86_64_ is the CPU architecture, and (4 CPU) is the number of CPUs in the system.

Next, look at the second part, which contains CPU-related information. It is similar to the CPU summary in the output of the top command.

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.31    0.01    0.44    0.02    0.00   99.22

Description of the CPU columns:

  • %user: The percentage of time the CPU spends in user mode.
  • %nice: The percentage of time the CPU spends in user mode running processes with a nice priority.
  • %system: The percentage of time the CPU spends in system (kernel) mode.
  • %iowait: The percentage of time the CPU is idle while there are outstanding IO requests.
  • %steal: The percentage of time the virtual CPU spends in involuntary wait while the hypervisor services another virtual processor.
  • %idle: The percentage of time the CPU is idle with no outstanding IO requests.

The %iowait indicator deserves a closer look, because it is easily misunderstood.

First of all, keep in mind that %iowait is a subset of idle time. It is calculated as follows:

If the CPU is idle at a clock tick, the kernel performs the following checks:

  • 1. Is there an outstanding local disk IO request initiated from this CPU?
  • 2. Is there an outstanding IO request to a network-mounted disk (such as NFS) initiated from this CPU?

If either condition holds, the iowait counter is incremented by 1. Otherwise, the idle counter is incremented by 1.

For example, suppose the interval is 1s and there are 100 clock ticks in total. If the sys count is 2, the user count is 3, the nice count is 0, the iowait count is 1, the steal count is 0, and the idle count is 94, then the percentages are 2%, 3%, 0%, 1%, 0%, and 94% respectively.
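The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the percentage calculation; the function name cpu_percentages is made up for this sketch and is not part of iostat or the kernel.

```python
def cpu_percentages(ticks):
    """Convert per-state tick counts into percentages of the interval."""
    total = sum(ticks.values())
    if total == 0:
        return {state: 0.0 for state in ticks}
    return {state: 100.0 * count / total for state, count in ticks.items()}


# Tick counts from the example: 100 clock ticks in a 1s interval.
sample = {"user": 3, "nice": 0, "system": 2, "iowait": 1, "steal": 0, "idle": 94}

for state, pct in cpu_percentages(sample).items():
    print(f"%{state}: {pct:.2f}")
```

Because iowait and idle ticks are both counted only while the CPU is idle, their sum (95% here) is the total idle time of the CPU.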

Now that we know how iowait is calculated, let’s look at some common misunderstandings about it:

  • iowait means the CPU is waiting for IO to complete and cannot accept other tasks in the meantime
    From the definition above, iowait means that the CPU is idle while there are outstanding disk IO requests. In other words, the first condition for iowait is that the CPU is idle, and an idle CPU can accept tasks: it is idle only because there are currently no runnable tasks. Why are there no runnable tasks? They may be waiting for some event, such as disk IO, keyboard input, or network data.

  • High iowait indicates an IO bottleneck
    Since the Linux documentation says little about iowait, it is easy to misunderstand. The first condition for iowait is that the CPU is idle, that is, no process is runnable on it; the second is that there are outstanding IO requests. Putting the two together easily leads to the following interpretation: processes are sleeping because they are waiting for IO requests to complete, so a higher %iowait means that processes sleep longer waiting for IO, or that more processes are sleeping while waiting for IO. This seems reasonable at first glance, but it is actually wrong.

A rise in iowait does not necessarily mean that more processes are waiting for IO, nor that they are waiting longer. The following figure illustrates this:

[Figure: iowait]

Across the three cases in the figure, the IO is exactly the same. Only the idle time of the CPU changes, yet the value of iowait changes greatly. Therefore, %iowait alone cannot be used to conclude that IO is a bottleneck.
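The same point can be shown with a small simulation. The timelines below are hypothetical and are not taken from the figure; the sketch only applies the counting rule described earlier (a tick counts as iowait when the CPU is idle and an IO request is outstanding).

```python
def iowait_percent(cpu_busy, io_outstanding):
    """Return %iowait for a timeline of clock ticks.

    cpu_busy[i] is True if the CPU runs a task during tick i;
    io_outstanding[i] is True if a disk IO request is pending during tick i.
    """
    ticks = len(cpu_busy)
    iowait = sum(1 for busy, io in zip(cpu_busy, io_outstanding) if not busy and io)
    return 100.0 * iowait / ticks


# Identical IO pattern in all three cases: IO is pending during ticks 0-5 of 10.
io = [True] * 6 + [False] * 4

always_idle = [False] * 10                 # CPU never runs a task
busy_during_io = [True] * 6 + [False] * 4  # CPU is busy exactly while IO is pending
half_busy = [True] * 3 + [False] * 7       # CPU is busy for the first 3 ticks only

print(iowait_percent(always_idle, io))     # 60.0
print(iowait_percent(busy_during_io, io))  # 0.0
print(iowait_percent(half_busy, io))       # 30.0
```

The IO load is the same in every run, but %iowait ranges from 0% to 60% depending only on how busy the CPU is.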

Now go back to iostat and look at the output results of the third part.

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
dm-0              3.19        72.63        35.90         0.00     202007      99835          0
dm-1              0.04         0.84         0.00         0.00       2348          0          0
nvme0n1           3.36        93.22        36.64         0.00     259264     101903          0
sr0               0.02         0.75         0.00         0.00       2096          0          0

The meaning of each column is as follows:

  • Device: The name of the device (or partition) as listed under the /dev directory
  • tps: The number of transfers per second issued to the device. A transfer is an I/O request to the device, and multiple logical requests may be combined into a single I/O request. A transfer is of indeterminate size
  • kB_read/s: The amount of data read from the device per second, in KB/s
  • kB_wrtn/s: The amount of data written to the device per second, in KB/s
  • kB_dscd/s: The amount of data discarded for the device per second, in KB/s (discards are requests such as TRIM that tell the device which blocks are no longer in use; they are not lost data)
  • kB_read: The total amount of data read from the device, in KB
  • kB_wrtn: The total amount of data written to the device, in KB
  • kB_dscd: The total amount of data discarded for the device, in KB
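For scripting, the device table can be turned into records keyed by these column names. The following is a minimal sketch, assuming plain numeric output (no -h option) and a "." decimal separator; parse_device_report is a helper name invented here, not a sysstat API.

```python
def parse_device_report(text):
    """Parse the 'Device ...' table of iostat output into a list of dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0].rstrip(":") == "Device":
            header = fields[1:]          # column names after "Device"
        elif header is not None and fields:
            values = [float(v) for v in fields[1:]]
            rows.append({"Device": fields[0], **dict(zip(header, values))})
        elif header is not None:
            header = None                # a blank line ends the table
    return rows


SAMPLE = """\
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.31    0.01    0.44    0.02    0.00   99.22

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
dm-0              3.19        72.63        35.90         0.00     202007      99835          0
nvme0n1           3.36        93.22        36.64         0.00     259264     101903          0
"""

for row in parse_device_report(SAMPLE):
    print(row["Device"], row["tps"], row["kB_read/s"], row["kB_wrtn/s"])
```

The sample text is copied from the output shown earlier; in practice the text would come from running iostat and capturing its standard output.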

Note that if you use a command such as iostat -dk 2 to collect data every 2s, kB_read is the total amount of data read from the disk within each 2s interval, kB_wrtn is the total amount of data written to the disk within each 2s interval, and kB_dscd is the total amount of data discarded within each 2s interval. The first report is an exception: unless -y is given, it shows statistics since boot.

If no interval is given, as in iostat -dk, then kB_read is the total amount of data read from the disk since boot, kB_wrtn is the total amount of data written to the disk since boot, and kB_dscd is the total amount of data discarded since boot.
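The relationship between since-boot totals and per-interval values can be sketched as a simple difference of two cumulative counters. The sketch assumes counters like the "sectors read" field of /proc/diskstats, which counts 512-byte sectors; the sample values are hypothetical.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def interval_kb(prev_sectors, curr_sectors, interval_s):
    """Return (kB transferred during the interval, kB per second)."""
    kb = (curr_sectors - prev_sectors) * SECTOR_BYTES / 1024
    return kb, kb / interval_s


# Hypothetical cumulative "sectors read" values sampled 2s apart.
kb, kb_per_s = interval_kb(prev_sectors=518_528, curr_sectors=522_624, interval_s=2)
print(kb, kb_per_s)  # 2048.0 1024.0
```

With an interval, kB_read corresponds to the first value (the delta); without one, it corresponds to the cumulative counter itself converted to KB.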

In addition, iostat can output extended columns with the -x option, for example:

Device            r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wkB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dkB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
dm-0             2.16     47.69     0.00   0.00    0.56    22.06    0.88     25.70     0.00   0.00    2.66    29.16    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.25
dm-1             0.02      0.49     0.00   0.00    0.30    23.72    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00
nvme0n1          2.28     59.76     0.01   0.38    0.54    26.18    0.74     26.13     0.15  17.11    1.89    35.12    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.26
sr0              0.01      0.44     0.00   0.00    0.56    38.81    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00

Read indicators:

  • r/s: The number of read requests completed by the device per second (after merges)
  • rkB/s: The amount of data read from the device per second, in KB
  • rrqm/s: The number of read requests merged per second that were queued to the device (when several queued requests target adjacent blocks, the kernel merges them into one request before issuing it to the device)
  • %rrqm: The percentage of read requests that were merged before being sent to the device
  • r_await: The average time (in milliseconds) for a read request to be served, including both the time spent waiting in the kernel queue and the time the device spends servicing the request
  • rareq-sz: The average size (in KB) of the read requests issued to the device

Write indicators:

  • w/s: The number of write requests completed by the device per second (after merges)
  • wkB/s: The amount of data written to the device per second, in KB
  • wrqm/s: The number of write requests merged per second that were queued to the device
  • %wrqm: The percentage of write requests that were merged before being sent to the device
  • w_await: The average time (in milliseconds) for a write request to be served, including both the time spent waiting in the kernel queue and the time the device spends servicing the request
  • wareq-sz: The average size (in KB) of the write requests issued to the device

Discard indicators:

  • d/s: The number of discard requests completed by the device per second (after merges)
  • dkB/s: The amount of data discarded for the device per second, in KB
  • drqm/s: The number of discard requests merged per second that were queued to the device
  • %drqm: The percentage of discard requests that were merged before being sent to the device
  • d_await: The average time (in milliseconds) for a discard request to be served, including both the time spent in the queue and the time spent servicing the request
  • dareq-sz: The average size (in KB) of the discard requests issued to the device

Other indicators:

  • f/s: The number of flush requests completed by the device per second (after merges)
  • f_await: The average time (in milliseconds) for a flush request to be served
  • aqu-sz: The average queue length of the requests issued to the device
  • %util: The percentage of elapsed time during which I/O requests were issued to the device (the bandwidth utilization of the device). It is not the percentage of CPU consumed by IO. For a device that services requests serially, saturation occurs when this value approaches 100%. For devices that process requests in parallel, such as RAID arrays and modern SSDs, this number does not reflect their performance limits. For serial devices, a high value indicates that IO is likely a bottleneck, but a low value does not necessarily mean that IO is not a bottleneck. As a rule of thumb, I/O pressure is relatively high when %util exceeds 70%. You can also cross-check with vmstat: the b column (the number of processes blocked waiting for resources) and the wa column (the percentage of CPU time spent waiting for I/O; I/O pressure is high when wa exceeds 30%)
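To make the relationships between these columns concrete, here is a rough sketch of how the read-side metrics, aqu-sz, and %util could be derived from two samples of cumulative counters like those in /proc/diskstats. The field names in DiskSample and the sample numbers are assumptions made for illustration; this is not the sysstat source code.

```python
from collections import namedtuple

# Subset of the cumulative per-device counters found in /proc/diskstats.
DiskSample = namedtuple(
    "DiskSample",
    "reads reads_merged sectors_read ms_reading io_ms weighted_io_ms",
)


def read_metrics(prev, curr, interval_s):
    """Derive iostat-style metrics from two cumulative samples."""
    d = DiskSample(*(c - p for p, c in zip(prev, curr)))  # per-interval deltas
    interval_ms = interval_s * 1000.0
    kb_read = d.sectors_read * 512 / 1024
    issued = d.reads + d.reads_merged
    return {
        "r/s": d.reads / interval_s,
        "rkB/s": kb_read / interval_s,
        "rrqm/s": d.reads_merged / interval_s,
        "%rrqm": 100.0 * d.reads_merged / issued if issued else 0.0,
        "r_await": d.ms_reading / d.reads if d.reads else 0.0,
        "rareq-sz": kb_read / d.reads if d.reads else 0.0,
        "aqu-sz": d.weighted_io_ms / interval_ms,
        "%util": min(100.0, 100.0 * d.io_ms / interval_ms),
    }


# Hypothetical samples taken 1s apart.
prev = DiskSample(1000, 10, 80000, 500, 900, 1200)
curr = DiskSample(1200, 60, 88000, 700, 1400, 2200)

for name, value in read_metrics(prev, curr, 1).items():
    print(f"{name}: {value:.2f}")
```

In this example the device was busy for 500 ms of the 1000 ms interval, which gives a %util of 50%. The write and discard columns would follow the same pattern with their own counters.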

Actual test cases

For the test, use the dd command to simulate disk writes.

dd if=/dev/zero of=./a.dat bs=8k count=1M oflag=direct

  • if=filename: The input file name, that is, the source file. The default is standard input.
  • of=filename: The output file name, that is, the destination file. The default is standard output.
  • bs=bytes: Set both the input and output block size to the given number of bytes.
  • count=blocks: Copy only the given number of input blocks; the block size is the number of bytes specified by ibs (set here through bs).
  • oflag=direct: Use direct I/O for the output, bypassing the page cache.

The above command writes 8 GiB of data to the file a.dat.
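The total size follows from multiplying bs by count. A minimal sketch of that arithmetic, handling only a subset of dd's size suffixes (k/K, M, and G as binary multipliers); dd_size is a helper name invented for this example.

```python
# Binary multipliers for a subset of dd's size suffixes.
SUFFIXES = {"k": 1024, "K": 1024, "M": 1024**2, "G": 1024**3}


def dd_size(value):
    """Convert a dd size argument such as '8k' or '1M' to an integer."""
    if value and value[-1] in SUFFIXES:
        return int(value[:-1]) * SUFFIXES[value[-1]]
    return int(value)


total_bytes = dd_size("8k") * dd_size("1M")  # bs * count
print(total_bytes, total_bytes / 1024**3)    # 8589934592 8.0
```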

Open another window and use iostat to monitor.

First look at the output of iostat -kx 1.

[Screenshot: output of iostat -kx 1]

From the wkB/s indicator, we can see that the current write speed of the disk is around 70 MB/s. From the %util indicator, we can see that disk utilization has reached 100%, so the disk is running at full capacity. At this point, iowait is about 15%.

Next, look at the output of iostat -dk 1:

[Screenshot: output of iostat -dk 1]

From the kB_wrtn/s indicator, we can likewise see that the current write speed of the disk is around 70 MB/s.


Origin blog.csdn.net/qq_31442743/article/details/132278407