Linux disk I/O monitoring

When troubleshooting production Linux servers, we usually reach for commands such as top, free, netstat, and df -h to investigate CPU, memory, network, and disk problems. Sometimes we need a closer look at disk I/O, so this article focuses on how to view disk I/O information on Linux.

1. iostat:

1. Basic usage:

$ iostat -d -k 1 10
1) The -d option displays device (disk) usage; -k forces columns that would otherwise be reported in blocks to use kilobytes; 1 10 means the report is refreshed every 1 second, 10 times in total.


2) Meaning:

  • tps: The number of transfers per second issued to the device. "One transfer" means "one I/O request". Multiple logical requests may be combined into one I/O request, so the size of a single transfer is not fixed.
  • kB_read/s: The amount of data read from the device per second;
  • kB_wrtn/s: The amount of data written to the device per second;
  • kB_read: The total amount of data read;
  • kB_wrtn: The total amount of data written. All of these values are in kilobytes.

In the example output, we can see statistics for the disk sda and each of its partitions. The total TPS of the disk at that moment was 39.29, followed by the TPS of each partition. (Because these are instantaneous values, the total TPS is not strictly equal to the sum of the partitions' TPS.)

3) Specify the monitored device name as sda:

iostat -d sda 2

2. -x parameter:

1) Using the -x parameter we can get more statistics.


2) Meaning:

  • rrqm/s: The number of read requests for this device that are merged per second (when requests queued for the device cover adjacent blocks, the kernel's I/O scheduler merges them into a single request);
  • wrqm/s: The number of write requests for this device that are merged per second;
  • r/s: The number of read requests issued to the device per second;
  • w/s: The number of write requests issued to the device per second;
  • rsec/s: The number of sectors read per second;
  • wsec/s: The number of sectors written per second;
  • rkB/s: The number of kilobytes read from the device per second;
  • wkB/s: The number of kilobytes written to the device per second;
  • avgrq-sz: The average size of the requests issued to the device, in sectors;
  • avgqu-sz: The average request queue length. The shorter the queue, the better;
  • await: The average time taken by each I/O request, in milliseconds. This can be understood as the I/O response time. As a rule of thumb, it should be below 5 ms; above 10 ms is considered high. It includes both queue time and service time, so await is generally greater than svctm;
  • svctm: The average service time per device I/O operation, in milliseconds. If svctm is close to await, there is almost no I/O waiting and disk performance is good;
  • %util: The time spent processing I/O during the sampling interval, divided by the length of the interval. For example, if the interval is 1 second and the device spends 0.8 seconds processing I/O and 0.2 seconds idle, then %util = 0.8/1 = 80%, so this value shows how busy the device is.

In general, if %util is 100%, the device is running at full capacity (although in a multi-disk setup the disks can serve requests concurrently, so even at 100% the disks are not necessarily the bottleneck).

3. -c parameter:

iostat can also be used to get some cpu status values:

iostat -c 1 10
avg-cpu: %user %nice %sys %iowait %idle
1.98 0.00 0.35 11.45 86.22
avg-cpu: %user %nice %sys %iowait %idle
1.62 0.00 0.25 34.46 63.67
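
Output like this can be filtered for samples with high %iowait. The following is only a minimal sketch and not part of the original commands: the function name high_iowait and the LIMIT variable are hypothetical, and the default threshold of 20 is an arbitrary value chosen for illustration. It locates the %iowait column from the avg-cpu header instead of assuming a fixed position, since the set of columns differs between sysstat versions.

```shell
#!/bin/sh
# high_iowait: read `iostat -c` output on stdin and print every sample whose
# %iowait is above LIMIT (hypothetical variable, default 20, illustrative only).
high_iowait() {
    awk -v limit="${LIMIT:-20}" '
        $1 == "avg-cpu:" {
            pos = 0
            for (i = 2; i <= NF; i++)
                if ($i == "%iowait") pos = i - 1   # value row has no label field
            want = 1                               # next non-empty line holds the values
            next
        }
        want && NF > 0 {
            want = 0
            n++
            if (pos > 0 && $pos + 0 > limit + 0)
                printf "sample %d: %%iowait=%.2f\n", n, $pos
        }'
}

# Sample data copied from the output above; live use would be e.g.
#   iostat -c 1 10 | high_iowait
high_iowait <<'EOF'
avg-cpu: %user %nice %sys %iowait %idle
1.98 0.00 0.35 11.45 86.22
avg-cpu: %user %nice %sys %iowait %idle
1.62 0.00 0.25 34.46 63.67
EOF
```

On the two samples above, only the second one (34.46) exceeds the default limit, so a single line is printed for it.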

4. Common methods:

iostat -d -k 1 10 #View TPS and throughput information (disk read and write speed in KB)
iostat -d -m 2 #View TPS and throughput information (disk read and write speed in MB)
iostat -d -x -k 1 10 #View device usage (%util), response time (await)
iostat -c 1 10 #View cpu status

Example:

iostat -d -x -k 1
Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda 1.56 28.31 7.84 31.50 43.65 3.16 21.82 1.58 1.19 0.03 0.80 2.61 10.29
sda 1.98 24.75 419.80 6.93 13465.35 253.47 6732.67 126.73 32.15 2.00 4.70 2.00 85.25
sda 3.06 41.84 444.90 54.08 14204.08 2048.98 7102.04 1024.49 32.57 2.10 4.21 1.85 92.24
In the last two samples, the average response time (await) of the disk is below 5 ms while the utilization (%util) is above 80%. The disk is still responding normally, but it is already busy.
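
The same check can be scripted. The sketch below is an illustration under stated assumptions, not a definitive tool: flag_busy, AWAIT_MS and UTIL_PCT are hypothetical names, and the defaults (10 ms and 80%) are simply the rule-of-thumb values used in this article. It reads column positions from the Device header line, so it only assumes that columns named await and %util exist, as in the layout above; newer sysstat versions report r_await and w_await instead and would need a different column name.

```shell
#!/bin/sh
# flag_busy: read `iostat -d -x -k` output on stdin and print the device lines
# whose await exceeds AWAIT_MS or whose %util exceeds UTIL_PCT.
# AWAIT_MS and UTIL_PCT are hypothetical variables (defaults 10 and 80).
flag_busy() {
    awk -v await_ms="${AWAIT_MS:-10}" -v util_pct="${UTIL_PCT:-80}" '
        /^Device/ {                        # header line: remember column positions
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        NF > 1 && ("await" in col) && ("%util" in col) {
            a = $(col["await"])
            u = $(col["%util"])
            if (a + 0 > await_ms + 0 || u + 0 > util_pct + 0)
                printf "%s await=%.2fms util=%.2f%%\n", $1, a, u
        }'
}

# Sample data copied from the example above; live use would be e.g.
#   iostat -d -x -k 1 | flag_busy
flag_busy <<'EOF'
Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda 1.56 28.31 7.84 31.50 43.65 3.16 21.82 1.58 1.19 0.03 0.80 2.61 10.29
sda 1.98 24.75 419.80 6.93 13465.35 253.47 6732.67 126.73 32.15 2.00 4.70 2.00 85.25
sda 3.06 41.84 444.90 54.08 14204.08 2048.98 7102.04 1024.49 32.57 2.10 4.21 1.85 92.24
EOF
```

With the sample data, the first line (10.29% utilization) is skipped and the other two are reported because their %util is above 80, even though await stays under the 10 ms limit.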

iostat -d -k 1 | grep sda10
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda10 60.72 18.95 71.53 395637647 1493241908
sda10 299.02 4266.67 129.41 4352 132
sda10 483.84 4589.90 4117.17 4544 4076
sda10 218.00 3360.00 100.00 3360 100
sda10 546.00 8784.00 124.00 8784 124
sda10 827.00 13232.00 136.00 13232 136
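As seen above, the disk averages roughly 400 transfers per second, with reads of about 5 MB per second and writes of about 1 MB per second. Note that the first line of iostat output reports averages since boot, while the following lines cover one interval each.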
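
These averages can be reproduced with a short awk filter. This is a sketch under assumptions rather than a finished tool: avg_iostat is a hypothetical name, and the script relies on the column order shown above (device, tps, kB_read/s, kB_wrtn/s). It averages every line it is given, including the first since-boot line, so drop that line from the input if only the interval samples are of interest.

```shell
#!/bin/sh
# avg_iostat DEVICE: read `iostat -d -k` report lines on stdin and print the
# mean tps, kB_read/s and kB_wrtn/s for one device.
# Assumes the column order shown above: Device tps kB_read/s kB_wrtn/s ...
avg_iostat() {
    awk -v dev="$1" '
        $1 == dev { tps += $2; rd += $3; wr += $4; n++ }
        END {
            if (n == 0) { print "no samples for " dev; exit 1 }
            printf "samples=%d tps=%.1f read_kB/s=%.1f write_kB/s=%.1f\n",
                   n, tps / n, rd / n, wr / n
        }'
}

# Sample data copied from the output above; live use would be e.g.
#   iostat -d -k 1 10 | avg_iostat sda10
avg_iostat sda10 <<'EOF'
sda10 60.72 18.95 71.53 395637647 1493241908
sda10 299.02 4266.67 129.41 4352 132
sda10 483.84 4589.90 4117.17 4544 4076
sda10 218.00 3360.00 100.00 3360 100
sda10 546.00 8784.00 124.00 8784 124
sda10 827.00 13232.00 136.00 13232 136
EOF
```

For the six lines above this prints samples=6 tps=405.8 read_kB/s=5708.6 write_kB/s=779.7, which matches the rough figures quoted in the text.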


2. iotop:

1. iotop is a disk I/O monitoring tool written in Python with a top-like interface. It can be installed with yum install iotop. When run, it displays the following fields:


  • The DISK READ and DISK WRITE fields show the I/O bandwidth used on the block device during the sampling period;
  • The SWAPIN and IO fields show the share of time the current process or thread spent swapping in pages and waiting for I/O;
  • The PRIO field shows the I/O priority;
  • Total DISK READ and Total DISK WRITE show the overall read and write rates.

2. Common parameters:

-o --only Show only processes or threads that are actually doing I/O. Can be toggled with the shortcut key o

-b --batch Non-interactive mode, can be used to save the output

-n Number of refreshes

-d Refresh interval

-P Only display processes, not threads

-p Monitor the specified process or thread

-k Express I/O bandwidth in KB. By default, iotop uses B/s, K/s, M/s to represent I/O bandwidth.

-u Monitor the I/O operations of the specified user

-t Add a timestamp to each line of output

-q Display column names only at the first output

-qq Do not display column names

-qqq Do not display total I/O information

$ sudo iotop -b -d 1 -n 5  -o  -u mongod -P  -p 1524  -qqq
 1524 be/4 mongod      0.00 B/s  527.27 K/s  0.00 %  0.54 % mongod -f /etc/mongod.conf
 1524 be/4 mongod      0.00 B/s  566.84 K/s  0.00 %  1.40 % mongod -f /etc/mongod.conf
 1524 be/4 mongod      0.00 B/s  766.42 K/s  0.00 %  0.70 % mongod -f /etc/mongod.conf
 1524 be/4 mongod      0.00 B/s  946.21 K/s  0.00 %  0.37 % mongod -f /etc/mongod.conf
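
Batch output like this is easy to post-process. The sketch below is a hedged illustration, not a feature of iotop: avg_write_kbs is a hypothetical name, the column order is assumed to be the one shown above (PID, PRIO, USER, read value and unit, write value and unit, ...), and treating K and M as multiples of 1024 is an assumption. Running iotop with -k avoids the unit conversion altogether.

```shell
#!/bin/sh
# avg_write_kbs: read `iotop -b -qqq` lines on stdin and print the mean
# DISK WRITE rate in K/s. Assumes the column order shown above:
#   PID PRIO USER READ unit WRITE unit SWAPIN % IO % COMMAND
avg_write_kbs() {
    awk '
        function to_kb(v, unit) {          # assumption: 1 K = 1024 B, 1 M = 1024 K
            if (unit == "B/s") return v / 1024
            if (unit == "M/s") return v * 1024
            return v                       # already K/s
        }
        NF >= 11 && $7 ~ /^[BKM]\/s$/ { sum += to_kb($6, $7); n++ }
        END { if (n) printf "samples=%d avg_write=%.1f K/s\n", n, sum / n }'
}

# Sample data copied from the output above; live use would be e.g.
#   sudo iotop -b -d 1 -n 5 -o -P -p 1524 -qqq | avg_write_kbs
avg_write_kbs <<'EOF'
 1524 be/4 mongod      0.00 B/s  527.27 K/s  0.00 %  0.54 % mongod -f /etc/mongod.conf
 1524 be/4 mongod      0.00 B/s  566.84 K/s  0.00 %  1.40 % mongod -f /etc/mongod.conf
 1524 be/4 mongod      0.00 B/s  766.42 K/s  0.00 %  0.70 % mongod -f /etc/mongod.conf
 1524 be/4 mongod      0.00 B/s  946.21 K/s  0.00 %  0.37 % mongod -f /etc/mongod.conf
EOF
```

For the four samples above this prints samples=4 avg_write=701.7 K/s.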

