Problem symptom
In the cluster, the %util reported by iostat for the SSDs (used as cache disks) is close to 100%, so we need to determine whether the disks have actually reached their bottleneck.
Problem investigation
According to the iostat documentation, and as other people's blogs also point out, iostat's %util does not indicate saturation for devices that serve requests concurrently, such as SSDs and RAID arrays:
%util
Percentage of elapsed time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is
close to 100% for devices serving requests serially. But for devices serving requests in parallel, such as RAID arrays and modern SSDs, this number does not
reflect their performance limits.
See also the article "Two traps in iostat: %util and svctm": https://brooker.co.za/blog/2014/07/04/iostat-pct.html
I also verified this on an x86 machine by running random reads against the cache disk with iodepth=1:
fio -name test -rw randread -filename /dev/sdr -runtime 60 -time_based=1 -direct=1 -ioengine libaio -numjobs=1 -iodepth=1 -eta-newline=1
read: IOPS=7872, BW=30.8MiB/s (32.2MB/s)(236MiB/7671msec)
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdr 0.00 9.00 7524.00 394.00 29.59 2.37 8.27 0.42 0.18 0.13 1.14 0.12 95.00
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdr 0.00 0.00 8305.00 82.00 32.66 0.96 8.21 0.01 0.12 0.11 0.26 0.12 99.80
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdr 0.00 0.00 8379.00 62.00 32.80 0.60 8.10 0.02 0.11 0.11 0.10 0.12 99.10
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdr 0.00 8.00 7720.00 195.00 30.53 1.67 8.33 0.15 0.14 0.12 0.86 0.12 98.10
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdr 0.00 4.00 8021.00 87.00 31.33 0.72 8.10 0.02 0.12 0.12 0.17 0.12 99.20
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdr 0.00 5.00 7502.00 374.00 29.31 1.89 8.11 0.31 0.16 0.13 0.92 0.12 96.00
IOPS is only around 7,500 to 8,400, yet %util is already close to 100%. Now increase iodepth to 128:
fio -name test -rw randread -filename /dev/sdr -runtime 60 -time_based=1 -direct=1 -ioengine libaio -numjobs=1 -iodepth=128 -eta-newline=1
read: IOPS=129k, BW=505MiB/s (530MB/s)(18.1GiB/36746msec)
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdr 0.00 22.00 125637.00 667.00 491.68 13.65 8.19 10.29 0.35 0.34 1.90 0.01 100.00
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdr 0.00 9.00 131418.00 136.00 513.60 1.39 8.02 1.59 0.24 0.24 0.52 0.01 100.00
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdr 0.00 0.00 131817.00 43.00 514.91 0.34 8.00 0.23 0.21 0.21 0.19 0.01 100.00
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdr 0.00 1.00 132226.00 81.00 517.15 1.23 8.02 0.67 0.23 0.23 0.63 0.01 100.00
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdr 0.00 0.00 130618.00 37.00 510.23 0.32 8.00 0.12 0.21 0.21 0.65 0.01 100.00
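Here IOPS is about 130,000, roughly 16 times higher, and %util is again 100%.
I also tested an NVMe device. With 4k-1-1 (that is, 4k blocks, numjobs=1, iodepth=1) reads reach only about 10,000 IOPS, while with 4k-1-128 they reach about 250,000 IOPS, yet %util is 100% in both cases.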
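As a rough cross-check of the two fio runs, Little's law (mean requests in flight = throughput x mean latency) can be applied to the iostat figures. The sketch below is only an illustration: the function name is made up for this example, and it takes the r/s and r_await values from one iostat sample of each run at face value, even though iostat rounds r_await to two decimals.

```python
def mean_in_flight(iops, latency_ms):
    """Little's law: mean requests in flight = throughput x mean latency."""
    return iops * (latency_ms / 1000.0)


# r/s and r_await (ms) copied from one iostat sample of each fio run above.
depth_1 = mean_in_flight(iops=8305, latency_ms=0.11)
depth_128 = mean_in_flight(iops=131817, latency_ms=0.21)

print(f"iodepth=1:   {depth_1:.2f} requests in flight on average")
print(f"iodepth=128: {depth_128:.2f} requests in flight on average")
```

Under these assumptions the device averages a little under one request in flight in the first run and roughly 28 in the second, a difference of about 30 times in concurrency that %util cannot show, because it reads about 100% in both cases.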
Problem cause
Older versions of the iostat man page describe %util as "Percentage of CPU time during which I/O requests were issued to the device". The newer wording quoted above says "elapsed time", which is the more accurate reading: %util is the share of wall-clock time during which the device had at least one request outstanding. For serial devices (such as mechanical HDDs) this value does reflect how busy the device is, and if it stays above 90% for a long time the device may be slow. For parallel devices (such as SSDs), a single outstanding I/O that keeps the device busy for a whole interval yields 100%, and 100 concurrent I/Os in the same interval also yield 100%, so a %util of 100% does not mean the device has reached its bottleneck.
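The point can be illustrated with a small model that treats %util as the fraction of a time window covered by at least one in-flight request. This is a simplified sketch of the idea, not the kernel's or iostat's actual implementation; the function name and the request timings are invented for the example.

```python
def percent_util(requests, window):
    """Percentage of the window during which at least one request is in flight.

    requests: list of (start, end) times; window: (t0, t1), same time unit.
    """
    t0, t1 = window
    busy = 0
    cur_start = cur_end = None
    for start, end in sorted(requests):
        # Clip each request to the observation window.
        start, end = max(start, t0), min(end, t1)
        if start >= end:
            continue
        if cur_end is None or start > cur_end:
            # A gap: close the previous busy period and open a new one.
            if cur_end is not None:
                busy += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current busy period.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        busy += cur_end - cur_start
    return 100.0 * busy / (t1 - t0)


WINDOW = (0, 1000)  # one second, in milliseconds

# One stream of back-to-back 100 ms requests (like iodepth=1).
serial = [(i * 100, (i + 1) * 100) for i in range(10)]
# 128 such streams running at the same time (like iodepth=128).
parallel = serial * 128
# One stream that is idle half of the time.
half_idle = [(i * 200, i * 200 + 100) for i in range(5)]

print(percent_util(serial, WINDOW))     # 10 requests completed
print(percent_util(parallel, WINDOW))   # 1280 requests completed
print(percent_util(half_idle, WINDOW))  # 5 requests completed
```

In this model the serial and the parallel workloads both report 100% even though the second completes 128 times as many requests, which is the same pattern as in the fio runs.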
To judge whether a disk device is saturated, other metrics such as await and avgqu-sz need to be analyzed as well.
Summary
%util is a reliable saturation indicator only for HDDs, which serve requests serially. It does not reflect the capacity of concurrent devices such as SSDs.
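On Linux these metrics are derived from counters in /proc/diskstats: the "time spent doing I/Os" counter drives %util and the "weighted time spent doing I/Os" counter drives the average queue size. The sketch below shows one way to compute them from two snapshots. The helper names are hypothetical and the sample lines are synthetic values made up for the example, not measurements from the cluster.

```python
def parse_diskstats(text):
    """Parse /proc/diskstats text into {device: counters}."""
    stats = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 14:
            continue
        stats[f[2]] = {
            "reads": int(f[3]),         # reads completed
            "writes": int(f[7]),        # writes completed
            "io_ticks_ms": int(f[12]),  # ms with at least one I/O in flight
            "weighted_ms": int(f[13]),  # ms weighted by the number of I/Os in flight
        }
    return stats


def interval_metrics(before, after, interval_ms):
    """Derive iostat-like metrics from two snapshots of one device."""
    d = {k: after[k] - before[k] for k in before}
    return {
        "iops": (d["reads"] + d["writes"]) * 1000.0 / interval_ms,
        "util_pct": min(100.0, 100.0 * d["io_ticks_ms"] / interval_ms),
        "avg_queue": d["weighted_ms"] / interval_ms,
    }


# Synthetic snapshots taken 1000 ms apart (illustrative numbers only).
T0 = "65 16 sdr 1000 0 8000 120 0 0 0 0 0 1000 120"
T1_SHALLOW = "65 16 sdr 9300 0 74400 1033 0 0 0 0 0 1990 1033"
T1_DEEP = "65 16 sdr 131000 0 1048000 27420 0 0 0 0 0 2000 27420"

base = parse_diskstats(T0)["sdr"]
shallow = interval_metrics(base, parse_diskstats(T1_SHALLOW)["sdr"], 1000)
deep = interval_metrics(base, parse_diskstats(T1_DEEP)["sdr"], 1000)
print(shallow)
print(deep)
```

Both synthetic intervals show a utilization of about 100%, while the average queue size differs by a factor of about 30. To try this on a real machine, read /proc/diskstats twice with a known delay between the reads and pass both texts through parse_diskstats.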