Investigating a ClickHouse high-IO performance bottleneck

A few days ago a ClickHouse query at our company ran very slowly. Managers kept asking whether the slowness came from insufficient CPU, excessive IO, or something else. This article walks through the troubleshooting that followed; it looks only at system performance and does not consider optimizing the SQL itself.

 

 1. top: the all-purpose first step. Its third line shows the CPU statistics:

%Cpu(s):  0.3 us,  0.2 sy,  0.0 ni, 99.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

```
%Cpu(s):
  us (0.3): percentage of CPU time spent in user space
  sy (0.2): percentage of CPU time spent in kernel (system) space
  ni (0.0): percentage of CPU time spent on user processes whose priority (nice value) was changed
  id (99.5): percentage of CPU time spent idle
  wa (0.0): percentage of CPU time spent waiting for input/output
  hi (0.0): percentage of CPU time spent servicing hardware interrupts
  si (0.0): percentage of CPU time spent servicing software interrupts
  st (0.0): percentage of CPU time stolen by the hypervisor (virtualization)

KiB Mem : 13142010+total, 54558996 free, 49383624 used, 27477484 buff/cache
  total: total physical memory
  used: physical memory in use
  free: free memory
  buff/cache: memory used for kernel buffers and the page cache

KiB Swap: 67108860 total, 64400272 free, 2708588 used. 80281728 avail Mem
  total: total swap space
  used: swap space in use
  free: free swap space
  avail Mem: estimate of memory available for starting new applications without swapping
```

Without the SQL running:

While the SQL is running:

 


Comparing the two, the clickhouse process uses about 200% CPU while the SQL runs, that is, nearly 2 cores, but total CPU utilization of the server is only 4.4% (the server has 48 cores). Free memory is also fairly plentiful.
What does stand out is a relatively high wa, the percentage of CPU time spent waiting for input/output.
So IO is probably the problem (as already suspected), which brings in the main tools: pidstat and iostat.
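
The arithmetic behind that comparison can be sketched in Python. The function name `share_of_host` is hypothetical; the inputs are the 200% process figure and the 48 cores quoted above. In its default mode top reports per-process CPU relative to a single core, so dividing by the core count gives the share of the whole server.

```python
def share_of_host(process_cpu_percent: float, cores: int) -> float:
    """Convert a per-process %CPU from top (100 = one full core)
    into a percentage of the whole machine's CPU capacity."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return process_cpu_percent / cores


# Figures from the text: the clickhouse process at 200% on a 48-core server.
share = share_of_host(200.0, 48)
print(f"{share:.1f}% of total CPU capacity")
```

The result is a little over 4%, in line with the roughly 4.4% total utilization observed, so CPU is clearly not what the query is waiting on.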

 2. pidstat


First clear the cache that ClickHouse benefits from.

# ClickHouse data is cached by Linux in the page cache

sync
echo 3 > /proc/sys/vm/drop_caches

# Meaning of the command: display per-process I/O statistics, refreshed every second
pidstat -d 1

  

When the SQL is not running:

```
11:24:33      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:24:34      990      9048  26025.93      0.00      0.00  java
11:24:34      990    442735  10540.74      0.00      0.00  du

11:24:34      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:24:35      990      9048 446176.00      0.00      0.00  java

11:24:35      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:24:36      990      9048  22112.00      0.00      0.00  java

11:24:36      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:24:37      990      9048  14436.00      0.00      0.00  java

11:24:37      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:24:38      990      9048  20964.00      0.00      0.00  java

11:24:38      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:24:39      990      9048  19136.00      0.00      0.00  java
11:24:39        0    247761      8.00      0.00      0.00  presto-server
```


Running pidstat while the SQL executes gives the following output:
```
11:21:47      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:21:48      990      9048  12381.48      0.00      0.00  java
11:21:48      983    124101      0.00      3.70      0.00  java
11:21:48      963    348889 153166.67      3.70      0.00  clickhouse-serv

11:21:48      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:21:49      990      9048  22996.00      0.00      0.00  java
11:21:49      962     78518    356.00      0.00      0.00  zabbix_agentd
11:21:49      963    348889  93948.00      0.00      0.00  clickhouse-serv

11:21:49      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:21:50        0      1323      0.00     16.00      0.00  jbd2/sdb-8
11:21:50        0      4251      0.00     12.00      0.00  java
11:21:50      990      9048  17576.00      0.00      0.00  java
11:21:50      963    348889 150992.00      4.00      0.00  clickhouse-serv

11:21:50      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:21:51      990      9048  24484.00      0.00      0.00  java
11:21:51      963    348889 151164.00      0.00      0.00  clickhouse-serv

11:21:51      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
11:21:52      990      9048  16096.00      0.00      0.00  java
11:21:52     1000    148747      0.00      8.00      0.00  mysqld
11:21:52      963    348889 175228.00      0.00      0.00  clickhouse-serv
```

  

ClickHouse is clearly reading from disk at up to 175228 kB/s, which is about 171 MB/s. The ClickHouse data directory is spread over seven disks, though, and 171 MB/s across seven disks is not very high; a single well-performing disk can reach about 120 MB/s on its own. Even so, at this point IO is almost certainly the performance bottleneck.
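
Picking the heaviest reader out of pidstat output and converting kB/s to MB/s can be sketched in Python. This is a minimal sketch: the helper name `peak_read_rates` is hypothetical, and it assumes the seven-column `HH:MM:SS UID PID kB_rd/s kB_wr/s kB_ccwr/s Command` layout seen in this article (other sysstat versions or time formats may add columns).

```python
def peak_read_rates(pidstat_output: str) -> dict:
    """Return the highest kB_rd/s seen per command in `pidstat -d` output."""
    peaks = {}
    for line in pidstat_output.splitlines():
        fields = line.split()
        # Data rows: time, UID, PID, kB_rd/s, kB_wr/s, kB_ccwr/s, Command
        if len(fields) != 7 or fields[1] == "UID":
            continue  # skip blank lines and repeated header rows
        try:
            read_kb = float(fields[3])
        except ValueError:
            continue  # not a numeric data row
        command = fields[6]
        peaks[command] = max(peaks.get(command, 0.0), read_kb)
    return peaks


# Last sample from the run above, in the assumed layout.
sample = """\
11:21:51 UID PID kB_rd/s kB_wr/s kB_ccwr/s Command
11:21:52 990 9048 16096.00 0.00 0.00 java
11:21:52 1000 148747 0.00 8.00 0.00 mysqld
11:21:52 963 348889 175228.00 0.00 0.00 clickhouse-serv
"""

peaks = peak_read_rates(sample)
for command, kb in sorted(peaks.items(), key=lambda kv: -kv[1]):
    print(f"{command:16s} {kb:10.2f} kB/s = {kb / 1024:6.1f} MB/s")
```

In practice the text would come from something like `pidstat -d 1 10` captured to a file rather than an inline string.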

 3. iostat

First clear the cache

Use the command "iostat -x 1" to determine which disk device is under high IO load:
iostat -x 1

 

 

On this machine the data disks are mounted at /data1 through /data7:

df -l

/dev/sda1 1038336 145704 892632 15% /boot
/dev/sdf 2306651404 460825460 1728631192 22% /data5
/dev/sdh 2306651404 1097399924 1092056728 51% /data7
/dev/sdb 2306651404 1086978144 1102478508 50% /data1
/dev/sde 2306651404 796331792 1393124860 37% /data4
/dev/sdd 2306651404 899894556 1289562096 42% /data3
/dev/sdg 2306651404 628524548 1560932104 29% /data6
/dev/sdc 2306651404 1796066508 393390144 83% /data2
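
The df listing above can be turned into a device-to-mount-point lookup, which is handy once iostat names a busy device. This is only a sketch: `mounts_by_device` is a hypothetical helper, and it assumes the six-column df layout shown here with mount points that contain no spaces.

```python
def mounts_by_device(df_output: str) -> dict:
    """Map each /dev/... filesystem in `df` output to its mount point."""
    mounts = {}
    for line in df_output.splitlines():
        fields = line.split()
        # Expected columns: Filesystem, 1K-blocks, Used, Available, Use%, Mounted on
        if len(fields) == 6 and fields[0].startswith("/dev/"):
            mounts[fields[0]] = fields[5]
    return mounts


# Two rows copied from the df output above.
sample = """\
/dev/sdb 2306651404 1086978144 1102478508 50% /data1
/dev/sdc 2306651404 1796066508 393390144 83% /data2
"""

mounts = mounts_by_device(sample)
print(mounts["/dev/sdb"])  # prints /data1
```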

  

When the SQL is not running:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    5.00    8.00    13.00    56.00    10.62     0.01    0.69    1.80    0.00   0.69   0.90
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdc               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.37   46.75   46.75    0.00   6.38   5.10
sdd               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.51   64.00   64.00    0.00   8.62   6.90
sde               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.17   21.75   21.75    0.00   5.38   4.30
sdf               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.08    9.62    9.62    0.00   2.00   1.60
sdh               0.00     0.00   64.00    0.00  1812.00     0.00    56.62     0.14    2.12    2.12    0.00   1.28   8.20
sdg               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.12   15.00   15.00    0.00   2.00   1.60
sdi               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.13   16.75   16.75    0.00   2.25   1.80
sdj               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdk               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdl               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.16   19.88   19.88    0.00   2.62   2.10
sdn               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdo               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdm               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdp               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdq               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdr               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.20   25.62   25.62    0.00   6.12   4.90
sds               0.00     0.00    1.00    0.00    24.00     0.00    48.00     0.02   16.00   16.00    0.00  16.00   1.60
sdt               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    5.00    8.00    13.00    56.00    10.62     0.01    0.69    1.80    0.00   0.69   0.90
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.52    0.00    0.29    0.06    0.00   99.12

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdb               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.19   23.50   23.50    0.00   3.00   2.40
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sde               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdf               0.00     0.00    1.00    0.00   128.00     0.00   256.00     0.00    0.00    0.00    0.00   0.00   0.00
sdh               0.00     0.00   13.00    0.00  2432.00     0.00   374.15     0.12    9.31    9.31    0.00   3.46   4.50
sdg               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdi               0.00     0.00    1.00    0.00   128.00     0.00   256.00     0.00    0.00    0.00    0.00   0.00   0.00
sdj               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.14   17.12   17.12    0.00   2.25   1.80
sdk               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.15   19.00   19.00    0.00   2.50   2.00
sdl               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdn               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.17   21.38   21.38    0.00   2.88   2.30
sdo               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.33   40.75   40.75    0.00   5.88   4.70
sdm               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.14   18.00   18.00    0.00   2.38   1.90
sdp               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.11   13.75   13.75    0.00   1.88   1.50
sdq               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.09   11.25   11.25    0.00   1.62   1.30
sdr               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sds               0.00     0.00    9.00    0.00  1828.00     0.00   406.22     0.20   22.78   22.78    0.00   2.67   2.40
sdt               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.01    1.50    1.50    0.00   0.38   0.30
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

  


While the SQL is running:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00  448.00    2.00 15168.00     8.00    67.45     0.85    1.88    1.89    0.00   1.82  81.90
sdb               0.00     0.00 1481.00    0.00 190052.00     0.00   256.65    42.70   23.77   23.77    0.00   0.68 100.30
sdc               0.00     0.00    1.00    0.00   128.00     0.00   256.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.07    8.75    8.75    0.00   4.12   3.30
sde               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdf               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdh               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdg               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.16   19.75   19.75    0.00   5.12   4.10
sdi               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdj               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.37   46.38   46.38    0.00   7.62   6.10
sdk               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.19   24.00   24.00    0.00   3.12   2.50
sdl               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdn               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdo               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.13   16.12   16.12    0.00   5.50   4.40
sdm               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.01    1.50    1.50    0.00   0.25   0.20
sdp               0.00     0.00   10.00    0.00  2096.00     0.00   419.20     0.21   21.10   21.10    0.00   5.00   5.00
sdq               0.00     0.00 3869.00    0.00 15476.00     0.00     8.00     2.98    0.77    0.77    0.00   0.20  77.50
sdr               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.27   33.25   33.25    0.00   6.88   5.50
sds               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.20   24.50   24.50    0.00   3.12   2.50
sdt               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00  448.00    2.00 15156.00     8.00    67.40     0.85    1.89    1.90    0.00   1.82  81.80
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           9.43    0.00    2.47   29.14    0.00   58.96

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00  699.00    0.00  9424.00     0.00    26.96     0.50    0.72    0.72    0.00   0.59  40.90
sdb               0.00     4.00 1415.00    4.00 182672.00    32.00   257.51    49.25   37.00   37.11    0.00   0.71 100.30
sdc               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.02    2.50    2.50    0.00   0.50   0.40
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sde               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.17   21.12   21.12    0.00   2.88   2.30
sdf               0.00     0.00   19.00    0.00  4116.00     0.00   433.26     0.32   16.84   16.84    0.00   3.11   5.90
sdh               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.34   42.00   42.00    0.00   5.88   4.70
sdg               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdi               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.41   50.62   50.62    0.00   6.88   5.50
sdj               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdk               0.00     0.00    1.00    0.00   128.00     0.00   256.00     0.00    0.00    0.00    0.00   0.00   0.00
sdl               0.00     0.00    8.00    0.00  2048.00     0.00   512.00     0.31   38.88   38.88    0.00   5.25   4.20
sdn               0.00     0.00    6.00    0.00  1508.00     0.00   502.67     0.25   42.00   42.00    0.00   7.67   4.60
sdo               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdm               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdp               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdq               0.00     0.00 3989.00    0.00 17968.00     0.00     9.01     1.58    0.40    0.40    0.00   0.19  77.40
sdr               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

  

 

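Clearly the disk behind sdb has reached 100% utilization, with a read rate of 155756 kB/s, which is about 152 MB/s.

/dev/sdb 2306651404 1086978144 1102478508 50% /data1

data1 is one of our data disks.

This makes it clear: the bottleneck behind this slow SQL query is IO.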
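
Spotting the saturated device by eye works for one sample; for longer captures the same check can be sketched in Python. The helper `saturated_devices` and the 90% threshold are assumptions for illustration, and the parser expects the older sysstat layout shown in this article (a header row starting with `Device:` and ending with `%util`).

```python
def saturated_devices(iostat_output: str, threshold: float = 90.0) -> list:
    """Return (device, rkB/s, %util) for rows whose %util is at or above threshold."""
    columns = None
    hits = []
    for line in iostat_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Device:":
            columns = fields  # remember the header so values are looked up by name
            continue
        if columns is None or len(fields) != len(columns):
            continue  # avg-cpu lines and anything else that is not a device row
        row = dict(zip(columns, fields))
        try:
            util = float(row["%util"])
            read_kb = float(row["rkB/s"])
        except (KeyError, ValueError):
            continue
        if util >= threshold:
            hits.append((row["Device:"], read_kb, util))
    return hits


# Three rows copied from the first sample taken while the SQL was running.
sample = """\
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00  448.00    2.00 15168.00     8.00    67.45     0.85    1.88    1.89    0.00   1.82  81.90
sdb               0.00     0.00 1481.00    0.00 190052.00     0.00   256.65    42.70   23.77   23.77    0.00   0.68 100.30
sdq               0.00     0.00 3869.00    0.00 15476.00     0.00     8.00     2.98    0.77    0.77    0.00   0.20  77.50
"""

hits = saturated_devices(sample)
for device, read_kb, util in hits:
    print(f"{device}: {util:.1f}% util, {read_kb / 1024:.0f} MB/s read")
```

Lowering the threshold, for example `saturated_devices(sample, threshold=75.0)`, also lists sda and sdq, which are busy but not saturated in this sample.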


Origin www.cnblogs.com/chouc/p/11331085.html