1. System environment
OS: CentOS Linux release 7.8.2003 (Core)
Kernel: 3.10.0-1127.19.1.el7.x86_64
MySQL: both 5.0 and 5.7 exhibit the problem, so it appears to be unrelated to the MySQL version
2. Stress testing tools
benchyou[1]
mysql_random_load[2]
3. Problem phenomenon
When using the mysql_random_load tool to write data into MySQL, write performance is extremely low.
Since mysql_random_load does not support socket connections, I gave up on it and switched to benchyou. Incidentally, benchyou is very similar to sysbench and just as easy to use.
After switching to benchyou, the stress test results were normal, so it does not look like a MySQL version problem.
During the mysql_random_load stress test, the system load is very high, and you can also observe that the interrupt rate is very high and unevenly distributed across CPUs.
# vmstat -S m 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 0 73585 2 41051 0 0 117 91 4 2 0 0 99 0 0
2 0 0 73585 2 41051 0 0 0 28320 55444 100207 18 2 80 0 0
4 0 0 73584 2 41052 0 0 0 1936 52949 98607 18 2 81 0 0
2 0 0 73583 2 41052 0 0 0 4864 56375 101262 14 2 84 0 0
4 0 0 73583 2 41052 0 0 0 29064 55806 103715 19 2 80 0 0
5 0 0 73583 2 41052 0 0 0 5704 55854 98386 15 2 83 0 0
You can see that the value of the in column (interrupts per second) is very high. After switching to benchyou, this column dropped from around 55,000 to around 16,000.
# vmstat -S m 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
4 0 0 77238 2 38371 0 0 118 88 2 3 0 0 99 0 0
2 0 0 77234 2 38374 0 0 0 31620 16039 77988 3 2 95 0 0
2 0 0 77231 2 38377 0 0 0 31996 16091 78926 3 2 95 0 0
3 0 0 77229 2 38378 0 0 0 33028 16347 81006 3 2 95 0 0
0 0 0 77226 2 38383 0 0 0 52412 15496 75715 3 2 95 0 0
2 0 0 77224 2 38384 0 0 0 32252 16167 79352 3 2 95 0 0
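The in column can also be cross-checked straight from /proc/stat, where the first number on the intr line is the cumulative interrupt count since boot. A minimal sketch (intr_per_sec is our own helper name, not a standard tool):

```shell
# Interrupts per second, computed from /proc/stat: the first field after
# "intr" is the total interrupt count since boot, so the delta between
# two samples one second apart approximates vmstat's "in" column.
intr_per_sec() {
  local a b
  a=$(awk '/^intr /{print $2}' /proc/stat)
  sleep 1
  b=$(awk '/^intr /{print $2}' /proc/stat)
  echo $((b - a))
}
intr_per_sec
```

This is handy for quick before/after comparisons without leaving vmstat running.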
Let's look at the per-CPU interrupt distribution while the problem is occurring:
# mpstat -I SUM -P ALL 1
Linux 3.10.0-1127.19.1.el7.x86_64 (yejr.run) 09/28/2020 _x86_64_ (32 CPU)
05:37:40 PM CPU intr/s
05:37:41 PM all 51833.00
05:37:41 PM 0 2069.00
05:37:41 PM 1 1159.00
05:37:41 PM 2 2979.00
05:37:41 PM 3 1580.00
05:37:41 PM 4 1627.00
05:37:41 PM 5 1461.00
05:37:41 PM 6 1243.00
05:37:41 PM 7 1825.00
05:37:41 PM 8 2154.00
05:37:41 PM 9 1367.00
05:37:41 PM 10 1277.00
05:37:41 PM 11 1376.00
05:37:41 PM 12 4085.00
05:37:41 PM 13 1601.00
05:37:41 PM 14 4045.00
05:37:41 PM 15 1857.00
05:37:41 PM 16 1692.00
05:37:41 PM 17 722.00
05:37:41 PM 18 118.00
05:37:41 PM 19 1862.00
05:37:41 PM 20 1637.00
05:37:41 PM 21 1130.00
05:37:41 PM 22 1750.00
05:37:41 PM 23 1653.00
05:37:41 PM 24 1417.00
05:37:41 PM 25 1547.00
05:37:41 PM 26 1500.00
05:37:41 PM 27 1033.00
05:37:41 PM 28 20.00
05:37:41 PM 29 1683.00
05:37:41 PM 30 888.00
05:37:41 PM 31 1549.00
The total is over 50,000 interrupts per second, and they are unevenly distributed across the CPUs.
4. Problem analysis
The initial conclusion was that the poor write performance was caused by the high interrupt rate, and specifically by the imbalance of interrupts across CPUs.
Watching which interrupt types grow fastest shows that LOC (local timer interrupts) and RES (rescheduling interrupts) increase the most per second.
# watch -d cat /proc/interrupts
...
LOC: 2468939840 2374791518 2373834803 2373613050 Local timer interrupts
SPU: 0 0 0 0 Spurious interrupts
PMI: 0 0 0 0 Performance monitoring interrupts
IWI: 50073298 45861632 45568755 45833911 IRQ work interrupts
RTR: 0 0 0 0 APIC ICR read retries
RES: 3472920231 3022439316 2990464825 3012790828 Rescheduling interrupts
CAL: 5131479 6539715 17285454 11211131 Function call interrupts
TLB: 23094853 24045725 24230472 24271286 TLB shootdowns
TRM: 0 0 0 0 Thermal event interrupts
...
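To quantify the growth of a specific interrupt type such as LOC or RES without eyeballing watch output, a one-second delta can be computed directly from /proc/interrupts. A quick sketch (irq_delta is our own helper name):

```shell
# Sum a named interrupt type across all CPUs, sample twice one second
# apart, and print the per-second growth. The "+0" coerces the trailing
# description words (e.g. "Rescheduling interrupts") to zero.
irq_delta() {
  local name=$1 a b
  a=$(awk -v n="$name:" '$1 == n {s=0; for (i=2; i<=NF; i++) s += $i+0; print s}' /proc/interrupts)
  sleep 1
  b=$(awk -v n="$name:" '$1 == n {s=0; for (i=2; i<=NF; i++) s += $i+0; print s}' /proc/interrupts)
  echo $((b - a))
}
irq_delta RES
irq_delta LOC
```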
Modifying the CPU affinity of the relevant interrupt numbers (see: SMP affinity and proper interrupt handling in Linux[3]) did not alleviate the problem either.
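For reference, pinning an interrupt to a specific CPU is done by writing a hexadecimal CPU bitmask to /proc/irq/&lt;n&gt;/smp_affinity. A sketch of the technique (cpu_to_mask is our own helper, and the IRQ number 24 below is a placeholder, not taken from this machine):

```shell
# Convert a CPU number into the hexadecimal bitmask format that
# /proc/irq/<n>/smp_affinity expects (CPU 2 -> bit 2 -> 0x4).
cpu_to_mask() {
  printf '%x\n' $((1 << $1))
}
cpu_to_mask 2
# Applying it requires root; IRQ 24 here is hypothetical:
#   echo "$(cpu_to_mask 2)" > /proc/irq/24/smp_affinity
#   cat /proc/irq/24/smp_affinity   # verify
```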
Later, with some guidance from an expert, it turned out to be a kernel bug involving the parameter kernel.timer_migration, which needs to be set to 0.
# sysctl -w kernel.timer_migration=0
Of course, it is best to persist the setting in the /etc/sysctl.conf file.
# cat /etc/sysctl.conf
kernel.timer_migration=0
# Reload the configuration file to apply the setting
# sysctl -p
Running the mysql_random_load stress test again, performance is back to normal.
The following is a description of the bug:
When Linux receives a large number of TCP packets, TCP may create a large number of timers. In get_target_base->get_nohz_timer_target, the kernel checks whether the current CPU is idle; sometimes idle_cpu() returns 1 even though the current core is actually very busy. In that case, if kernel.timer_migration=1, the timer is migrated to another CPU, generating extra rescheduling interrupts.
For the bug details, see: Bug 124661 - kernel.timer_migration=1 cause too many Rescheduling interrupts[4]
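The effect of the parameter can be illustrated with a toy model of that decision. This is purely illustrative shell, not kernel code; pick_timer_cpu and the 4-CPU wraparound target are our own inventions:

```shell
# Toy model of the choice made in get_nohz_timer_target(): if timer
# migration is enabled and the current CPU is (mis)reported as idle,
# the timer moves to another CPU; otherwise it stays local.
pick_timer_cpu() {
  local this_cpu=$1 reported_idle=$2 timer_migration=$3
  if [ "$timer_migration" -eq 1 ] && [ "$reported_idle" -eq 1 ]; then
    echo $(( (this_cpu + 1) % 4 ))   # migrated (hypothetical target CPU)
  else
    echo "$this_cpu"                 # timer stays on the local CPU
  fi
}
pick_timer_cpu 0 1 1   # busy core misreported as idle + migration on -> moves
pick_timer_cpu 0 1 0   # kernel.timer_migration=0 -> stays on CPU 0
```

With migration disabled, the misreported idle state no longer matters, which is why setting the sysctl to 0 stops the flood of rescheduling interrupts.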
Finally, it is worth mentioning that modifying this parameter on a cloud (virtual) host will probably not help unless it is also changed on the underlying physical host. I ran into a similar problem when running the mysql_random_load stress test on a cloud host, and the problem persisted after changing the kernel parameter there.
In-text links
[1]:https://github.com/xelabs/benchyou
[2]:https://github.com/Percona-Lab/mysql_random_data_load
[3]:http://www.alexonlinux.com/smp-affinity-and-proper-interrupt-handling-in-linux
[4]:https://bugzilla.kernel.org/show_bug.cgi?id=124661
Enjoy Linux & MySQL :)
"MySQL Core Optimization" course has been upgraded to MySQL 8.0, scan the code to start the journey of MySQL training