The road to an official bug fix: getting an optimization into the MySQL source code

About the author: Zhou Xinjing graduated from Zhejiang University and currently works on kernel research and development for the TXSQL cloud database in the CDB/CynosDB database kernel team. The author has worked on hot-row updates and a series of performance optimizations, and has fixed a number of bugs in official MySQL.

Part1 background

InnoDB's adaptive hash index (Adaptive Hash Index, hereinafter AHI) is an index structure built on top of the B-tree index to further reduce the cost of B-tree lookups.

Searching for a record in a B-tree means descending from the root node to a leaf node, with a binary search inside every node along the way. AHI improves on this by building a hash index over the row records of frequently accessed leaf pages. A B-tree query can then locate the record's position on the leaf page directly through AHI, skipping the descent from root to leaf and reducing CPU overhead.
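The difference between the two lookup paths can be sketched with a toy model. This is only an illustration under simplifying assumptions: the names `Node`, `btree_search`, `build_hash_for_leaf` and `search` are hypothetical, the tree has fixed contents, and the hash maps a whole key to a leaf slot, whereas the real AHI hashes a prefix of the index fields and points at records inside buffer pool pages.

```python
from bisect import bisect_left, bisect_right


class Node:
    """Toy B-tree node. Leaves hold sorted keys; inner nodes hold
    separator keys plus one more child than they have separators."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children  # None marks a leaf


def btree_search(root, key):
    """Descend from root to leaf with a binary search in every node.
    Returns (leaf, slot, nodes_visited); leaf is None when key is absent."""
    node, visited = root, 1
    while node.children is not None:
        # Keys equal to a separator live in the child to its right.
        node = node.children[bisect_right(node.keys, key)]
        visited += 1
    slot = bisect_left(node.keys, key)
    if slot < len(node.keys) and node.keys[slot] == key:
        return node, slot, visited
    return None, -1, visited


def build_hash_for_leaf(ahi, leaf):
    """Add every record of one leaf to the hash index (a plain dict here)."""
    for slot, key in enumerate(leaf.keys):
        ahi[key] = (leaf, slot)


def search(root, ahi, key):
    """Try the hash index first; fall back to the full descent on a miss."""
    hit = ahi.get(key)
    if hit is not None:
        leaf, slot = hit
        return leaf, slot, 0  # no tree nodes were traversed
    return btree_search(root, key)
```

In this model a hash hit reports zero visited nodes, while a miss pays for one binary search per tree level, which is the cost the hash index is meant to avoid.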

Because AHI is built adaptively and dynamically, it has to be cleaned up or rebuilt as the query access pattern changes and as pages are swapped in and evicted. In essence, AHI is therefore also a cache. Its construction logic is explained in many articles online and is not the focus of this one.

This article discusses a little-known lock contention problem in AHI construction and the corresponding optimization.

Part2 problem

While running sysbench against TXSQL 5.7, we observed a very interesting phenomenon.

The test environment was as follows: two 96-core machines served as the sysbench client and the MySQL server respectively. We configured a 200GB buffer pool and generated a 120GB sysbench table.

As shown in the figure below, when we ran the oltp_read_only workload with 128 concurrent threads, QPS first climbed steadily. During this phase the system was issuing a large number of read I/Os to fill the buffer pool, which is normal.

Then, after about 100s, QPS dropped sharply. It began to recover slowly after 400s and only reached its peak after 800s.

[Figure: QPS over time for oltp_read_only with 128 concurrent threads]

We used perf to capture the state of the system during the QPS drop. The result was as follows:

[Figure: perf profile captured during the QPS drop]

The stacks show that a large share of CPU time was spent on lock contention for the AHI hash table.

A closer look shows why: at this point most pages have no AHI entries yet, so many threads try to build AHI entries for pages at the same time. Each build requires an X lock on the same AHI hash table, which causes a great deal of waiting.

The QPS curve can then be interpreted as shown in the figure below:

[Figure: QPS curve annotated with the analysis above]

Part3 optimization

For a B-tree index, AHI construction happens after the B-tree leaf node has been located. The call chain is as follows:

btr_cur_search_to_nth_level → btr_search_info_update → btr_search_info_update_slow → btr_search_build_page_hash_index

btr_search_info_update_slow decides, based on statistics, whether to call btr_search_build_page_hash_index, which adds the records of the current page to the AHI hash table. This step requires an exclusive (X) lock on the hash table.

Since only one thread can modify the hash table at a time, it makes little sense for the other threads that want to build AHI entries to wait for the X lock: the wait blocks the critical path of the query, while only one thread is actually doing construction work.

We also noted that AHI is only an auxiliary cache; the B-tree itself can still answer every query correctly without it.

This naturally suggests the following optimization:

  1. When the B-tree query path decides to build AHI entries for a page, we first check whether the lock of the hash table for that B-tree is currently write-locked by another thread;

  2. If the write lock is held, we abandon the AHI build for this page this time, fall back to the normal B-tree query, and try again the next time the page is accessed.
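The two steps above can be sketched as a toy model. This is not the TXSQL patch: `AdaptiveHashIndex`, `PageStats`, `maybe_build_page_hash` and `build_threshold` are hypothetical names, and Python's `threading.Lock` models only the X mode of InnoDB's read-write lock. The sketch also uses a non-blocking acquire, which merges the "is it write-locked?" check and the acquisition into one step; the article describes checking the writer state first.

```python
import threading


class AdaptiveHashIndex:
    """Toy stand-in for one AHI hash table guarded by one exclusive latch."""

    def __init__(self):
        self.latch = threading.Lock()  # models only the X mode of the rw-lock
        self.table = {}                # key -> (page_no, slot)


class PageStats:
    """Per-page access statistics that drive the build decision."""

    def __init__(self, build_threshold=3):
        self.n_accesses = 0
        self.build_threshold = build_threshold  # hypothetical heuristic


def maybe_build_page_hash(ahi, page_no, records, stats):
    """Called after a B-tree lookup has located the leaf page.
    Returns True if the page was hashed, False if the build was skipped."""
    stats.n_accesses += 1
    if stats.n_accesses < stats.build_threshold:
        return False  # statistics do not justify a build yet

    # Contention avoidance: never wait for the X latch on the query path.
    if not ahi.latch.acquire(blocking=False):
        # Another thread is modifying the hash table. The statistics are
        # left untouched, so the next access to this page simply retries.
        return False
    try:
        for slot, key in enumerate(records):
            ahi.table[key] = (page_no, slot)
    finally:
        ahi.latch.release()
    return True
```

Because a skipped build leaves `stats` as it was, the page still qualifies on its next access; in this model the build is only deferred, never lost.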

Part4 specific implementation

The implementation is actually very simple: at the point where btr_search_info_update_slow decides, based on statistics, to build AHI entries for the records of a page, we add one extra condition. If a concurrent AHI construction thread currently holds the X lock of the hash table, we simply return.

The code is only a few lines, roughly as follows:

[Figure: code snippet of the added check in btr_search_info_update_slow]

One might worry whether skipping the build in this way affects the correctness of the code.

The answer is no. We do not clear any of the page's AHI statistics; we only postpone the build until lock contention on the hash table has eased.

Part5 effect

After applying this optimization, we reran the experiment and obtained the following results:

[Figure: QPS over time with and without the optimization]

The red line (AHI enabled with the Contention Avoidance optimization) shows the result of this change. After about 100s of warm-up, performance is stable and the lock bottleneck is gone.

Part6 source of inspiration

In fact, a similar optimization already exists on the original AHI query path:

Before performing an AHI lookup in btr_cur_search_to_nth_level, if the AHI hash table is found to be X-locked by another thread, the code falls back directly to the B-tree search.

The reasoning is similar: rather than waiting for the X lock of the AHI hash table, it is better to go straight to the B-tree search, which is likely to cost less than the wait and allows more concurrency.

[Figure: the existing check in btr_cur_search_to_nth_level]
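The same idea on the lookup side can be sketched as follows. Again this is a toy model, not InnoDB code: `lookup` and `btree_lookup` are hypothetical names, and the latch is modelled as a plain exclusive `threading.Lock`, whereas InnoDB readers take the latch in shared mode and do not exclude one another.

```python
import threading


def lookup(key, ahi_latch, ahi_table, btree_lookup):
    """Return (row, path). The hash index is consulted only when its latch
    can be taken without waiting; otherwise the B-tree answers the query."""
    if ahi_latch.acquire(blocking=False):
        try:
            hit = ahi_table.get(key)
        finally:
            ahi_latch.release()
        if hit is not None:
            return hit, "ahi"
    # Latch busy or hash miss: the B-tree is always a correct fallback.
    return btree_lookup(key), "btree"
```

The result is the same on either path; only the cost differs, which is why skipping the hash index under contention is safe.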

Part7 summary

This optimization is now live in the latest version of TXSQL 5.7, where it effectively alleviates the lock contention caused by AHI construction. Scenarios that can trigger the problem include, but are not limited to, system startup, the moment the AHI switch is turned on, and primary/standby switchover. In all of these, no page has AHI entries yet, and high concurrency can trigger a large amount of AHI construction work.

We also verified that the problem exists in the latest official MySQL 5.7 and 8.0 releases, so we contributed the optimization to upstream MySQL: https://bugs.mysql.com/bug.php?id=100512. It is currently under evaluation, and we believe it will be merged into the mainline soon.


Origin blog.51cto.com/14992250/2551638